Gene finding remains an important problem in biology as scientists are still far from fully mapping the set of human genes. Furthermore, gene maps for other vertebrates, including important model organisms such as mouse, are much more incomplete than the human annotation. The new technique, known as CONTRAST (CONditionally TRAined Search for Transcripts), works by comparing a genome of interest to the genomes of several related species.
CONTRAST exploits the fact that the functional role protein-coding genes play a specific part within a cell and are therefore subjected to characteristic evolutionary pressures. For example, mutations that alter an important part of a protein's structure are likely to be deleterious and thus selected against. On the other hand, mutations that preserve a protein's amino acid sequence are normally well tolerated. Thus, protein-coding genes can be identified by searching a genome for regions that show evidence such patterns of selection. However, learning to recognize such patterns when more than two species are compared has proved difficult.
Previous systems for gene prediction were able to effectively make use of one additional 'informant' genome. For example, when searching for human genes, taking into account information from the mouse genome led to a substantial increase in accuracy. But, no system was able to leverage additional informant genomes to improve upon state-of-the-art performance using mouse alone, although it was expected that adding informants would make patterns of selection clearer. CONTRAST solves this problem by learning to recognize the signature of protein-coding gene selection in a fundamentally different way from previous approaches. Instead of constructing a model of sequence evolution, CONTRAST directly 'learns' which features of a genomic alignment are most useful for recognizing genes. This approach leads to overall higher levels of accuracy and is able to extract useful information from several informant sequences.
In a test on the human genome, CONTRAST exactly predicted the full structure of 59% of the genes in the test set, compared with the previous best result of 36%. Its exact exon sensitivity of 93%, compared with a previous best of 84%, translates into many thousands of exons correctly predicted by CONTRAST but missed by previous methods. Importantly, CONTRAST's accuracy using a combination of eleven informant genomes was significantly higher than its accuracy using any single informant. The substantial advance in predictive accuracy represented by CONTRAST will further efforts to complete protein-coding gene maps for human and other organisms.
Further information about existing gene-prediction methods and the advance CONTRAST brings to the field can be found in a minireview by Paul Flicek, which accompanies the article by Batzoglou and colleagues.
New insights into the information processing of motor neurons
22.02.2017 | Max Planck Florida Institute for Neuroscience
Wintering ducks connect isolated wetlands by dispersing plant seeds
22.02.2017 | Utrecht University
In the field of nanoscience, an international team of physicists with participants from Konstanz has achieved a breakthrough in understanding heat transport
Cells need to repair damaged DNA in our genes to prevent the development of cancer and other diseases. Our cells therefore activate and send “repair-proteins”...
The Fraunhofer IWS Dresden and Technische Universität Dresden inaugurated their jointly operated Center for Additive Manufacturing Dresden (AMCD) with a festive ceremony on February 7, 2017. Scientists from various disciplines perform research on materials, additive manufacturing processes and innovative technologies, which build up components in a layer by layer process. This technology opens up new horizons for component design and combinations of functions. For example during fabrication, electrical conductors and sensors are already able to be additively manufactured into components. They provide information about stress conditions of a product during operation.
The 3D-printing technology, or additive manufacturing as it is often called, has long made the step out of scientific research laboratories into industrial...
Nature does amazing things with limited design materials. Grass, for example, can support its own weight, resist strong wind loads, and recover after being...
Nanometer-scale magnetic perforated grids could create new possibilities for computing. Together with international colleagues, scientists from the Helmholtz Zentrum Dresden-Rossendorf (HZDR) have shown how a cobalt grid can be reliably programmed at room temperature. In addition they discovered that for every hole ("antidot") three magnetic states can be configured. The results have been published in the journal "Scientific Reports".
Physicist Dr. Rantej Bali from the HZDR, together with scientists from Singapore and Australia, designed a special grid structure in a thin layer of cobalt in...
13.02.2017 | Event News
10.02.2017 | Event News
09.02.2017 | Event News
22.02.2017 | Power and Electrical Engineering
22.02.2017 | Life Sciences
22.02.2017 | Innovative Products