Gene finding remains an important problem in biology, as scientists are still far from fully mapping the set of human genes. Furthermore, gene maps for other vertebrates, including important model organisms such as mouse, are far less complete than the human annotation. The new technique, known as CONTRAST (CONditionally TRAined Search for Transcripts), works by comparing a genome of interest to the genomes of several related species.
CONTRAST exploits the fact that protein-coding genes play specific functional roles within a cell and are therefore subject to characteristic evolutionary pressures. For example, mutations that alter an important part of a protein's structure are likely to be deleterious and thus selected against. On the other hand, mutations that preserve a protein's amino acid sequence are normally well tolerated. Thus, protein-coding genes can be identified by searching a genome for regions that show evidence of such patterns of selection. However, learning to recognize such patterns when more than two species are compared has proved difficult.
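The selection pattern described above can be illustrated with a small sketch. This is not CONTRAST's method; it only shows, under simplifying assumptions, how differences between two aligned coding sequences can be split into those that preserve the encoded amino acid and those that change it. The function name `count_codon_changes` and the example sequences are hypothetical, and the sequences are assumed to be aligned, in frame, and translated with the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_codon_changes(target, informant):
    """Count codon differences that preserve or alter the amino acid.

    Both sequences are assumed to be aligned and in the same reading
    frame. Codons containing gaps or ambiguous bases are skipped.
    Returns a tuple (preserving, altering).
    """
    preserving = 0
    altering = 0
    usable = min(len(target), len(informant))
    for pos in range(0, usable - usable % 3, 3):
        codon_t = target[pos:pos + 3].upper()
        codon_i = informant[pos:pos + 3].upper()
        if codon_t not in CODON_TABLE or codon_i not in CODON_TABLE:
            continue  # gap or ambiguous base in this codon
        if codon_t == codon_i:
            continue  # identical codons carry no change
        if CODON_TABLE[codon_t] == CODON_TABLE[codon_i]:
            preserving += 1
        else:
            altering += 1
    return preserving, altering


# Hypothetical aligned fragments: GCT -> GCC keeps alanine,
# AAA -> AGA changes lysine to arginine.
print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

A region where amino-acid-preserving changes clearly outnumber amino-acid-altering ones is consistent with the kind of selection expected in protein-coding sequence, although a real gene finder combines many more signals than this single count.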
Previous systems for gene prediction were able to make effective use of one additional 'informant' genome. For example, when searching for human genes, taking into account information from the mouse genome led to a substantial increase in accuracy. However, no system was able to leverage additional informant genomes to improve on the state-of-the-art performance achieved using mouse alone, even though adding informants was expected to make patterns of selection clearer. CONTRAST solves this problem by learning to recognize the signature of selection on protein-coding genes in a fundamentally different way from previous approaches. Instead of constructing a model of sequence evolution, CONTRAST directly 'learns' which features of a genomic alignment are most useful for recognizing genes. This approach leads to higher overall accuracy and is able to extract useful information from several informant sequences.
In a test on the human genome, CONTRAST exactly predicted the full structure of 59% of the genes in the test set, compared with the previous best result of 36%. Its exact exon sensitivity of 93%, compared with a previous best of 84%, translates into many thousands of exons correctly predicted by CONTRAST but missed by previous methods. Importantly, CONTRAST's accuracy using a combination of eleven informant genomes was significantly higher than its accuracy using any single informant. The substantial advance in predictive accuracy represented by CONTRAST will further efforts to complete protein-coding gene maps for human and other organisms.
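For readers unfamiliar with the metric, exact exon sensitivity can be read as the fraction of annotated exons whose start and end coordinates are both reproduced exactly by a prediction. The sketch below computes it under that reading; the function name `exact_exon_sensitivity` and the coordinates are hypothetical and are not taken from the CONTRAST evaluation.

```python
def exact_exon_sensitivity(annotated, predicted):
    """Fraction of annotated exons matched exactly by a prediction.

    Each exon is a (start, end) tuple. An exon counts as found only
    if both boundaries agree with a predicted exon.
    """
    annotated = set(annotated)
    if not annotated:
        raise ValueError("no annotated exons to evaluate against")
    return len(annotated & set(predicted)) / len(annotated)


# Hypothetical coordinates: the second predicted exon ends 10 bases
# too late, so it does not count as an exact match.
annotated_exons = [(100, 200), (300, 400), (500, 650)]
predicted_exons = [(100, 200), (300, 410), (500, 650)]
print(exact_exon_sensitivity(annotated_exons, predicted_exons))
```

Under this definition a prediction that is off by a single base at either boundary counts as a miss, which is why exact gene-level accuracy (every exon of a gene correct) is much harder to achieve than exon-level accuracy.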
Further information about existing gene-prediction methods and the advance CONTRAST brings to the field can be found in a minireview by Paul Flicek, which accompanies the article by Batzoglou and colleagues.