Forum for Science, Industry and Business


Modification of program enables prediction of gene transcription

13.07.2005


A modification to an "ace" gene prediction program now enables scientists to predict where gene transcription starts and where the first splice occurs, thereby defining the first exon of the gene.



The modification to the gene prediction software TWINSCAN is called N-SCAN. Michael Brent, Ph.D., professor of computer science at Washington University in St. Louis, together with Samuel S. Gross, then an undergraduate at Washington University, and Randall H. Brown, Ph.D., a research scientist, report their results in the May 2005 issue of Genome Research. N-SCAN has proven to be the best program available at finding both the transcription start site (TSS) and the complete first exon in both the human and fruit fly genomes.

The addition of N-SCAN to TWINSCAN now provides genomics researchers the wherewithal to find and predict both the protein sequences produced by genes and their untranslated regions. Researchers in recent years have grown increasingly enthusiastic about the significance of untranslated regions. By understanding the functions of these regions, scientists expect to understand more about gene regulation — how genes get turned on and off, the ignition system of our DNA, if you will.


To make the proteins that are the basic micro-machines of life, a region of the genome is copied, or "transcribed," to form a molecule called messenger RNA (mRNA). Some segments of the mRNA are then discarded, and the retained segments are spliced together. Geneticists have traditionally assumed that transcription starts within a few hundred bases of the protein-coding region. However, for nearly 40 percent of known human genes, transcription starts long before the beginning of the protein-coding region. Most of this extra-long untranslated region is then discarded by splicing the 5’ untranslated region (UTR). All present gene finding systems — except for N-SCAN — either ignore the UTR splice sites or incorrectly incorporate them into some protein-coding segment, making gene prediction a none-too-sure industry.
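The distinction between spliced untranslated exons and protein-coding exons can be made concrete with a toy example. The sketch below is only an illustration, not how N-SCAN works: the sequence, the exon coordinates, and the helper names (`splice`, `find_cds`) are all invented for this example, and the coding region is located with a deliberately naive rule (first ATG, then read codons until a stop codon).

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based coordinates

STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice(pre_mrna: str, exons: List[Interval]) -> str:
    """Join the retained segments (exons); everything between them is discarded."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def find_cds(mrna: str) -> Optional[Interval]:
    """Naive coding-region finder: first ATG, then in-frame codons up to a stop."""
    start = mrna.find("ATG")
    if start == -1:
        return None
    for pos in range(start, len(mrna) - 2, 3):
        if mrna[pos:pos + 3] in STOP_CODONS:
            return (start, pos + 3)
    return None


# Hypothetical transcript: a first exon that lies entirely in the 5' UTR,
# an intron (starting GT, ending AG), and a second exon holding the coding region.
exon1 = "GGAGC"
intron = "GTAAGTTTTCAG"
exon2 = "ACCATGGCTTAAGG"
pre_mrna = exon1 + intron + exon2
exons = [(0, len(exon1)), (len(exon1) + len(intron), len(pre_mrna))]

mrna = splice(pre_mrna, exons)
cds_start, cds_end = find_cds(mrna)
utr5 = mrna[:cds_start]          # untranslated, spans the splice junction
cds = mrna[cds_start:cds_end]    # protein-coding segment
utr3 = mrna[cds_end:]

print("mRNA:  ", mrna)
print("5' UTR:", utr5)
print("CDS:   ", cds)
print("3' UTR:", utr3)
```

In this toy transcript the first exon contains no coding sequence at all. A predictor that assumes every exon is protein-coding would either miss that exon or mislabel it, which is the failure mode described above.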

"We’ve found that when we add the spliced untranslated regions to our system, we not only get good predictions for UTRs but also improved predictions of the protein-coding region of the gene. By correctly identifying UTRs, we can avoid labeling them incorrectly as part of the protein-coding region," said Brent, who, with various colleagues, developed both TWINSCAN and N-SCAN. "It’s important to know these two areas. Some of the signals that regulate transcription reside right near the transcription site. There is a huge amount of biology to be discovered there, and the appreciation of this area is growing daily."

While genomics researchers 15 years ago paid little attention to parts of the genome outside the coding regions, they have since discovered some strange functions in UTRs that have provoked second and third thoughts.

For instance, it recently was discovered that huntingtin, a gene associated with Huntington’s disease, has a second protein segment encoded upstream of the main one. This protein in the so-called untranslated region is involved in regulating the gene.

Running the modified TWINSCAN on both the human and fruit fly genomes, Brent and colleagues predicted about 25,000 transcription-start sites, compared with about 6,000 that were already known.

"In the human genome, we found many extra exons on genes that were already known, or in some cases, spliced UTRs on genes that weren’t even known to exist before," Brent said.

The CG dinucleotide is scarce across most of the genome, and the system takes advantage of this by finding so-called CpG "islands," stretches where CG is unusually frequent and which are known to be more common near the transcription-start site. It also has a knack for recognizing sequences that indicate splice sites.
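To give a sense of what looking for CpG islands involves, here is a minimal sliding-window sketch. It is not N-SCAN's model, which the article does not describe in detail. It uses the commonly cited rule of thumb that a window counts as island-like when its GC content is above 50 percent and its observed-to-expected CpG ratio is above 0.6 over at least 200 bases. The function names and default parameters are assumptions made for this example.

```python
from typing import List, Tuple


def cpg_stats(seq: str) -> Tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n
    expected = c_count * g_count / n  # CpGs expected if C and G were independent
    ratio = cpg_count / expected if expected > 0 else 0.0
    return gc_content, ratio


def find_cpg_windows(seq: str, window: int = 200, step: int = 50,
                     min_gc: float = 0.5, min_ratio: float = 0.6) -> List[int]:
    """Return start positions of windows that pass both thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_content, ratio = cpg_stats(seq[start:start + window])
        if gc_content > min_gc and ratio > min_ratio:
            hits.append(start)
    return hits


# Synthetic sequence: AT-rich flanks around a CG-rich block (illustrative only).
sequence = "AT" * 100 + "CG" * 100 + "AT" * 100
hits = find_cpg_windows(sequence)
print("island-like windows start at:", hits)
```

A real gene predictor would treat this kind of signal as one piece of evidence among many, weighed alongside splice-site signals and cross-species conservation, rather than as a hard threshold.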

Over the past two years, TWINSCAN has been finding and predicting genes in numerous genomes that other gene prediction systems have missed. The addition of N-SCAN to the handy system — it scans two genomes simultaneously, with potential to scan three or more — strengthens it for predicting both coding and non-coding DNA.

"Like any multiple choice question, if you can learn something about one of the choices, it helps you with the other one," Brent said. "By making this integrated model that looks for both kinds of exons in both parts of the gene, we’re able to convert a blind guessing game to a multiple choice question - is it a UTR exon or a protein-coding exon? These kinds of questions are easier to answer now."

Tony Fitzpatrick | EurekAlert!
Further information:
http://www.wustl.edu
