For years, scientists have struggled to decipher the genetic instruction book that details where and when the 20,000 genes in a human cell will be turned on or off. Different genes operate in each cell type at different times, and this careful orchestration is what ultimately distinguishes a brain cell from a liver or skin cell.
Now, scientists at Washington University School of Medicine in St. Louis report they have developed a model of gene expression in yeast that predicts with a high degree of accuracy whether a gene will be switched on or off. The study is now available in the advance online publication of Nature.
"A huge part of the magic in biology happens at the level of a cell deciding whether it is going to transcribe a gene or not," says senior author Barak Cohen, Ph.D., assistant professor of genetics. "We have found that just a few simple rules may underlie the complicated gene expression patterns that determine whether a particular gene will be expressed at high levels in one tissue and low levels in another tissue."
Since the discovery of DNA's double helical structure more than a half century ago, scientists have focused much of their attention on understanding the 2 percent of the genome that is made up of classic genes, which code for the production of proteins.
However, the instructions for turning these genes on or off are generally not in the genes themselves. Rather, they are buried in the 98 percent of the genome that was once cast aside as little more than genetic "junk."
"In theory, we should be able to read those instructions," Cohen explains. "A cell can look at a piece of DNA and know where and when to express a particular gene. But the fundamental question we looked at starts with the premise that scientists can't do that at all."
Researchers have known for some time that the instructions for controlling gene expression lie in short DNA sequences, called promoters, embedded in long, rambling stretches of DNA at the front of most genes. Proteins known as transcription factors bind to promoters to either activate a particular gene or shut down its activity.
But in reality, the regulation of a gene is far more complex. Each promoter can simultaneously bind a number of transcription factors, some of which work to jump-start a gene's activity and others to shut it down. Additionally, the binding sites in the promoter region tolerate DNA base substitutions, so the same transcription factor can bind with slightly different affinities, depending on the promoter's genetic sequence. Transcription factors then combine to create a net effect on gene activity that can be far greater or far less than expected, making it exceedingly difficult to quantify their influence on a gene.
The model Cohen and his colleagues developed boils the interactions of transcription factors with DNA and with each other down to a few simple rules. It takes into account only how tightly transcription factor proteins bind to DNA in the promoter region and how tightly transcription factors bind to each other. These simple rules can explain most of the variation in gene expression between different promoters.
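To make the idea of a binding-only model concrete, here is a minimal sketch of a generic thermodynamic occupancy calculation. It is an illustration under stated assumptions, not the equations from the study: the function name, the parameters, and the choice to treat predicted expression as the probability that RNA polymerase is bound are all hypothetical. Each promoter state is weighted by how tightly the bound factors hold the DNA and each other, and the weights are summed over every combination of occupied sites.

```python
from itertools import product

def polymerase_occupancy(site_weights, pol_weight, tf_pol, tf_tf=None):
    """Toy estimate of the probability that RNA polymerase is bound.

    All names and parameters are illustrative assumptions:
    site_weights -- statistical weight of each factor binding its site
                    (larger means tighter binding or more factor)
    pol_weight   -- weight of polymerase binding the promoter on its own
    tf_pol       -- per-site factor applied when that site's factor and
                    polymerase are both bound (>1 helps, <1 hinders)
    tf_tf        -- optional {(i, j): factor} for pairs of bound factors
    """
    tf_tf = tf_tf or {}
    z_total = 0.0  # sum of weights over all promoter states
    z_pol = 0.0    # sum of weights over states with polymerase bound
    for state in product((0, 1), repeat=len(site_weights)):
        w_tf = 1.0       # weight from factor-DNA and factor-factor binding
        w_contact = 1.0  # weight from factor-polymerase contacts
        for i, occupied in enumerate(state):
            if occupied:
                w_tf *= site_weights[i]
                w_contact *= tf_pol[i]
        for (i, j), factor in tf_tf.items():
            if state[i] and state[j]:
                w_tf *= factor
        w_on = w_tf * pol_weight * w_contact
        z_total += w_tf + w_on  # polymerase absent plus polymerase bound
        z_pol += w_on
    return z_pol / z_total

baseline = polymerase_occupancy([], 1.0, [])
activated = polymerase_occupancy([1.0], 1.0, [10.0])
repressed = polymerase_occupancy([1.0], 1.0, [0.1])
```

In this toy setup a single site whose factor favors polymerase raises the occupancy above the bare-promoter baseline, and a site whose factor disfavors polymerase lowers it. The enumeration doubles with every added site, which is manageable for promoters built from three or four building blocks.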
They then created 2,800 simple artificial promoters and asked if these rules were sufficient to understand the activity of these promoters. "Because if we can't boil it down and understand these complex interactions on simple artificial promoters, then there's no hope for understanding real promoters," Cohen says.
The scientists constructed promoters that consisted of random combinations of three or four transcription factor binding sites, or building blocks, using a total of 18 different building blocks. They then recorded the DNA sequence of each promoter, along with its corresponding gene expression. By incorporating sophisticated mathematical equations and statistical analysis, they could eventually predict, given a particular promoter sequence, whether it would activate or suppress gene activity.
The team determined that 65 percent of the complex variation in gene expression from one cell to the next could be explained by the simple rules that focus on the binding affinity of transcription factors.
When the investigators tested their model on real promoters in the genome of yeast, they confirmed that it could accurately predict how the binding site for the transcription factor Mig1 dampens gene expression. The model identified all 40 genes already known to be regulated by Mig1. But additionally, by incorporating information from weak binding sites that other models have not taken into account, they also uncovered another eight genes not previously known to be influenced by Mig1.
"That our model can incorporate information from weak binding sites is really important because gene expression can be influenced by very subtle interactions in the promoter regions," Cohen says. "No other model has been able to account for these subtle interactions."
Some scientists have suggested that biochemical processes, including enzymatic reactions, are more important than binding affinity, but Cohen says his model disputes that assertion.
"Our model answers the question: How do cells read the instructions for gene expression," Cohen says. "They are read mostly by the simple binding of transcription factors to DNA. This binding either recruits the enzyme RNA polymerase, which begins the process of copying and transferring information stored in the genes, or blocks it."
While Cohen is still perfecting the model, he says it may eventually enable scientists to determine where and when all the genes in the human genome will be expressed just by looking at the genetic code in the promoter region.
In addition, the model may help researchers engineer artificial promoters that drive embryonic stem cells toward a particular fate or that will turn on a gene in only a particular cell type. If scientists know the code that turns on a particular gene, then they could conceivably design an artificial promoter as a potential treatment for disease, Cohen says.
He and his group are now designing more complicated synthetic promoters similar to those that occur naturally in yeast and higher organisms to determine whether they can continue to accurately predict variations in gene expression.
The study was funded by the National Institutes of Health and the National Science Foundation.
Gertz J, Siggia ED, Cohen BA. Analysis of Combinatorial cis-Regulation in Synthetic and Genomic Promoters. Nature, online Nov. 23, 2008.
Washington University School of Medicine's 2,100 employed and volunteer faculty physicians also are the medical staff of Barnes-Jewish and St. Louis Children's hospitals. The School of Medicine is one of the leading medical research, teaching and patient care institutions in the nation, currently ranked third in the nation by U.S. News & World Report. Through its affiliations with Barnes-Jewish and St. Louis Children's hospitals, the School of Medicine is linked to BJC HealthCare.
Caroline Arbanas | Newswise Science News