Big data: It’s a term we read and hear about often, but one that is hard to grasp. Computer scientists at Washington University in St. Louis’ School of Engineering & Applied Science tackled some big data about an important protein and discovered its connection to human history as well as clues about its role in complex neurological diseases.
Through a novel method of analyzing these big data, Sharlee Climer, PhD, research assistant professor in computer science, and Weixiong Zhang, PhD, professor of computer science and of genetics at the School of Medicine, discovered a region encompassing the gephyrin gene on chromosome 14 that underwent rapid evolution after splitting in two completely opposite directions thousands of years ago. Those opposite directions, known as yin and yang, are still strongly evident across different populations of people around the world today.
Climer/Zhang, Washington University in St. Louis
Yin-yang haplotypes arise when a stretch of DNA evolves to present two divergent forms. A group of engineers at Washington University in St. Louis showed a massive yin-yang haplotype pair encompassing the gene gephyrin on human chromosome 14. This image shows the states for markers in the region for 934 individuals in eight global populations. Dark blue and red horizontal lines in the yin-yang region represent carriers of two yin and two yang haplotypes, respectively, and light blue represents carriers of both haplotypes.
The results of their research, done with Alan Templeton, PhD, the Charles Rebstock professor emeritus in the Department of Biology in the College of Arts & Sciences, appear in the March 27 issue of Nature Communications.
The gephyrin protein is a master regulator of receptors in the brain that transmit messages. Malfunction of the protein has been associated with epilepsy, Alzheimer’s disease, schizophrenia and other neurological diseases. Additionally, without gephyrin, our bodies are unable to synthesize an essential trace nutrient.
The research team used big data from the International HapMap Project, a public resource of genetic data from populations worldwide designed to help researchers find genes associated with human disease, as well as from the 1000 Genomes project, another public data source of sequenced human genomes. In total, they looked at the genetic data from 3,438 individuals.
When they analyzed the data, they made an interesting discovery in a sequence of markers, called a haplotype, enveloping the gephyrin gene: up to 80 percent of the haplotypes were perfect yin and yang types, or complete opposites of each other. They were able to trace the split back to what is known as the Ancestral haplotype, or that of the most recent common human ancestor.
“We observed that the Ancestral haplotype split into two distinct haplotypes and subsequently underwent rapid evolution, as each haplotype possesses about 140 markers that are different from the Ancestral haplotype,” Climer says. “These numerous mutations should have produced a large number of intermediate haplotypes, but the intermediates have almost entirely disappeared, and the divergent yin and yang haplotypes are prevalent in populations representing every major human ancestry.”
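To make "complete opposites" concrete, here is a minimal sketch that treats each haplotype as a string of two-allele markers and counts the positions where two haplotypes differ. The eight-marker strings and the function names are made up for illustration; they are not the researchers' data or code.

```python
def differences(hap_a, hap_b):
    """Count marker positions at which two haplotypes carry different alleles."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(a != b for a, b in zip(hap_a, hap_b))


def is_yin_yang_pair(hap_a, hap_b):
    """True when the two haplotypes differ at every marker."""
    return differences(hap_a, hap_b) == len(hap_a)


# Toy haplotypes over eight hypothetical markers (0 = one allele, 1 = the other).
ancestral = "00000000"
yin = "11110000"
yang = "00001111"
intermediate = "11000000"  # partway between the ancestral form and yin

pair_is_opposite = is_yin_yang_pair(yin, yang)
yin_changes = differences(ancestral, yin)
yang_changes = differences(ancestral, yang)
```

In this toy case the yin and yang strings disagree at every marker, and each has changed at half of the markers relative to the ancestral string, which mirrors the pattern described in the quote above on a much smaller scale.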
Using the data from the HapMap Project, they looked at the gephyrin region in several populations of people, including those of European, East and South Asian, and African heritage, and found variations in the haplotype frequencies of each of these populations. Those of African origin generally have more yang haplotypes, while those of European origin have more yin haplotypes. Those of Asian descent have nearly equal numbers of yin and yang haplotypes.
Humans carry pairs of chromosomes, and 30 percent of Japanese individuals carry two yin haplotypes or two yang haplotypes. Another 30 percent of these individuals possess both a yin and a yang haplotype, reflecting the roughly equal probability of inheriting either one.
To find this pattern within the huge datasets, the research team used a novel method to assess correlations between genetic markers called single nucleotide polymorphisms, or SNPs, which are variations in a DNA sequence that make humans different from each other.
The team’s method, called BlocBuster, computes correlations between each pair of SNPs, then builds a network of those correlations. By observing the network, researchers can see clusters of correlated markers.
“For example, you could build a Facebook network using all of your Facebook friends,” Climer says. “If two of your friends are friends with each other, you would connect them in the network. If you see that a cluster of people is interconnected with each other, they probably share something in common, such as a family relationship, a school, or some type of social interaction. Similarly, with an efficient algorithm and an adequate number of processors and time, we can look at every pair of SNPs, build these networks and observe clusters of interconnected SNPs.”
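The general idea of correlating every pair of SNPs, linking the strongly correlated ones, and reading off clusters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe BlocBuster's actual correlation measure or clustering step, so plain Pearson correlation and connected components stand in for them here, and the SNP names, genotype values and 0.8 threshold are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a marker with no variation carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlation_network(genotypes, threshold=0.8):
    """Link every pair of SNPs whose absolute correlation meets the threshold."""
    edges = {snp: set() for snp in genotypes}
    for a, b in combinations(genotypes, 2):
        if abs(pearson(genotypes[a], genotypes[b])) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def clusters(edges):
    """Group SNPs that are connected to each other, directly or indirectly."""
    seen, groups = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(edges[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy data: allele counts (0, 1 or 2) for six individuals at four hypothetical SNPs.
genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
    "snp_c": [2, 2, 1, 1, 0, 0],
    "snp_d": [1, 0, 2, 0, 1, 2],
}
network = correlation_network(genotypes)
snp_clusters = clusters(network)
```

In this toy example the first three SNPs end up in one cluster and the fourth stays on its own. Using the absolute value means that markers varying in exactly opposite patterns, such as the made-up snp_a and snp_c, are linked as well. Checking every pair grows quadratically with the number of SNPs, which is why the quote above mentions needing an efficient algorithm and enough processors.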
“The BlocBuster approach is a paradigm shift from the conventional methods for genome-wide association studies, or popularly known as GWAS, where one or a few markers were examined at a time,” Zhang says. “It is truly a data mining technique for big data like those from HapMap and 1000 Genomes projects.”
The researchers also can design this approach to look at complex traits and diseases.
“BlocBuster is able to detect combinations of networked genetic markers that are characteristic of complex traits,” Zhang says. “It is suitable for analyzing traits, such as body weights, which are determined by multiple genetic factors, and genetic patterns in populations, such as the yin-yang haplotypes we discovered.”
Ultimately, they expect this method will shed light on the genetic roots of disease.
“Most complex diseases arise due to a group of genetic variations interacting together,” Climer says. “Different groups of people who get a disease may be affected by different groups of variations. There’s not enough power to see most of these intricate associations when looking at single markers one at a time. We’re taking a combinatorial approach — looking at combinations of markers together — and we’re able to see the patterns.”
Climer S, Templeton A, Zhang W. Human gephyrin is encompassed within giant functional noncoding yin-yang sequences. Nature Communications 6:6534 (March 27, 2015). doi: 10.1038/ncomms7534.
Funding for this research was provided by the National Institutes of Health (P50-GM65509, RC1-AR058681, R01-GM086412 and R-01-GM100364), the National Science Foundation (DBI-0743797) and the municipal government of Wuhan, Hubei, China.
The School of Engineering & Applied Science at Washington University in St. Louis focuses intellectual efforts through a new convergence paradigm and builds on strengths, particularly as applied to medicine and health, energy and environment, entrepreneurship and security. With 91 tenured/tenure-track and 40 additional full-time faculty, 1,300 undergraduate students, more than 900 graduate students and more than 23,000 alumni, we are working to leverage our partnerships with academic and industry partners — across disciplines and across the world — to contribute to solving the greatest global challenges of the 21st century.
Asst Vice Chancellor for Campus Communications
Julie Flory | newswise