Big Data is everywhere, and we are constantly told that it holds the answers to almost any problem we want to solve. Companies collect information on how we shop, doctors and insurance companies gather our medical test results, and governments compile logs of our phone calls and emails. In each instance, the hope is that critical insights are hidden deep within massive amounts of information, just waiting to be discovered.
But simply having lots of data is not the same as understanding it. Increasingly, new mathematical tools are needed to extract meaning from enormous data sets. In work published online today, two researchers at Cold Spring Harbor Laboratory (CSHL) now challenge the most recent advances in this field, using a classic mathematical concept to tackle the outstanding problems in Big Data analysis.
What does it mean to analyze Big Data? A major goal is to find patterns between seemingly unrelated quantities, such as income and cancer rates. Many of the most common statistical tools are only able to detect patterns if the researcher has some expectation about the relationship between the quantities. Part of the lure of Big Data is that it may reveal entirely new, unexpected patterns. Therefore, scientists and researchers have worked to develop statistical methods that will uncover these novel relationships.
In 2011, a distinguished group of researchers from Harvard University published a highly influential paper in the journal Science that advanced just such a tool. But in a paper published today in Proceedings of the National Academy of Sciences, CSHL Quantitative Biology Fellow Justin Kinney and CSHL Assistant Professor Gurinder "Mickey" Atwal demonstrate that this new tool is critically flawed. "Their statistical tool does not have the mathematical properties that were claimed," says Kinney.
Kinney and Atwal show that the correct tool was hiding in plain sight all along. The solution, they say, is a well-known mathematical measure called "mutual information," first described in 1948. It was initially used to quantify the amount of information that could be transmitted electronically through a telephone cable; the concept now underlies the design of the world's telecommunications infrastructure. "What we've found in our work is that this same concept can also be used to find patterns in data," Kinney explains.
Applied to Big Data, mutual information is able to reveal patterns in large lists of numbers. For instance, it can be used to analyze patterns in data sets on the numerous bacterial species that help us digest food. "This particular tool is perfect for finding patterns in studies of the human microbiome, among many other things," Kinney says.
Importantly, mutual information provides a way of identifying all types of patterns within the data without reliance upon any prior assumptions. "Our work shows that mutual information very naturally solves this critical problem in statistics," Kinney says. "This beautiful mathematical concept has the potential to greatly benefit modern data analysis, in biology and many other important fields."
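To make the idea concrete, the following is a minimal sketch in Python, not the estimator used by Kinney and Atwal. It assumes synthetic data in which one quantity is the square of the other plus a little noise, and it uses a naive histogram ("plug-in") estimate of mutual information. The function names (pearson, bin_index, mutual_information), the bin count, and the sample size are all illustrative choices. The sketch compares the ordinary correlation coefficient, which only detects straight-line relationships, with mutual information, which does not assume any particular shape.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient; it measures linear association only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def bin_index(v, lo, hi, bins):
    """Map a value in [lo, hi] to one of `bins` equal-width bins."""
    if hi == lo:
        return 0
    return min(int((v - lo) / (hi - lo) * bins), bins - 1)


def mutual_information(xs, ys, bins=10):
    """Naive plug-in estimate of I(X;Y) in bits from a 2-D histogram.

    Assumption: equal-width bins; this simple estimator is biased upward
    for small samples and is used here only for illustration.
    """
    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    pairs = [
        (bin_index(x, x_lo, x_hi, bins), bin_index(y, y_lo, y_hi, bins))
        for x, y in zip(xs, ys)
    ]
    joint = Counter(pairs)
    px = Counter(i for i, _ in pairs)
    py = Counter(j for _, j in pairs)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[i] * py[j]))
    return mi


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y_dependent = [v * v + rng.gauss(0.0, 0.05) for v in x]      # curved relationship
y_independent = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # no relationship

r_dep = pearson(x, y_dependent)
mi_dep = mutual_information(x, y_dependent)
mi_ind = mutual_information(x, y_independent)

print(f"correlation, curved relationship:        {r_dep:.3f}")
print(f"mutual information, curved relationship: {mi_dep:.3f} bits")
print(f"mutual information, unrelated data:      {mi_ind:.3f} bits")
```

In this sketch the correlation coefficient for the curved relationship comes out close to zero, because the pattern is symmetric rather than linear, while the mutual information estimate is clearly larger than the value obtained for unrelated data. Estimating mutual information well from limited data is a subtle problem in its own right, and simple binning is only one rough approach.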
The research described here was supported by the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory.
"Equitability, mutual information, and the maximal information coefficient" appears online in PNAS on February 17, 2014. The authors are Justin Block Kinney and Gurinder Singh Atwal. The paper can be obtained online at: http://www.pnas.org/content/early/2014/02/14/1309933111.abstract
Jaclyn Jansen | EurekAlert!