Two USC scientists have developed an algorithm that could help make DNA sequencing affordable enough for clinics – and could be useful to researchers of all stripes.
Andrew Smith, a computational biologist at the USC Dornsife College of Letters, Arts and Sciences, developed the algorithm along with USC graduate student Timothy Daley to help predict the value of sequencing more DNA, to be published in Nature Methods on February 24.
Extracting information from the DNA means deciding how much to sequence: sequencing too little and you may not get the answers you are looking for, but sequence too much and you will waste both time and money. That expensive gamble is a big part of what keeps DNA sequencing out of the hands of clinicians. But not for long, according to Smith.
"It seems likely that some clinical applications of DNA sequencing will become routine in the next five to 10 years," Smith said. "For example, diagnostic sequencing to understand the properties of a tumor will be much more effective if the right mathematical methods are in place."
The beauty of Smith and Daley's algorithm, which predicts the size and composition of an unseen population based on a small sample, lies in its broad applicability.
"This is one of those great instances where a specific challenge in our research led us to uncover a powerful algorithm that has surprisingly broad applications," Smith said.
Think of it: how often do scientists need to predict what they haven't seen based on what they have? Public health officials could use the algorithm to estimate the population of HIV positive individuals; astronomers could use it to determine how many exoplanets exist in our galaxy based on the ones they have already discovered; and biologists could use it to estimate the diversity of antibodies in an individual.
The mathematical underpinnings of the algorithm rely on a model of sampling from ecology known as capture-recapture. In this model, individuals are captured and tagged so that a recapture of the same individual will be known – and the number of times each individual was captured can be used to make inferences about the population as a whole.
In this way scientists can estimate, for example, the number of gorillas remaining in the wild. In DNA sequencing, the individuals are the various different genomic molecules in a sample. However, the mathematical models used for counting gorillas don't work on the scale of DNA sequencing.
"The basic model has been known for decades, but the way it has been used makes it highly unstable in most applications. We took a different approach that depends on lots of computing power and seems to work best in large-scale applications like modern DNA sequencing," Daley said.
Scientists faced a similar problem in the early days of the human genome sequencing project. A mathematical solution was provided by Michael Waterman of USC, in 1988, which found widespread use. Recent advances in sequencing technology, however, require thinking differently about the mathematical properties of DNA sequencing data.
"Huge data sets required a novel approach. I'm very please it was developed here at USC," said Waterman.
This research was funded by grants from the National Institutes of Health National Human Genome Research Institute (R01 HG005238 and P50 HG002790).
Robert Perkins | EurekAlert!
Zebrafish's near 360 degree UV-vision knocks stripes off Google Street View
22.06.2018 | University of Sussex
New cellular pathway helps explain how inflammation leads to artery disease
22.06.2018 | Cedars-Sinai Medical Center
In a recent publication in the renowned journal Optica, scientists of Leibniz-Institute of Photonic Technology (Leibniz IPHT) in Jena showed that they can accurately control the optical properties of liquid-core fiber lasers and therefore their spectral band width by temperature and pressure tuning.
Already last year, the researchers provided experimental proof of a new dynamic of hybrid solitons– temporally and spectrally stationary light waves resulting...
Scientists from the University of Freiburg and the University of Basel identified a master regulator for bone regeneration. Prasad Shastri, Professor of...
Moving into its fourth decade, AchemAsia is setting out for new horizons: The International Expo and Innovation Forum for Sustainable Chemical Production will take place from 21-23 May 2019 in Shanghai, China. With an updated event profile, the eleventh edition focusses on topics that are especially relevant for the Chinese process industry, putting a strong emphasis on sustainability and innovation.
Founded in 1989 as a spin-off of ACHEMA to cater to the needs of China’s then developing industry, AchemAsia has since grown into a platform where the latest...
The BMBF-funded OWICELLS project was successfully completed with a final presentation at the BMW plant in Munich. The presentation demonstrated a Li-Fi communication with a mobile robot, while the robot carried out usual production processes (welding, moving and testing parts) in a 5x5m² production cell. The robust, optical wireless transmission is based on spatial diversity; in other words, data is sent and received simultaneously by several LEDs and several photodiodes. The system can transmit data at more than 100 Mbit/s and five milliseconds latency.
Modern production technologies in the automobile industry must become more flexible in order to fulfil individual customer requirements.
An international team of scientists has discovered a new way to transfer image information through multimodal fibers with almost no distortion - even if the fiber is bent. The results of the study, to which scientist from the Leibniz-Institute of Photonic Technology Jena (Leibniz IPHT) contributed, were published on 6thJune in the highly-cited journal Physical Review Letters.
Endoscopes allow doctors to see into a patient’s body like through a keyhole. Typically, the images are transmitted via a bundle of several hundreds of optical...
13.06.2018 | Event News
08.06.2018 | Event News
05.06.2018 | Event News
22.06.2018 | Materials Sciences
22.06.2018 | Earth Sciences
22.06.2018 | Life Sciences