Forum for Science, Industry and Business

UF scientists encounter holes in tree of life, push for better data storage

When it comes to public access, the tree of life has holes.

A new study co-authored by University of Florida researchers shows about 70 percent of published genetic sequence comparisons are not publicly accessible, leaving researchers worldwide unable to get to critical data they may need to tackle a host of problems ranging from climate change to disease control.

Scientists are using the genetic data to construct the largest open-access tree of life as part of the National Science Foundation's $5.6-million Assembling, Visualizing and Analyzing the Tree of Life project. Understanding organismal relationships is increasingly valuable for tracking the origin and spread of emerging diseases, creating agricultural and pharmaceutical products, studying climate change, controlling invasive species and establishing plans for conservation and ecosystem restoration.

The study appearing today in PLoS Biology describes a significant challenge for the project, which is expected to produce an initial draft tree by the end of the year. It highlights the need for developing more effective methods for storing data for long-term use and urges journals to adopt more stringent data-sharing policies.

"I think what we need is a major change in our mindset about just how important it is to deposit your data – this has to be a standard part of what we do," said co-author Doug Soltis, a distinguished professor at the Florida Museum of Natural History on the UF campus and in UF's biology department. "Because if it's not there, it's lost forever. These are really, really important for long-term use, as we're seeing now in our efforts to build a tree."

Estimates of the amount of missing data were based on 7,539 peer-reviewed studies about animals, fungi, seed plants, bacteria and various microscopic organisms. Soltis said the missing genetic data has required project collaborators to contact hundreds of researchers to request information, or attempt to reproduce the sequence alignments and analyses, which is extremely labor intensive.

"There are ambiguities with the alignments, you have to make certain judgment calls, and so an alignment that I do is not going to be the same as an alignment that somebody else does," said lead author Bryan Drew, a postdoctoral researcher in UF's biology department. "It's hard to assess a publication's validity in a lot of cases if you don't have access to the alignments. To me, that's the biggest problem with all of this."

Challenges include complicated mechanisms for uploading data and inconsistencies between journals – some require or strongly recommend data be stored in an online database and others do not, Drew said. The most widely used, publicly accessible databases include GenBank, TreeBASE and Dryad. Most journals require DNA sequences be deposited in GenBank, but comparatively few require the sequence alignments to be publicly archived. When study co-authors emailed researchers to obtain missing information, a majority did not respond, and the co-authors were rarely successful in retrieving the data.

"A lot of the authors I contacted said their data was in TreeBASE, but they were unaware of the next step needed after acceptance by the journal – the researchers didn't know they had to go back into TreeBASE and actually make the data available to the public," Drew said.

Elizabeth Kellogg, a professor in the department of biology at the University of Missouri-St. Louis who was not involved with the study, said she is not surprised about the large amount of missing information.

"They're absolutely right that when people are publishing papers, you want to document your results as much as you can," Kellogg said. "But many journals aren't requiring that extra step, so some researchers are only submitting the minimum to have their studies published. There are databases for archiving, but some of their interfaces are somewhat cumbersome, and if you haven't previously done this, it can appear to be a daunting task."

Study co-authors include Romina Gazis and David Hibbett of Clark University, Patricia Cabezas of Brigham Young University, Kristen Swithers and Laura A. Katz of Smith College, Keith Crandall of George Washington University, and Jiabin Deng and Roseanna Rodriguez of UF. The National Evolutionary Synthesis Center in Durham, N.C., is leading NSF's "Open Tree of Life" project. For more information visit

Doug Soltis | EurekAlert!