A new study co-authored by University of Florida researchers shows about 70 percent of published genetic sequence comparisons are not publicly accessible, leaving researchers worldwide unable to get to critical data they may need to tackle a host of problems ranging from climate change to disease control.
Scientists are using the genetic data to construct the largest open-access tree of life as part of the National Science Foundation's $5.6-million Assembling, Visualizing and Analyzing the Tree of Life project. Understanding organismal relationships is increasingly valuable for tracking the origin and spread of emerging diseases, creating agricultural and pharmaceutical products, studying climate change, controlling invasive species and establishing plans for conservation and ecosystem restoration.
The study appearing today in PLoS Biology describes a significant challenge for the project, which is expected to produce an initial draft tree by the end of the year. It highlights the need for developing more effective methods for storing data for long-term use and urges journals to adopt more stringent data-sharing policies.
"I think what we need is a major change in our mindset about just how important it is to deposit your data – this has to be a standard part of what we do," said co-author Doug Soltis, a distinguished professor at the Florida Museum of Natural History on the UF campus and UF's biology department. "Because if it's not there, it's lost forever. These are really, really important for long-term use, as we're seeing now in our efforts to build a tree."
Estimates of the amount of missing data were based on 7,539 peer-reviewed studies about animals, fungi, seed plants, bacteria and various microscopic organisms. Soltis said the missing genetic data has required project collaborators to contact hundreds of researchers to request information, or attempt to reproduce the sequence alignments and analyses, which is extremely labor intensive.
"There are ambiguities with the alignments, you have to make certain judgment calls, and so an alignment that I do is not going to be the same as an alignment that somebody else does," said lead author Bryan Drew, a postdoctoral researcher in UF's biology department. "It's hard to assess a publication's validity in a lot of cases if you don't have access to the alignments. To me, that's the biggest problem with all of this."
Challenges include complicated mechanisms for uploading data and inconsistencies between journals – some require or strongly recommend data be stored in an online database and others do not, Drew said. The most widely used, publicly accessible databases include GenBank, TreeBASE and Dryad. Most journals require DNA sequences be deposited in GenBank, but comparatively few require the sequence alignments to be publicly archived. When study co-authors emailed researchers to obtain missing information, a majority did not respond, and the co-authors were rarely successful in retrieving the data.
"A lot of the authors I contacted said their data was in TreeBASE, but they were unaware of the next step needed after acceptance by the journal – the researchers didn't know they had to go back into TreeBASE and actually make the data available to the public," Drew said.
Elizabeth Kellogg, a professor in the department of biology at the University of Missouri-St. Louis who was not involved with the study, said she is not surprised about the large amount of missing information.
"They're absolutely right that when people are publishing papers, you want to document your results as much as you can," Kellogg said. "But many journals aren't requiring that extra step, so some researchers are only submitting the minimum to have their studies published. "There are databases for archiving, but some of their interfaces are somewhat cumbersome, and if you haven't previously done this, it can appear to be a daunting task."
Study co-authors include Romina Gazis and David Hibbett of Clark University, Patricia Cabezas of Brigham Young University, Kristen Swithers and Laura A. Katz of Smith College, Keith Crandall of George Washington University, and Jiabin Deng and Roseanna Rodriguez of UF. The National Evolutionary Synthesis Center in Durham, N.C., is leading NSF's "Open Tree of Life" project. For more information visit http://www.opentreeoflife.org