A new study finds that, even in a field with clear standards and online databases, the rate of public data archiving in cancer research is increasing only slowly. Furthermore, studies involving cancer and human subjects are less likely than other research studies to make their datasets available for reuse.
The results come from a study of patterns of research data availability conducted by Dr Heather Piwowar of the National Evolutionary Synthesis Center.
Data collected in scientific research is often useful for future studies by other investigators, but scientists have rarely made their raw research data widely available. Tools and initiatives are underway to encourage scientists to publicly archive their data. This analysis confirms there is still much room for improvement.
By querying the full text of the scientific literature through websites like Google Scholar and PubMed Central, Piwowar identified eleven thousand studies that collected a particular type of data about cellular activity, called gene expression microarray data. Only 45% of recent gene expression studies were found to have deposited their data in the public databases developed for this purpose. The rate of data publication has increased only slightly from 2007 to 2009. Data is shared least often from studies on cancer and human subjects: cancer studies make their data available for wide reuse half as often as similar studies outside cancer.
"It was disheartening to discover that studies on cancer and human subjects were least likely to make their data available. These data are surely some of the most valuable for reuse, to confirm, refute, inform and advance bench-to-bedside translational research," Piwowar said.
"We want as much scientific progress as we can get from our tax and charity dollars. This requires increased access to data resources. Data can be shared while maintaining patient privacy," Piwowar added, noting that patient re-identification is rarely an issue for gene expression microarray studies.
Most likely to share their data in public databases were investigators from Stanford University and those who published in the journal Physiological Genomics.
Scientists sometimes email each other to request datasets that aren't available online, but these requests often go unanswered or are denied by the original investigators. Publishing data in online data repositories is considered the best way to share data for future reuse.
Recent policies by the NSF seek to increase the amount of data disseminated from federally funded research by requiring data management and dissemination plans in all new grant applications.
The findings were published July 13th in the open access journal PLoS ONE.
CITATION: Piwowar, H. (2011). "Who shares? Who doesn't? Factors associated with openly archiving raw research data." PLoS ONE 6(7): e18657. doi:10.1371/journal.pone.0018657
In the spirit of the topic, the raw data behind this study are publicly available in the Dryad Digital Repository at http://dx.doi.org/10.5061/dryad.mf1sd.
The National Evolutionary Synthesis Center (NESCent) is a nonprofit science center dedicated to cross-disciplinary research in evolution. Funded by the National Science Foundation, NESCent is jointly operated by Duke University, The University of North Carolina at Chapel Hill, and North Carolina State University. For more information about research and training opportunities at NESCent, visit www.nescent.org.
Heather Piwowar | EurekAlert!