The world has gone digital in just about everything we do. Almost every iota of information we access these days is stored in some kind of digital form and accessed electronically -- text, charts, images, video, music, you name it. The key questions are: Will your data be there when you need it? And who’s going to preserve it?
In the December 2008 edition of Communications of the ACM, the monthly magazine of the Association for Computing Machinery, Dr. Fran Berman, director of the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, provides a guide for surviving what has become known as the “data deluge.”
Managing this deluge and preserving what’s important is what Berman refers to as one of the “grand challenges” of the Information Age. The amount of digital data is immense: A 2008 report by the International Data Corporation (IDC), a global provider of information technology intelligence based in Framingham, Mass., predicts that by 2011, our “digital universe” will be 10 times the size it was in 2006 - and almost half of this universe will not have a permanent home as the amount of digital information outstrips storage space.
“As a society, we have only begun to address this challenge at a scale concomitant with the deluge of data available to us and its importance in the modern world,” writes Berman, a longtime pioneer in cyberinfrastructure – an open but organized aggregate of information technologies including computers, data archives, networks, software, digital instruments, and other scientific endeavors that support 21st century life and work.
Berman is a strong advocate of cyberinfrastructure that supports the management and preservation of digital data in the Information Age – data cyberinfrastructure: “Just like the physical infrastructures all around us -- roads, bridges, water and electricity – we need a data cyberinfrastructure that is stable, predictable, and cost-effective.”
In her article, Berman explores key trends and issues associated with preserving digital data, and what’s required to keep it manageable, accessible, available, and secure. However, she warns that there is no “one-size-fits-all” solution for data stewardship and preservation.
Berman’s ACM article closes with a set of “Top 10” guidelines for data stewardship:
1. Make a plan. Create an explicit strategy for stewardship and preservation for your data, from its inception to the end of its lifetime; explicitly consider what that lifetime may be.
2. Be aware of data costs and include them in your overall IT budget. Ensure that all costs are factored in, including hardware, software, expert support, and time. Determine whether it is more cost-effective to regenerate some of your information rather than preserve it over a long period.
3. Associate metadata with your data. Metadata is needed to be able to find and use your data immediately and for years to come. Identify relevant standards for data/metadata content and format, following them to ensure the data can be used by others.
4. Make multiple copies of valuable data. Store some of them off-site and in different systems.
5. Plan for the transition of digital data to new storage media ahead of time. Include budgetary planning for new storage and software technologies, file format migrations, and time. Migration must be an ongoing process. Migrate data to new technologies before your storage media become obsolete.
6. Plan for transitions in data stewardship. If the data will eventually be turned over to a formal repository, institution, or other custodial environment, ensure it meets the requirements of the new environment and that the new steward indeed agrees to take it on.
7. Determine the level of “trust” required when choosing how to archive data. Are the resources of the U.S. National Archives and Records Administration necessary or will Google do?
8. Tailor plans for preservation and access to the expected use. Gene-sequence data used daily by hundreds of thousands of researchers worldwide may need a different preservation and access infrastructure from, for example, digital photos viewed occasionally by family members.
9. Pay attention to security. Be aware of what you must do to maintain the integrity of your data.
10. Know the regulations. Know whether copyright, the Health Insurance Portability and Accountability Act of 1996, the Sarbanes-Oxley Act of 2002, the U.S. National Institutes of Health publishing expectations, or other policies and/or regulations are relevant to your data, ensuring your approach to stewardship and publication is compliant.
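As a rough illustration of how guidelines 3, 4, and 9 might look in practice, the following Python sketch copies a file to several locations, writes a small metadata record beside each copy, and later checks each copy against its recorded checksum. It is not taken from Berman's article: the function names (`preserve`, `verify`), the `.meta.json` sidecar convention, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve(source: Path, replica_dirs: list[Path], description: str) -> dict:
    """Copy source into each replica directory and write a metadata sidecar.

    Hypothetical helper: the sidecar layout is an assumption for illustration.
    """
    checksum = sha256_of(source)
    record = {
        "name": source.name,
        "description": description,
        "sha256": checksum,
        "size_bytes": source.stat().st_size,
    }
    for directory in replica_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)
        # Confirm the copy is intact before trusting it as a replica.
        if sha256_of(copy) != checksum:
            raise IOError(f"copy in {directory} does not match the source")
        sidecar = directory / (source.name + ".meta.json")
        sidecar.write_text(json.dumps(record, indent=2))
    return record


def verify(replica_dirs: list[Path], name: str) -> list[Path]:
    """Return the replica directories whose copy no longer matches its sidecar."""
    damaged = []
    for directory in replica_dirs:
        record = json.loads((directory / (name + ".meta.json")).read_text())
        if sha256_of(directory / name) != record["sha256"]:
            damaged.append(directory)
    return damaged
```

In a real setting the replica directories would sit on different systems, with at least one off-site, and `verify` would run on a schedule so that a damaged copy can be restored from a healthy one.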
Berman is a national leader in this area and also co-chairs the Blue Ribbon Task Force on Sustainable Digital Preservation and Access with OCLC economist Brian Lavoie. The task force was formed late last year to explore and ultimately present a range of economic models, components, and actionable recommendations for sustainable preservation and access of digital data in the public interest. Commissioned for two years, the task force will publish an interim report outlining economic issues and systemic challenges associated with digital preservation later this month on its website.
For Berman’s full Communications of the ACM article, please see: http://www.sdsc.edu/about/director/pubs/communications200812-DataDeluge.pdf
Jan Zverina | Newswise Science News