Scientists at the University of California, Davis, and Lawrence Livermore National Laboratory have developed a powerful computing tool that lets researchers extract features and patterns from enormously large and complex sets of raw data. The tool – a set of problem-solving calculations known as an algorithm – is compact enough to run on computers with as little as two gigabytes of memory.
The team that developed this algorithm has already used it to probe a slew of phenomena represented by billions of data points, including analyzing and creating images of flame surfaces; searching for clusters and voids in a virtual universe experiment; and identifying and tracking pockets of fluid in a simulated mixing of two fluids.
"What we've developed is a workable system of handling any data in any dimension," said Attila Gyulassy, who led the five-year development effort while pursuing a PhD in computer science at UC Davis. "We expect this algorithm will become an integral part of a scientist's toolbox to answer questions about data."
A paper describing the new algorithm was published in the November-December issue of IEEE Transactions on Visualization and Computer Graphics.
Computers are widely used to perform simulations of real-world phenomena and to capture results of physical experiments and observations, storing this information as collections of numbers. But as the size of these data sets has burgeoned, hand-in-hand with computer capacity, analysis has grown increasingly difficult.
A mathematical tool to extract and visualize useful features from data sets has existed for nearly 40 years – in theory. Called the Morse-Smale complex, it partitions sets by similarity of features and encodes them into mathematical terms. But working with the Morse-Smale complex is not easy. "It's a powerful language. But a cost of that, is that using it meaningfully for practical applications is very difficult," Gyulassy said.
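To make the idea of partitioning concrete, here is a minimal sketch on a one-dimensional list of numbers. It is an illustration only, not the team's software. It assumes the standard description of a Morse-Smale complex, in which samples are grouped by the minimum their steepest-descent path ends at and the maximum their steepest-ascent path ends at. The function names `follow` and `morse_smale_labels` are hypothetical.

```python
from typing import List, Tuple


def follow(values: List[float], i: int, uphill: bool) -> int:
    """Walk from index i to the steepest neighbour until none is higher (or lower)."""
    sign = 1 if uphill else -1
    while True:
        best = i
        for j in (i - 1, i + 1):
            # Compare against the best candidate so far, so the steeper side wins.
            if 0 <= j < len(values) and sign * values[j] > sign * values[best]:
                best = j
        if best == i:
            return i  # no neighbour improves: i is a local extremum
        i = best


def morse_smale_labels(values: List[float]) -> List[Tuple[int, int]]:
    """Label each sample with (index of its minimum, index of its maximum)."""
    return [
        (follow(values, i, uphill=False), follow(values, i, uphill=True))
        for i in range(len(values))
    ]


if __name__ == "__main__":
    field = [3, 1, 4, 6, 2, 5]
    for index, label in enumerate(morse_smale_labels(field)):
        print(index, field[index], label)
```

Samples that share a label belong to the same cell of this toy partition. On real two- and three-dimensional data the construction is far more involved, which is the practical difficulty Gyulassy describes.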
Gyulassy's algorithm divides data sets into parcels of cells, then analyzes each parcel separately using the Morse-Smale complex. Results of those computations are then merged together. As new parcels are created from merged parcels, they are analyzed and merged yet again. At each step, data that do not need to be stored in memory are discarded, drastically reducing the computing power required to run the calculations.
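The divide, analyze, merge and discard pattern can be sketched in a few lines. The example below is a simplified stand-in, not the published algorithm: instead of a Morse-Smale complex it finds strict local minima and maxima in a one-dimensional list. All names (`Summary`, `analyze`, `merge`, `extract_extrema`) and the parcel size are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Summary:
    """What is kept of a parcel after analysis; interior samples are discarded."""
    start: int                       # global index of the parcel's first sample
    length: int                      # number of samples the parcel covers
    head: Tuple[float, ...]          # first two samples
    tail: Tuple[float, ...]          # last two samples
    extrema: List[Tuple[int, str]]   # (global index, "min" or "max")


def classify(left: float, mid: float, right: float) -> Optional[str]:
    if mid > left and mid > right:
        return "max"
    if mid < left and mid < right:
        return "min"
    return None


def analyze(values: Sequence[float], start: int) -> Summary:
    """Classify the interior samples of one parcel (needs at least 2 samples)."""
    assert len(values) >= 2
    extrema = []
    for i in range(1, len(values) - 1):
        kind = classify(values[i - 1], values[i], values[i + 1])
        if kind:
            extrema.append((start + i, kind))
    return Summary(start, len(values), tuple(values[:2]), tuple(values[-2:]), extrema)


def merge(a: Summary, b: Summary) -> Summary:
    """Join two adjacent parcels, resolving the two samples at the seam."""
    assert a.start + a.length == b.start
    seam = []
    kind = classify(a.tail[0], a.tail[1], b.head[0])   # last sample of a
    if kind:
        seam.append((b.start - 1, kind))
    kind = classify(a.tail[1], b.head[0], b.head[1])   # first sample of b
    if kind:
        seam.append((b.start, kind))
    return Summary(a.start, a.length + b.length, a.head, b.tail,
                   a.extrema + seam + b.extrema)


def extract_extrema(values: Sequence[float], parcel_size: int = 4) -> List[Tuple[int, str]]:
    assert parcel_size >= 2 and len(values) >= 2
    starts = list(range(0, len(values), parcel_size))
    if len(starts) > 1 and len(values) - starts[-1] < 2:
        starts.pop()  # fold a one-sample remainder into the previous parcel
    ends = starts[1:] + [len(values)]
    parcels = [analyze(values[s:e], s) for s, e in zip(starts, ends)]
    while len(parcels) > 1:  # merge neighbouring parcels pairwise, round by round
        merged = [merge(parcels[i], parcels[i + 1]) for i in range(0, len(parcels) - 1, 2)]
        if len(parcels) % 2:
            merged.append(parcels[-1])
        parcels = merged
    return parcels[0].extrema


if __name__ == "__main__":
    print(extract_extrema([0, 2, 1, 3, 5, 4, 4, 6, 2, 7], parcel_size=3))
```

Each summary carries only four boundary samples plus the features found so far, so the memory held after a merge does not grow with the number of raw samples the parcel covers. That is the toy counterpart of discarding data that no longer need to be stored.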
In one test, Gyulassy used the algorithm to analyze and track the formation and movement of pockets of fluid in the simulated mixing of two fluids: one dense, one light. The data set is vast: it consists of more than one billion data points on a three-dimensional grid, enough to challenge even supercomputers, Gyulassy said. Yet the new algorithm, with its streamlining features, was able to perform the analysis on a laptop computer with just two gigabytes of memory. Although Gyulassy had to wait nearly 24 hours for the little machine to complete its calculations, at the end of this process he could pull up images in mere seconds to illustrate phenomena he was interested in, such as the branching of fluid pockets in the mixture.
Two main factors are driving the need for analysis of large data sets, said co-author Bernd Hamann: a surge in the use of powerful computers that can produce huge amounts of data, and an upswing in affordability and availability of sensing devices that researchers deploy in the field and lab to collect a profusion of data.
"Our data files are becoming larger and larger, while the scientist has less and less time to understand them," said Hamann, a professor of computer science and associate vice chancellor for research at UC Davis. "But what are the data good for if we don't have the means of applying mathematically sound and computationally efficient computer analysis tools to look for what is captured in them?"
Gyulassy is currently developing software that will allow others to put the algorithm to use. He expects the learning curve to be steep for this open-source product, "but if you just learn the minimal amount about what a Morse-Smale complex is," he said, "it will be pretty intuitive."
Liese Greensfelder | EurekAlert!