To search through their vast volumes of data, companies such as Facebook, eBay, Yahoo, and Twitter use the freely available Hadoop software, an open-source counterpart of the MapReduce model originally proposed by Google. Several database experts, however, criticize Hadoop as inefficient.
Computer scientists from Saarland University (Germany) now propose a new system named Hadoop++. It allows users to search through very large datasets much faster than before: Hadoop++ outperforms Hadoop by up to a factor of 20.
Internet companies need to process data volumes on the order of millions of gigabytes (petabytes) on a daily basis. To search through this data effectively, Google proposed the MapReduce programming model. MapReduce divides the input data into smaller chunks that are then distributed over a large network of machines and processed in parallel. The open-source counterpart of MapReduce is called Hadoop. Even though Google holds a patent on MapReduce, it granted a license to Hadoop, so companies may still use Hadoop free of charge. However, "Database experts who are fluent in SQL consider MapReduce a major step backwards towards the database stone age," explains Jens Dittrich, Professor of Information Systems at Saarland University. He adds, "MapReduce disregards considerable wisdom from database research. As a consequence MapReduce is often slow and inefficient."
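The chunk-and-process-in-parallel idea behind MapReduce can be illustrated with a minimal sketch. It is not Hadoop code: it runs in a single process, threads stand in for the machines of a cluster, and all function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) are hypothetical names chosen for this illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, chunk_size):
    """Divide the input into smaller chunks, one per worker task."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map phase: count the words of a single chunk, independently of all others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce phase: merge two partial results into one."""
    return left + right


def word_count(records, chunk_size=2, workers=4):
    chunks = split_into_chunks(records, chunk_size)
    # Assumption: threads play the role of cluster machines in this toy version.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    lines = ["hadoop runs mapreduce", "mapreduce splits data", "data data data"]
    print(word_count(lines))
```

Because each chunk is mapped without reference to any other chunk, the map phase can in principle be spread over as many workers as there are chunks; only the reduce phase has to combine results.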
Despite this criticism, Hadoop has gained considerable attention from both industry and academia in recent years and is very popular among programmers throughout the world. Professor Dittrich explains why: "The reason is its ease of use: the user neither has to learn a complex database language nor a data model. Furthermore, Hadoop is relatively easy to administrate. In summary, Hadoop allows even database-illiterate people to search through billions of records on very large computer clusters."
However, this comes at a price: "When compared to modern relational database management systems, Hadoop is just too slow." The researcher and his team at Saarland University have therefore developed a new system, named Hadoop++, which aims to eliminate the performance deficiencies encountered in Hadoop. The creativity of the new approach lies in how the problem is tackled: Hadoop++ works similarly to a "Trojan", i.e. a malicious program that infiltrates a computer system clandestinely and may then cause considerable harm. In contrast to these bad Trojans, Hadoop++ injects hidden code into a system in order to "heal" it, i.e. to dynamically accelerate the underlying Hadoop. Professor Dittrich emphasizes, "One could say that Hadoop++ is a good Trojan."
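The press release does not describe how the injection works in detail. The following toy sketch only illustrates the general idea of speeding up a framework from the outside. It assumes the framework exposes a hook for user-supplied code, and every name in it (`run_selection_job`, `scanning_reader`, `indexed_reader`) is hypothetical rather than part of Hadoop or Hadoop++.

```python
from bisect import bisect_left, bisect_right


def run_selection_job(chunks, make_reader, low, high):
    """Toy stand-in for the unmodified framework: it only calls the hook it is given."""
    results = []
    for chunk in chunks:
        reader = make_reader(chunk)
        results.extend(reader(low, high))
    return results


def scanning_reader(chunk):
    """Default behaviour: test every record in the chunk against the range."""
    def read(low, high):
        return [key for key in chunk if low <= key <= high]
    return read


def indexed_reader(chunk):
    """'Injected' behaviour: keep the chunk sorted and answer by binary search.

    Assumption: a real system would build such an index once, ahead of time,
    rather than on every job as this toy version does.
    """
    sorted_keys = sorted(chunk)

    def read(low, high):
        start = bisect_left(sorted_keys, low)
        end = bisect_right(sorted_keys, high)
        return sorted_keys[start:end]
    return read


if __name__ == "__main__":
    chunks = [[42, 7, 19, 3], [88, 15, 23, 4]]
    print(sorted(run_selection_job(chunks, scanning_reader, 5, 25)))
    print(sorted(run_selection_job(chunks, indexed_reader, 5, 25)))
```

Both readers return the same records, and `run_selection_job` is identical in both runs. Only the code plugged into it changes, which is the sense in which the framework itself stays untouched.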
The Saarbrücken approach has the considerable advantage that Hadoop's tested code base does not have to be modified and retested. Hadoop++ thus avoids complex changes to a working system and their unforeseen consequences. Hadoop++ will be presented at this year's International Conference on Very Large Data Bases (VLDB), one of the world's most prestigious database conferences, to be held in Singapore from September 13 to 17.
There have been heated discussions, led by database professors in the US, about the pros and cons of MapReduce/Hadoop compared to traditional database management systems. Recently there have been attempts to improve the runtime efficiency of MapReduce. However, as Professor Dittrich explains, "those attempts could not really marry MapReduce with database technology."
Press Pictures: www.uni-saarland.de/pressefotos
For questions, contact:
Friederike Meyer zu Tittingdorf | idw