Computer Scientists from Saarbrücken Search Through Large Datasets using "Good Trojans"

13.09.2010
Social networks, search engines, digital archives, and several global-scale internet companies host very large data collections.

To search through this data, companies such as Facebook, eBay, Yahoo, and Twitter use the freely available Hadoop software, an open-source implementation of the MapReduce model originally proposed by Google. Several database experts, however, criticize Hadoop for being inefficient.

Computer scientists from Saarland University (Germany) are now proposing a new system called Hadoop++. It allows users to search through very large datasets much faster than before: Hadoop++ outperforms Hadoop by up to a factor of 20.

Internet companies need to process data volumes on the order of millions of gigabytes (petabytes) on a daily basis. To search through this data effectively, Google proposed the MapReduce programming model. MapReduce divides the input data into smaller chunks, which are then distributed over a large network of machines and processed in parallel. The open-source counterpart of Google's MapReduce is called Hadoop. Although Google holds a patent on MapReduce, it has granted a license to Hadoop, so companies may still use Hadoop free of charge. However, "database experts who are fluent in SQL consider MapReduce a major step backwards towards the database stone age," explains Jens Dittrich, Professor of Information Systems at Saarland University. He adds, "MapReduce disregards considerable wisdom from database research. As a consequence MapReduce is often slow and inefficient."
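The chunk-and-parallelize idea described above can be illustrated with a minimal Python sketch. It is not Hadoop's or Google's actual API: the function names (split_into_chunks, map_chunk, shuffle, reduce_groups, word_count) are hypothetical, threads on one machine stand in for the machines of a cluster, and word counting is chosen only as a simple example task.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Divide the input into smaller chunks, one per worker task."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map phase: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group all emitted values by key between the map and reduce phases."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_groups(groups):
    """Reduce phase: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records, chunk_size=2, workers=4):
    """Run the whole pipeline; threads stand in for cluster machines (assumption)."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_groups(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["a b", "b c", "a a"]))
```

Note that every chunk is scanned in full, whatever the query. This brute-force style is the kind of behavior the database experts quoted here criticize.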

Despite this criticism, Hadoop has gained considerable attention from both industry and academia over the past years, and it is very popular among programmers throughout the world. Professor Dittrich explains why: "The reason is its ease of use: the user neither has to learn a complex database language nor a data model. Furthermore, Hadoop is relatively easy to administer. In summary, Hadoop allows even database-illiterate people to search through billions of records on very large computer clusters."

However, this comes at a price: "When compared to modern relational database management systems, Hadoop is just too slow." The researcher and his team at Saarland University have therefore developed a new system, called Hadoop++, which aims to eliminate the performance deficiencies encountered in Hadoop. The creativity of the new approach lies in how the problem is tackled: Hadoop++ works similarly to a "trojan", i.e. a malicious program that enters a computer system clandestinely and may then cause considerable harm. In contrast to these bad trojans, Hadoop++ injects hidden code into a system in order to "heal" it, i.e. to dynamically accelerate the underlying Hadoop. Professor Dittrich emphasizes, "One could say that Hadoop++ is a good trojan."

The Saarbrücken approach has the considerable advantage that Hadoop's tested code base does not have to be modified and retested. Hadoop++ thus avoids complex changes to a working system and the unforeseen consequences they might have. Hadoop++ will be presented at this year's International Conference on Very Large Data Bases (VLDB), one of the world's most prestigious database conferences, to be held in Singapore from September 13 to 17.

Background

There have been heated discussions about the pros and cons of MapReduce/Hadoop compared to traditional database management systems. The debate has been led by database professors in the US; please refer to the links below for details. Recently there have been attempts to improve the runtime efficiency of MapReduce. However, as Professor Dittrich explains, "those attempts could not really marry MapReduce with database technology."

Press Pictures: www.uni-saarland.de/pressefotos

For questions, contact:
Jens Dittrich
Professor of Information Systems
Saarland University
Tel. (+49) 681 302 70141
More information:
http://infosys.cs.uni-saarland.de/hadoop++.php
http://hadoop.apache.org/
http://www.mapreduce.org/
http://www.vldb2010.org

Friederike Meyer zu Tittingdorf | idw
Further information:
http://www.uni-saarland.de
