
Big data: Searching in large amounts of data quickly and efficiently
by Staff Writers
Saarbrucken, Germany (SPX) Mar 07, 2013

Computer scientists from Saarbrucken have developed an approach which enables searching large amounts of data in a fast and efficient way. Credit: Bellhauser - das bilderwerk.

The term "big data" refers to amounts of digital information so large and so complex that conventional database technology cannot process them. Scientific institutes such as the nuclear research center CERN are not the only ones that store data on this scale.

Companies like Google and Facebook do so as well, and they analyze the data to make better strategic decisions for their business. A New York Times article published last year showed how successful such analysis can be. It reported on the US-based company Target which, by analyzing the buying patterns of a young woman, knew about her pregnancy before her father did.

The data to be analyzed is distributed across several servers on the internet, and search queries go to several servers in parallel. Traditional database management systems do not fit every use case: either they cannot cope with big data, or they overwhelm the user. Data analysts therefore favor tools that are based on the open-source software framework Apache Hadoop and that use its efficient file system, HDFS.
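The idea of sending one query to several servers in parallel can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how Hadoop or HDFS work internally: the three "servers" are plain in-memory lists, and the names SERVERS, query_server and parallel_query as well as the sample records are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for three servers, each holding one part of the data.
SERVERS = {
    "server-a": [{"name": "Meyer", "city": "Berlin"}, {"name": "Schmidt", "city": "Bonn"}],
    "server-b": [{"name": "Weber", "city": "Berlin"}],
    "server-c": [{"name": "Fischer", "city": "Hamburg"}, {"name": "Becker", "city": "Berlin"}],
}


def query_server(server_name, city):
    """Scan one server's part of the data and return the matching records."""
    return [rec for rec in SERVERS[server_name] if rec["city"] == city]


def parallel_query(city):
    """Send the same query to every server at once and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
        partial_results = pool.map(lambda name: query_server(name, city), SERVERS)
        # Merge the per-server answers into one sorted list of names.
        return sorted(rec["name"] for part in partial_results for rec in part)


print(parallel_query("Berlin"))
```

Each server still scans its whole share of the data here. Parallelism alone shortens the wait, but it does not reduce the total amount of reading.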

These tools do not require expert knowledge. "If you are used to the programming language Java, you can already do a lot with it", explains Jens Dittrich, professor of information systems at Saarland University. But he adds that Hadoop cannot query big datasets as efficiently as database systems designed for parallel processing.

The solution developed by Dittrich and his colleagues is the "Hadoop Aggressive Indexing Library", or HAIL for short. It stores enormous amounts of data in HDFS in such a way that queries are answered up to 100 times faster.

The researchers use a method familiar from the telephone book. So that you do not have to read the complete list of names, the entries are sorted by surname. Sorting the names produces what is called an index.
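The telephone book idea can be sketched as follows. The example assumes a tiny, made-up list of entries and uses binary search from Python's standard bisect module; it illustrates the general principle of a sorted index, not HAIL's actual data structures.

```python
from bisect import bisect_left, bisect_right

# Made-up telephone book entries in arbitrary order: (surname, number).
entries = [
    ("Weber", "0681-1111"),
    ("Becker", "0681-2222"),
    ("Meyer", "0681-3333"),
    ("Becker", "0681-4444"),
]

# Building the index: sort the entries once by surname.
index = sorted(entries, key=lambda entry: entry[0])
surnames = [entry[0] for entry in index]


def lookup(surname):
    """Binary search in the sorted copy: only a few entries are inspected."""
    first = bisect_left(surnames, surname)
    last = bisect_right(surnames, surname)
    return index[first:last]


def full_scan(surname):
    """Without an index, every single entry has to be read."""
    return [entry for entry in entries if entry[0] == surname]


print(lookup("Becker"))
```

Both functions return the same entries. The difference is the work involved: the scan reads the complete list, while the lookup needs a number of steps that grows only with the logarithm of the list length.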

The researchers generate such an index for the datasets they distribute across several servers. But in contrast to the telephone book, they sort the data according to several criteria at once and store multiple copies of it. "The more criteria you provide, the higher the probability that you find the specified data very fast", Dittrich explains.

"To use the telephone book example again, it means that you have six different books. Every one contains a different sorting of the data - according to name, street, ZIP code, city and telephone number. With the right telephone book you can search according to different criteria and will succeed faster."
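The "several telephone books" idea can be sketched like this. The code keeps one full copy of the records per sort criterion and picks the copy that matches the query. It is a simplified illustration, not HAIL's API: the records and the names build_copies and search are invented for this example.

```python
from bisect import bisect_left, bisect_right

# Made-up records standing in for telephone book entries.
RECORDS = [
    {"name": "Meyer", "zip": "66111", "city": "Saarbrucken"},
    {"name": "Becker", "zip": "10115", "city": "Berlin"},
    {"name": "Weber", "zip": "66111", "city": "Saarbrucken"},
    {"name": "Fischer", "zip": "20095", "city": "Hamburg"},
]


def build_copies(records, attributes):
    """Store one full copy of the records per attribute, sorted by that attribute."""
    copies = {}
    for attr in attributes:
        ordered = sorted(records, key=lambda rec: rec[attr])
        keys = [rec[attr] for rec in ordered]
        copies[attr] = (keys, ordered)
    return copies


def search(copies, records, attr, value):
    """Use the copy sorted by `attr` if there is one; otherwise read everything."""
    if attr in copies:
        keys, ordered = copies[attr]
        return ordered[bisect_left(keys, value):bisect_right(keys, value)], "index"
    return [rec for rec in records if rec[attr] == value], "scan"


# Two "books": one sorted by name, one sorted by ZIP code.
copies = build_copies(RECORDS, ["name", "zip"])

hits, method = search(copies, RECORDS, "zip", "66111")
print([rec["name"] for rec in hits], method)
```

A query on the ZIP code finds a matching book and uses binary search. A query on the city, for which no sorted copy exists in this sketch, falls back to reading every record. This mirrors the point that more sort criteria raise the chance of a fast lookup.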

In addition, Dittrich and his research group managed to generate the indexes at no additional cost. They organized the indexing in such a way that it requires no additional computing time and causes no delay. Even the additional storage space required is low.



The content herein, unless otherwise known to be public domain, is Copyright 1995-2014 - Space Media Network. AFP, UPI and IANS news wire stories are copyright Agence France-Presse, United Press International and Indo-Asia News Service. ESA Portal Reports are copyright European Space Agency. All NASA sourced material is public domain. Additional copyrights may apply in whole or part to other bona fide parties. Advertising does not imply endorsement, agreement or approval of any opinions, statements or information provided by Space Media Network on any Web page published or hosted by Space Media Network.