by Staff Writers
Beijing, China (SPX) Apr 27, 2014
Advances at the technology frontier have disrupted and transformed large-scale data processing infrastructures. For the past three decades, classical database management systems, data warehousing and data analysis technologies have been well recognized as effective tools for data management and analysis. More recently, data from different sources and in different formats are being collected at unprecedented scale.
This gives rise to the so-called 3V characteristics of big data: volume, velocity and variety. Classical approaches to data warehousing and data analysis are no longer viable for dealing with both the scale of the data and the sophistication of the analysis required. This challenge has been labeled the 'big data' problem. In principle, while earlier DBMSs focused on modeling the operational characteristics of enterprises, big data systems are now expected to model vast amounts of heterogeneous and complex data.
Although massive data pose many challenges and invalidate earlier designs, they also provide great opportunities. Most of all, instead of making decisions based on small sets of data or calibration, decisions can now be made based on the data itself. Various big data applications have emerged in areas such as social networking, enterprise data management, scientific applications, mobile computing, scalable and elastic data management, and scalable data analytics.
Meanwhile, many distributed data processing frameworks and systems have been proposed to deal with the big data problem. MapReduce is the most successful distributed computing platform; its fundamental idea is to simplify parallel processing, and it has been widely applied. MapReduce systems are good at complex analytics and extract-transform-load tasks at large scale; however, they also suffer from reduced functionality.
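To make the idea of simplified parallel processing concrete, the following is a minimal single-machine sketch of the map, shuffle and reduce steps, using word counting as the task. It is an illustration only, not the API of any real MapReduce system: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example, and a real framework would distribute each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document can be mapped independently, which is what makes
    # the map step easy to run in parallel in a real system.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    # Each key can likewise be reduced independently of the others.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, for illustration only.
counts = word_count(["big data big systems", "data systems scale"])
```

The programmer writes only the map and reduce functions; grouping by key, and in a real system the distribution and fault handling, are left to the framework. That division of labor is one way to read the "simplified parallel processing" described above.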
There also exist many other distributed data processing systems that go beyond the MapReduce framework. These systems have been designed to address various problems not well handled by MapReduce, e.g., Dremel for interactive analysis, GraphLab for graph analysis, Storm for stream processing, and Spark for in-memory computing.
Big data presents both challenges and opportunities in designing new data processing systems for managing and processing massive data. Potential research topics in this field lie in all phases of the data management pipeline, which includes data acquisition, data integration, data modeling, query processing and data analysis. In addition, big data brings great challenges and opportunities to other computer science disciplines such as system architecture, storage systems, system software and software engineering.
Big data: the driver for innovation in databases. National Science Review, Volume 1, Issue 1, pp. 27-30. doi:10.1093/nsr/nwt020
Science China Press