Survey of Apache Big Data Stack
Survey of Apache Big Data Stack Supun Kamburugamuve For the PhD Qualifying Exam 12/16/2013 Advisory Committee Prof. Geoffrey Fox Prof. David Leake Prof. Judy Qiu 1. Introduction Over the last decade there has been...
Operational Database Management Systems
BigDataBench As a multi-discipline research effort, BigDataBench is an open-source big data benchmark suite. The current version is BigDataBench 3.0. It includes 6 real-world and 2 synthetic data sets, and 32 big data workloads, covering micro...
PoliTwi: Early Detection of Emerging Political Topics on Twitter and the Impact on Concept-Level Sentiment Analysis Sven Rill, Dirk Reinel, Jörg Scheidt, Institute of Information Systems, University of Applied Sciences Hof, Alfons-Goppel-Platz 1, Hof, Germany...
MapReduce-MPI Library MapReduce-MPI (MR-MPI) library is an open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing. The MR-MPI library was developed at Sandia National Laboratories, a US...
An empirical comparison of graph databases Salim Jouili, Eura Nova R&D, 1435 Mont-Saint-Guibert, Belgium Valentin Vansteenberghe, Universite Catholique de Louvain, 1348 Louvain-La-Neuve, Belgium Abstract—In recent years, more and more companies provide services that cannot...
AsterixDB Big Data Management System (BDMS) Overview (last updated October 2014) The AsterixDB BDMS is the result of over four years of R&D involving researchers at UC Irvine, UC Riverside, and UC San Diego....
Fine-grained Partitioning for Aggressive Data Skipping Modern query engines are increasingly being required to process enormous datasets in near real-time. While much can be done to speed up the data access, a promising technique...
A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data In emerging Big Data scenarios, obtaining timely, high-quality answers to aggregate queries is difficult due to the challenges of processing and cleaning...
BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES Executive Office of the President The White House Washington MAY 2014 May 1, 2014 DEAR MR. PRESIDENT: We are living in the midst of a social, economic,...
SQL-on-Hadoop without compromise IBM Software Group Thought Leadership White Paper How Big SQL 3.0 from IBM represents an important leap forward for speed, portability and robust functionality in SQL-on-Hadoop solutions By Scott C. Gray, Fatma...