Prof. Joseph M. Hellerstein, UC Berkeley
Joseph M. Hellerstein is a Chancellor’s Professor of Computer Science at UC Berkeley, and the co-founder and CEO of Trifacta. Hellerstein’s work is in the broad area of data-centric systems and the way...
Operational Database Management Systems
MapReduce-MPI Library
The MapReduce-MPI (MR-MPI) library is an open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing. The MR-MPI library was developed at Sandia National Laboratories, a US...
An empirical comparison of graph databases
Salim Jouili, Eura Nova R&D, 1435 Mont-Saint-Guibert, Belgium; Valentin Vansteenberghe, Universite Catholique de Louvain, 1348 Louvain-La-Neuve, Belgium. Abstract: In recent years, more and more companies provide services that cannot...
Jianfeng Zhan is a Professor of Computer Science and Engineering at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. He is Deputy Director of Computer Systems Research...
Concurrency Control and Recovery in Database Systems
Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman. This page offers a free download of the above book in PDF file format. You can read or print it...
AsterixDB Big Data Management System (BDMS) Overview (last updates October 2014) The AsterixDB BDMS is the result of over four years of R&D involving researchers at UC Irvine, UC Riverside, and UC San Diego....
Fine-grained Partitioning for Aggressive Data Skipping
Modern query engines are increasingly being required to process enormous datasets in near real-time. While much can be done to speed up the data access, a promising technique...
A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data
In emerging Big Data scenarios, obtaining timely, high-quality answers to aggregate queries is difficult due to the challenges of processing and cleaning...
BIG DATA: SEIZING OPPORTUNITIES, PRESERVING VALUES
Executive Office of the President, The White House, Washington, May 2014. May 1, 2014. DEAR MR. PRESIDENT: We are living in the midst of a social, economic,...
SQL-on-Hadoop without compromise
IBM Software Group Thought Leadership White Paper. How Big SQL 3.0 from IBM represents an important leap forward for speed, portability and robust functionality in SQL-on-Hadoop solutions. By Scott C. Gray, Fatma...