Models for Big Data

HPCC Systems white paper

This paper explores the data models used for big data processing and argues that the preferred technology is one that can move flexibly between them.

Table of Contents

Models for Big Data 

Structured Data  

Text (and HTML)

Semi-Structured Data

Bridging the Gap – The Key-Value Pair

 XML – Structured Text?

RDF

 Data Model Summary

Data Abstraction – An Alternative Approach

Structured Data

 Text

Semi-Structured Data

 Key-Value Pairs

 XML

RDF

Model Flexibility in Practice

 Conclusion 

Models for Big Data

The principal performance driver of a Big Data application is the data model in which the Big Data resides. Unfortunately, most extant Big Data tools impose a data model upon a problem and thereby cripple their performance in some applications [1]. The aim of this paper is to discuss some of the principal data models that exist and are imposed, and then to argue that an industrial-strength Big Data solution needs to be able to move between these models with a minimum of effort.

As each data model is discussed, various products that focus upon that data model will be described, and generalized pros and cons will be detailed. It should be understood that many commercial products, when fully utilized, have tricks, features and tweaks designed to mitigate some of the worst of the cons. Notwithstanding, this paper will attempt to show that those embellishments are a weak substitute for basing the application upon the correct data model.

Sponsored by HPCC Systems 
