Models for Big Data

by Roberto Zicari · February 19, 2016

HPCC Systems white paper

This paper explores data models used for big data processing and shows how the preferred technology is one that can flexibly move between models.

Table of Contents

Models for Big Data

Structured Data

Text (and HTML)

Semi-Structured Data

Bridging the Gap – The Key – Value pair

XML – Structured Text?

RDF

Data Model Summary

Data Abstraction – An Alternative Approach

Structured Data

Text

Semi-Structured Data

Key-Value Pairs

XML

RDF

Model Flexibility in Practice

Conclusion

Models for Big Data

The principal performance driver of a Big Data application is the data model in which the Big Data resides. Unfortunately, most extant Big Data tools impose a data model upon a problem and thereby cripple their performance in some applications. The aim of this paper is to discuss some of the principal data models that exist and are imposed, and then to argue that an industrial-strength Big Data solution needs to be able to move between these models with a minimum of effort.

As each data model is discussed, various products that focus upon that data model will be described, and generalized pros and cons will be detailed. It should be understood that many commercial products, when utilized fully, will have tricks, features and tweaks designed to mitigate some of the worst of the cons. Notwithstanding, this paper will attempt to show that those embellishments are a weak substitute for basing the application upon the correct data model.
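To make the idea of moving between models concrete, the following minimal Python sketch takes one record and expresses it in three of the models listed in the table of contents: structured, key-value, and RDF-style triples. It is an illustration only and is not taken from the white paper. The `Person` record, the `person:<id>` key scheme, and the function names are all assumptions chosen for the example.

```python
from typing import NamedTuple


class Person(NamedTuple):
    """Structured model: a fixed schema with named, typed fields (hypothetical)."""
    person_id: int
    name: str
    city: str


def to_key_value(rec: Person) -> dict[str, str]:
    """Key-value model: one flat entry per field, keyed by record id and field name."""
    return {
        f"person:{rec.person_id}:{field}": str(value)
        for field, value in rec._asdict().items()
        if field != "person_id"
    }


def to_triples(rec: Person) -> list[tuple[str, str, str]]:
    """Triple model (RDF-like): each field becomes (subject, predicate, object)."""
    subject = f"person:{rec.person_id}"
    return [
        (subject, field, str(value))
        for field, value in rec._asdict().items()
        if field != "person_id"
    ]


def from_triples(triples: list[tuple[str, str, str]]) -> Person:
    """Rebuild the structured record, assuming all triples share one subject."""
    subject = triples[0][0]
    fields = {predicate: obj for _, predicate, obj in triples}
    return Person(person_id=int(subject.split(":")[1]), **fields)


if __name__ == "__main__":
    rec = Person(person_id=7, name="Ada", city="London")
    print(to_key_value(rec))
    print(to_triples(rec))
    print(from_triples(to_triples(rec)) == rec)
```

The information content is the same in each form. What changes is which operations are cheap: the structured form suits field-wise scans, the key-value form suits lookups by key, and the triple form suits attribute sets that are open-ended.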

Sponsored by HPCC Systems
