SQL, Programming, and Big Data: An Opinion

by Michael Blaha

One cannot help but be impressed with the accomplishments of Big Data. Google, Facebook, Amazon, and the like are handling massive quantities of unstructured data and gleaning insights. Such volumes, velocity, and variety of data are well beyond the reach of conventional relational databases. It’s been proven that Big Data can deliver business value. Big Data is clearly not a fad.

Nevertheless, there is a seamy underbelly to Big Data that seldom receives mention. Big Data is the new, shiny, sexy data management toy. Many developers find it more appealing than the staid conventional relational databases. Too often, developers misapply Big Data to problems where a relational database is better suited.

The root cause is that many programmers continue to be uncomfortable with relational databases. I see the unease when I work on consulting projects. I see the evidence when I reverse engineer legacy databases and find flawed implementations. Big Data gives programmers a new excuse for avoiding relational databases and doing something else.

Relational databases require that you express intent with the non-procedural SQL language and let the database engine choose algorithms. To realize applications you must couple the declarative SQL language to imperative programming. There are two different paradigms to understand, they fit together awkwardly, and many programmers dislike the combination. I saw that years ago and I continue to see it today.
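
To make the mixed paradigm concrete, here is a minimal sketch that uses Python's built-in sqlite3 module. The orders table, its columns, and the function name total_by_customer are hypothetical, invented only for this illustration. The SQL string states what result is wanted and leaves the algorithm to the engine; the surrounding Python supplies the step-by-step control flow.

```python
import sqlite3


def total_by_customer(rows):
    """Sum order amounts per customer with SQL embedded in Python.

    `rows` is an iterable of (customer, amount) pairs. The table and
    column names are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

        # Declarative part: say WHAT is wanted. The engine decides how
        # to scan, group, and sort.
        cursor = conn.execute(
            "SELECT customer, SUM(amount) "
            "FROM orders GROUP BY customer ORDER BY customer"
        )

        # Imperative part: the host language walks the result set
        # one row at a time.
        totals = {}
        for customer, total in cursor:
            totals[customer] = total
        return totals
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]
    print(total_by_customer(sample))
```

Even in this small case the two styles meet at a seam: the query is an opaque string to the Python code, and the loop is invisible to the database engine.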

In contrast, Big Data products have an imperative paradigm. Developers write procedural logic to leverage their capabilities and deliver a solution. It is a much more familiar approach to programmers than the mixed paradigms of relational database applications.
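
As a rough illustration of that imperative style, the sketch below totals amounts per customer in a map, group, and reduce sequence written entirely as procedural Python. It does not use any particular Big Data product's API; the map/reduce shape, the function names, and the sample data are my own assumptions for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for customer, amount in records:
        yield customer, amount


def group_by_key(pairs):
    """Collect all values that share a key, as a shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def totals_per_customer(records):
    """Run the three steps in order; the programmer owns the algorithm."""
    return reduce_phase(group_by_key(map_phase(records)))


if __name__ == "__main__":
    sample = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]
    print(totals_per_customer(sample))
```

Every step here is ordinary procedural code that the developer writes and controls, which is exactly why the approach feels familiar.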

It is important for the database community to recognize the dual motives for Big Data’s popularity. One motive is laudable: Big Data offers a creative solution for handling the vast quantities of data at the bleeding edge of the profession. Relational databases cannot cope with these demands. The other motive is discouraging: Big Data offers an excuse for avoiding relational databases and a misguided alternative. These lame applications of Big Data often fail because they are architecturally unsound and technically unjustified.

Michael Blaha is a consultant, author, and trainer who specializes in conceiving, architecting, modeling, designing, and tuning databases. He has worked with dozens of organizations around the world. Blaha has authored seven U.S. patents, seven books, many articles, and two video courses. His most recent publication is the Agile Data Warehouse Design video course from O’Reilly. He received his doctorate from Washington University in St. Louis and is an alumnus of GE Global Research in Schenectady, New York. You can find more information on his LinkedIn profile or at superdataguy.com.
