Predictions from the Front Lines of Big Data: Three Things That Will Change in 2015
by Walter Maguire, Chief Field Technologist, HP Big Data Group
The Hadoop signal-to-noise ratio will begin to improve
Many of us in the big data world watched Hadoop emerge on the scene and quickly grow from a curiosity to a movement. Tens of thousands of organizations have downloaded it and are working with it today – a pretty impressive achievement! The term has become almost synonymous with big data, with vendors claiming it’ll handle everything from replacing a SAN to replacing a data warehouse to serving as a streaming query engine, reporting platform, data science platform, ETL platform, etc. In fact, the noise in the marketplace makes it seem as if Hadoop will solve every big data problem – which has been a source of confusion to many organizations as they try to separate fact from fiction.
In 2015, I believe we will see clearer signals of just where Hadoop does (and does not) fit in the enterprise. Gartner recently published a great survey by Merv Adrian that provides very solid insights into the reality of Hadoop uptake. Other surveys are starting to appear as well. Moreover, because so many organizations are using Hadoop today, they’re increasingly willing to talk about where it fits into their architecture.
So in 2015 it will become easier for organizations to sort out the reality of Hadoop versus the hype.
Unstructured will become the new structured
While many businesses are still working to cope with the familiar types of big data that we usually think of as “structured” – ERP data, CRM data, etc. – organizations that are already on their big data journey are looking for new ways to continually improve their insights. As big data platforms continue to improve, many organizations are finding it possible for the first time to analyze data traditionally thought of as “unstructured” – audio, video, images, documents, etc. And while it has been possible for some time to perform stand-alone analysis on this data, such as facial recognition or topic extraction, it is now possible to derive structure from this data and combine that structure with other structured data to generate new and unique insights.
This will prove transformational. Retailers will find that they can use video of in-store traffic to derive insights about who is (or isn’t) buying their merchandise. Online businesses will find that they can automate the tagging of images to dramatically improve their customer experience. The possibilities here are huge: with about 97% of enterprise unstructured data now machine-readable, there is an ocean of information ready for analysis.
In 2015, look for unstructured data to become a hot spot in big data.
Enterprise readiness will matter
Three years ago, my colleagues and I hosted a series of meetups around the San Francisco Bay Area. One topic always on the agenda was a quick overview of Vertica. I’m a fan of simple messages, so I grouped the features & functions into three “buckets”. Two covered analytics and scale & performance. The third covered enterprise readiness: security, backup & recovery, etc. As soon as I switched to this topic, I’d see the smartphones come out of pockets. I even got a few eye-rolls and snickers. For many in the crowd, risk mitigation was just old-school stuff that no longer applied.
We’ve had ample evidence to the contrary over the last few years, with massive data breaches, data failures, data thefts, etc., not just at one or two businesses, but dozens. Hundreds. Thousands. Today, when I transition to the enterprise readiness topic, the smartphones get put down and the audience listens.
In 2015, enterprise features (such as security) will be increasingly important to businesses as they build big data systems.