Looking back at Big Data in 2015

By Cynthia M. Saracco, IBM Senior Solution Architect

As 2015 draws to a close, I find myself occasionally looking back at the year’s highlights and reflecting on what’s changed — and what hasn’t — in the world of Big Data. Here, in no particular order, are several conclusions I’ve reached based on my work at IBM and my interactions with customers, partners, and other third parties this year. While my views are admittedly biased by these activities, perhaps one or more of these topics will resonate with you.

  • IBM bets big on Big Data and analytics. Starting with a major corporate reorganization in January that created a new Analytics business unit, IBM began boosting its investment in Big Data and analytics through acquisitions, alliances, and technology development. Examples include its plans to acquire The Weather Company and use weather data for industry-specific analytic solutions, its delivery of a new cloud service for Twitter data (which can be leveraged from Watson Analytics), and a new architecture for BigInsights, its Big Data platform based on Apache Hadoop, which now features separate offerings for data analysts, data scientists, and system administrators atop a foundation of common open source components.
  • Solution users and providers form alliance to advance Hadoop. While initial adopters of Hadoop tolerated the technology’s inevitable growing pains in its early days, this year brought increased emphasis on stability and compatibility through the formation of ODPi, an industry effort to define, test, and certify a core set of Big Data open source projects. The initial focus of this effort is Apache Hadoop (including HDFS, YARN, and MapReduce) and Apache Ambari. ODPi’s membership spans both end users and solution providers, IBM among them.
  • The allure of Spark grows among potential users. Although interest in Apache Spark isn’t new, it intensified noticeably this year. Spark’s performance characteristics and popular built-in libraries (for machine learning, streams computing, SQL access, and other areas) haven’t escaped prospective users. Indeed, at nearly every customer briefing and workshop I delivered, people wanted to hear something about Spark. Some members of the trade press and analyst community noticed this trend, too. Spark became a key Big Data initiative for IBM this year, as evidenced by the recent opening of its Spark Technology Center in San Francisco, active technical participation in Spark-related efforts, and recent donation of its SystemML technology to Spark.
  • Big Data skills remain in short supply. Finding data scientists and Big Data specialists (including Hadoop and Spark programmers) continues to be a tough task for hiring managers. Some firms are investing in internal training efforts (e.g., publishing best practices and sample code on intranet sites), while others are turning to competitive hires or service providers to fill the gaps. Fortunately, the gap hasn’t gone unnoticed by students and software professionals, who seem enthusiastic about enrolling in online courses, attending MeetUps and industry conferences, and experimenting with free Hadoop or Spark downloads. For example, membership in Big Data Developer MeetUps more than tripled this year, and the average number of daily visits to IBM’s web site for Hadoop developers increased by more than 50%.

What’s in store for Big Data? Most likely, technologies in this space will continue to evolve rapidly, and early adopters will continue to push the boundaries of what’s possible, driving vendors to do the same. Ultimately, though, the most effective Big Data technologies will be those that enable their users to quickly and easily analyze and derive value from the wide range of internal and public domain data available today.
