5 Numbers That Matter for Big Data Today
By Carole Gunst, Marketing Director, Attunity
Every second of every day, an enormous amount of data is created by connected devices around the world. Connected technology and big data are used across a wide range of industries – from financial services and healthcare to manufacturing, retail, and many others – to provide insights for an equally wide range of use cases. In this piece, we’ll take a close look at the effects of big data on organizations today in the most straightforward way possible – by the numbers.
IDC predicts that by 2020 the amount of data we create and replicate annually will reach 44 zettabytes, or 44 trillion gigabytes. Like the physical universe, data is diverse – it’s created by everyone typing emails, using digital cameras, sharing tweets and posts on social media sites, and by the millions of connected “things” sending and receiving data over the Internet. IT organizations will have to increase security to protect this data and find new places to store it, whether in a data warehouse, the cloud, or Hadoop. Over the next few years, IT organizations will also help enterprises become more data- and software-driven by partnering with the C-suite on more strategic business decisions.
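For readers who want to check the "44 zettabytes, or 44 trillion gigabytes" equivalence, here is a minimal sketch of the arithmetic. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes; the function name is made up for illustration and does not come from IDC.

```python
# Assumption: decimal (SI) units, not binary units such as GiB or ZiB.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes (hypothetical helper)."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


gigabytes = zettabytes_to_gigabytes(44)
print(f"{gigabytes:,} GB")  # 44 followed by twelve zeros, i.e. 44 trillion
```

Under these assumptions, one zettabyte is a trillion gigabytes, so the two figures in the IDC prediction describe the same quantity.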
The RTBlog reports “Nearly half of CEOs believe that all of their employees have access to the data they need, but only 27% of employees agree.” This finding comes from research sponsored by Teradata. The company commissioned The Economist Intelligence Unit to survey 362 workers across the globe and across departments. The research revealed that CEOs overestimate how quickly “big data” moves through their companies: 43% of CEO respondents believe that relevant data is made available in real time, compared to 29% of all respondents.
So how do you turn your company into a successful data-driven company? The report says it all boils down to two things – top-down leadership and bottom-up engagement. As more data management tools enter the enterprise and make it easier to analyze and process data, operations become more streamlined and IT departments are freed up to work more strategically with business leaders on important decisions. Through these partnerships, the C-suite gains access to vital data to help drive business operations.
According to Gartner, 90 percent of large companies will have a Chief Data Officer (CDO) by 2019. Its research also shows that the rapid adoption of the CDO role – from 400 in 2014 to 1,000 in 2015 – raises important questions about the structure and positioning of the office of the CDO within organizations.
So, what does a CDO do? Experian Data Quality recently conducted a research study of more than 250 Chief Information Officers (CIOs) and Chief Data Officers (CDOs) in the US about their data management practices and the rapid growth of the CDO role. The research showed that “These individuals serve as not just guardians of data within organizations, but also as evangelists. This new C-suite role will be crucial to taking big data trends from theory to reality as it shows organizations are not only investing in data management, but also demonstrating that data is playing a larger role in business operations today.” By elevating data to a C-suite position, companies are taking a more proactive role in both managing data for enhanced business operations and leveraging it for strategic business decisions.
Hadoop is 10 years old this year! The first Hadoop cluster was put into production at Yahoo 10 years ago, and Hadoop is now used by numerous companies around the world to store data. Hadoop was created by Doug Cutting and Mike Cafarella, who split the distributed file system and MapReduce facility from their open-source Web crawler project (Apache Nutch) and spun it off as a subproject called Hadoop. This subproject was named after Cutting’s son’s stuffed elephant toy.
The term Hadoop has come to refer not just to an open-source software framework for distributed storage and processing of very large data sets, but also to the ecosystem of additional software that can be installed on top of or alongside it. Over the last decade, there have been several welcome additions to the Hadoop ecosystem, including Apache Pig, Apache Hive, Apache HBase, Apache Phoenix, Apache Spark, Apache ZooKeeper, Cloudera Impala, Apache Flume, Apache Sqoop, Apache Oozie, Apache Storm, and Apache Kafka.
150,000,000 (or even a little more) emails are sent in an Internet minute. Ray Tomlinson, who sent the first email back in 1971, may never have imagined we’d be sending email at that volume. His initial email was sent between two computers that sat side by side and were connected through ARPANET. Ray Tomlinson is quoted as saying he invented email, “Mostly because it seemed like a neat idea.” No one was asking for email at the time, but it’s amazing what it has become. Email has revolutionized the way we communicate across businesses today. It creates an abundance of data, which in turn has helped to forge the IT career path.
Together, all of these numbers prove that data has dramatically changed the way we operate in our day-to-day business environments. Data has made our decisions more strategic, our operations more streamlined and our overall world more intuitive.
Sponsored by Attunity