Monetizing Big Data.
By Patrick Schwerdtfeger.
The key to monetizing ‘Big Data’ is in its predictive capacity. If you can predict an outcome before it takes place, there’s value in that. In what circumstances will someone make a purchase? Who is likely to become ‘radicalized’ and plan terrorist attacks? What will the weather be like tomorrow? Which credit card transactions are most likely fraudulent? Will the stock market rally continue? The answers to these questions have value. People and businesses are willing to pay for them.
The goal, then, for any Big Data initiative should be to identify correlations, determine causation and develop algorithms that will churn through live data streams and spit out predictions as they emerge. Of course, the simplicity of this statement is deceptive. In particular, there are three challenges that early adopters of predictive analytics are struggling with: cleaning up unstructured data, determining causation and designing algorithms.
Cleaning Up Unstructured Data
The vast majority of data being accumulated in today’s digital world (perhaps 80% or more) is unstructured.
Much of it consists of machine logs and metadata. It’s ugly and disorganized. As a result, people instinctively want to avoid it. They gravitate to the structured data first. That’s fine and there are plenty of profitable insights to be found there, but the larger opportunity, with far less competition, lies in the unstructured data.
Get your executive team together and think about all the stores of data in your organization. It may help to bring in a consultant to guide this process. You probably have more data than you realize. Then brainstorm different questions that might be answered by all that data. Once the most potentially profitable questions have been identified, it’s time to figure out how to sift through that data and make sense of it all.
Two of the most valuable roles right now are (1) data engineers and (2) data scientists. These are the people who can help you clean up all those unstructured machine logs. Depending on the size of your organization, it might make sense to bring in consultants to do this work as well. Finding and hiring that talent is difficult and expensive. Of course, you’ll pay plenty for the consultants too, but it’s an on-call arrangement.
Determining Causation
There’s a huge and well-documented difference between correlation and causation. It’s quite easy to do correlation analysis on your data and identify which variables are correlated with each other. But which one came first?
Did A cause B? Or did B cause A? Or, in fact, did C cause both A and B? If predictive capacity is what you’re looking for, causation is critically important. Without causation, you have nothing.
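The third case, where C drives both A and B, is the easiest one to miss. The short Python sketch below simulates it with made-up data: A and B never influence each other, yet they show a clear correlation until C is accounted for. All variable names and numbers are hypothetical, and subtracting C directly only works here because this toy example knows exactly how much C contributes.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# C is the hidden common cause. A and B each depend on C plus their
# own independent noise; neither one depends on the other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

# Correlation between A and B when C is ignored.
raw_corr = pearson(a, b)

# Remove C's contribution from both series, then correlate what is left.
a_without_c = [ai - ci for ai, ci in zip(a, c)]
b_without_c = [bi - ci for bi, ci in zip(b, c)]
adjusted_corr = pearson(a_without_c, b_without_c)

print(f"A vs B, ignoring C:     {raw_corr:.2f}")
print(f"A vs B, C removed:      {adjusted_corr:.2f}")
```

An algorithm built on the raw correlation would try to change B by changing A and get nothing for it, which is why the causal question has to be settled before any prediction is sold.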
Determining causality begins with developing a theory of causation and then testing the relationship in any way you can. Split A/B testing is a perfect way to gain insights into the relationship. It all boils down to experimentation and accumulating empirical evidence supporting one theory over another. It can take time and cost money, but it’s essential if you wish to monetize your data.
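As a rough illustration of how a split A/B test is evaluated, here is a minimal Python sketch of a two-proportion z-test, one common way to check whether the difference between two variants is larger than chance would explain. The visitor and purchase counts are invented for the example, and the function name is hypothetical. Because visitors are assigned to each version at random, a significant difference points to the change itself rather than to some outside factor.

```python
import math

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the assumption that A and B perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 5,000 visitors see each version of a page.
z_score, p_value = two_proportion_z_test(
    conversions_a=200, visitors_a=5000,   # version A: 4.0% purchase
    conversions_b=260, visitors_b=5000,   # version B: 5.2% purchase
)
print(f"z = {z_score:.2f}, p = {p_value:.4f}")
```

A small p-value (commonly below 0.05) suggests the gap between the two versions is unlikely to be noise. It does not say how large the effect will be in production, so the test is worth repeating before building an algorithm on the result.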
Designing Algorithms
Once you have causation nailed down, you can start designing an algorithm. They call it algorithm engineering and yet again, it’s a specialized field. That means it’s going to cost some money. But if you have the data inputs figured out and can use them to generate something of value on the other side, it’s worth the investment. The beauty of algorithms is that they’re essentially annuities. You build them once but can monetize them in perpetuity.
Already today, we have algorithms to anticipate criminal activity, predict weather patterns, allocate investment portfolios, write journalistic updates for sports games, calculate credit scores and measure climate change.
Algorithms are the future. Ten or 20 years from now, they will operate in every corner of our human experience. Indeed, they will help us optimize Planet Earth.
The cost of storing one petabyte of data in 2010 was roughly $80,000 USD. By 2020, that cost is projected to drop to just $4 USD. Data storage is following an exponential curve. In fact, data storage, data bandwidth and data processing are all following exponential curves. That means their costs are dropping quickly.
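To see what an exponential curve of that shape implies year by year, the Python sketch below takes the two storage figures quoted above at face value and works out the constant annual rate of decline that would connect them. The assumption of a perfectly steady rate is a simplification for illustration, not a claim about how prices actually moved.

```python
import math

# Figures as quoted in the article (USD per petabyte).
cost_2010 = 80_000.0
cost_2020 = 4.0
years = 2020 - 2010

# Constant yearly multiplier f such that cost_2010 * f**years == cost_2020.
annual_factor = (cost_2020 / cost_2010) ** (1 / years)
annual_decline_pct = (1 - annual_factor) * 100

# How long it takes for the cost to halve at that steady rate.
halving_time_years = math.log(0.5) / math.log(annual_factor)

print(f"Each year costs about {annual_factor:.0%} of the year before")
print(f"Implied annual decline: {annual_decline_pct:.0f}%")
print(f"Cost halves roughly every {halving_time_years:.1f} years")

# Year-by-year costs along the implied curve.
for year in range(2010, 2021):
    cost = cost_2010 * annual_factor ** (year - 2010)
    print(f"{year}: ${cost:,.2f}")
```

Running the numbers this way makes the planning question concrete: a budget line that looks significant today shrinks to a rounding error within a few years if the curve holds.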
As a leader, you need to plan for tomorrow, not today. What would you do if data storage, data bandwidth and data processing were free? Of course, it will never truly become free but the cost of hardware will continue to drop.
The expensive part is the human expertise. Qualified Big Data talent is in short supply today, but it is essential for participating in the predictive revolution.
Exponential curves are deceptively flat at first, but that’s actually the most exciting time. That’s where the opportunities lie. It’s during those sleepy, deceptive early stages that market dominance is established. We are in those early stages of Big Data and business intelligence right now, and the organizations that aggressively pursue the technology today will reap huge rewards in the years to come.
About Patrick Schwerdtfeger
Patrick Schwerdtfeger is an author and keynote speaker specializing in global business trends including ‘big data’ technologies, shifting demographics and the social media revolution. He earns over 90% of his income in speaking fees and has spoken in dozens of countries around the world.