Data Science and the Missing Link
By Con Menictas
Data science is quickly evolving into a discipline in its own right, with many universities now offering degrees in it. Over the years we have seen vast improvements in the way analysts and data scientists can source data, prepare data and ultimately model and score data with algorithmic solutions that can be programmed, in any of a large number of languages, directly into a database.
All of this has brought us closer to being able to predict what humans will do next. In the case of consumer behaviour, transaction data can tell us a lot. The right predictive model can estimate a consumer’s propensity to take each possible next step.
As a result, a lot of the hype in data science comes from predicting people’s behaviours, especially in the case of marketing the right message to consumers.
Because large profits can be made from the commercial application of data science to predicting a consumer’s next product purchase, volumetrics or even loyalty, companies are spending big on data science. They go out of their way to attract the best data scientists and to provide them with leading-edge analytical platforms.
Without wishing to suggest that data science needs help in predicting consumer behaviour, it would be remiss of us not to admit that data science usually omits one very important component of that prediction. Transaction and behavioural data tell us what consumers have done, but they don’t tell us why consumers behaved the way they did.
In addition, transaction and behavioural data in the main tell us nothing about the choice set that was available at the time the consumer chose their product or service! Data science as a discipline has its head so far down the analytical path that it forgets, or fails to acknowledge, two critical missing links to model improvement: the reason for the behaviour that was observed, and the heuristics the consumer used to navigate the available alternatives on the way to a choice.
Worse still is when data science is not even aware that it should know these things, which is the case more often than not!
I am talking about market research. The reasons for consumer actions, and the choice set from which utility maximization took place to determine the optimal choice, are very important elements that the data science world needs to understand and adopt if it wants to improve its prediction of consumer behaviour. How to conduct effective market research and experiments also needs to be learnt and understood, though I think the need to learn about these two components would be taken seriously. Even more importantly, knowing how to incorporate market research findings into the model is another world altogether, and one that I suspect would be deemed superfluous.
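To make the point about the choice set concrete, here is a minimal sketch of multinomial logit choice probabilities, the workhorse of discrete choice modelling. The brand names and utility values are hypothetical and chosen purely for illustration; in practice the utilities would be estimated from market research or choice experiments.

```python
import math

def choice_probabilities(utilities, choice_set):
    """Multinomial logit probabilities over the alternatives actually on offer."""
    exp_u = {alt: math.exp(utilities[alt]) for alt in choice_set}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}

# Hypothetical utilities for three brands (illustrative values, not estimates).
utilities = {"brand_a": 1.0, "brand_b": 0.5, "brand_c": 1.2}

# The same consumer, facing two different shelves.
full_shelf = choice_probabilities(utilities, ["brand_a", "brand_b", "brand_c"])
reduced_shelf = choice_probabilities(utilities, ["brand_a", "brand_b"])

print("All three brands available:", full_shelf)
print("brand_c out of stock:      ", reduced_shelf)
```

The consumer's preferences are identical in both cases, yet the probability of buying brand_a changes once brand_c is unavailable. A transaction log records only the purchase, so a model trained on it cannot tell a shift in preference from a shift in what was on offer.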
Therefore dear readers, especially data science readers, just when you thought that you had built a good model that predicts consumer behaviour, think again.
Omitted variable bias is not just a concept; it is real. To that extent we must be open to the fact that data on the reasons for consumer actions and on the choice set are critical, and should be adopted for model improvement.
The Nobel Prize in Economics in 2000 was awarded jointly to Daniel McFadden for his development of theory and methods for analysing discrete choice, the stuff we use in market research. The Nobel in Data Science went to?
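A small simulation shows how omitted variable bias arises. This is a sketch under stated assumptions: the variable names, coefficients and simulated data are hypothetical, with `reason` standing in for a survey-measured motivation that transaction data would not contain.

```python
import random

random.seed(0)
n = 20000

# Hypothetical data-generating process (all coefficients are assumptions):
# 'reason' is the motivation a survey could measure but a transaction log cannot.
reason = [random.gauss(0, 1) for _ in range(n)]
# Observed past behaviour is partly driven by that motivation.
spend = [0.8 * z + random.gauss(0, 1) for z in reason]
# The outcome depends on both; the true effect of spend is 1.0.
outcome = [1.0 * x + 2.0 * z + random.gauss(0, 1) for x, z in zip(spend, reason)]

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

sxx, szz, sxz = cov(spend, spend), cov(reason, reason), cov(spend, reason)
sxy, szy = cov(spend, outcome), cov(reason, outcome)

# Regression of outcome on spend only (reason omitted).
short_slope = sxy / sxx
# Regression of outcome on spend and reason (two-variable least squares).
full_slope = (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

print(f"Slope on spend, reason omitted:  {short_slope:.2f}")
print(f"Slope on spend, reason included: {full_slope:.2f}")
```

With the reason left out, the slope on spend absorbs part of the reason's effect and lands well above the true value of 1.0. Once the reason is included, the estimate returns to roughly 1.0. More transaction data does not fix this; only measuring the missing variable does.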