Data Analytics – Is Sentiment Analysis Valid?

By Richard J Self, University of Derby

Sentiment analysis of social media data has become an area of significant activity in many organisations.
It is claimed to provide valuable insights into customer perceptions of service quality, likes and dislikes, new product development and so on.
Sources such as Facebook, Twitter, TripAdvisor, to name but three, are becoming primary sources of such data.
Complex analytics processing is then carried out in an attempt to understand customer perceptions.
There appears to be a general assumption that the postings and tweets are created by a representative sample of the customers or clients.

Conference presentations by practitioners provide a few caveats about the problems of understanding the semantics of natural language, particularly in the short forms used by Twitter, with specific reference to the problems of detecting irony.

One problem that I have not heard expressed very often is that of representativeness.

I was reminded of this recently by a local restaurant owner, who observed that the postings on TripAdvisor about his restaurant were almost always negative; it was very rare to find any positive comments. Given that this is a truly excellent little restaurant, this gave me cause to think about a previous existence of mine, when I taught in the Business School.

A key proposition in Marketing and Sales academic research and theory is summarised in the adage that:
• A satisfied customer will tell three people about their good experience.
• A dissatisfied customer will tell ten people about their dissatisfaction.

My friend’s experience clearly confirms that this behaviour pattern is still dominant. Indeed, it is becoming even more significant. Some of his customers will no longer discuss any staff-related issues face to face; they will only post them online, in as virulent a style as possible.

I have heard of one very large organisation that does recognise this problem and uses social media postings as an early warning mechanism for potential customer service problems, which seems to be excellent practice.

I once carried out an analysis of the postings to a bulletin board that I was a member of. I was unable to discover the overall size of the membership, but I was able to identify a total of about 1,000 people who had posted at least one item during a period of about five years. Overall, five people contributed approximately 80% of all postings, and the next thirty or forty contributed the next 10%. This is a remarkably skewed level of activity, of a kind that can be found in almost any area of human activity.

Part of the problem is that of the so-called “lurkers”, the people who sit on the side-lines and do not contribute, sometimes also called the “free-loaders”. Another part of the problem is that most people, it appears, do not take the time to contribute, even when they disagree with the posted sentiments.

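
As a rough illustration of how such a concentration figure can be computed, the sketch below counts posts per author and reports the share written by the most active few. The function name `top_poster_share` and the sample posting log are hypothetical, made up for illustration; they are not the data from the bulletin board analysis described above.

```python
from collections import Counter


def top_poster_share(authors, k):
    """Return the fraction of all posts written by the k most active authors.

    `authors` is a sequence with one author identifier per post.
    """
    counts = Counter(authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(k) yields (author, post_count) pairs, largest first.
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Hypothetical posting log (invented data): 100 posts from 13 authors.
posts = ["a"] * 50 + ["b"] * 30 + ["c"] * 10 + list("defghijklm")

share = top_poster_share(posts, 2)
print(f"Top 2 authors wrote {share:.0%} of all posts")
```

A figure like this says nothing about the silent majority; it only shows how few voices dominate the postings that were actually made.
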
The governance implications are clear.

Social media sentiment analysis should be used very carefully, in the full and explicit recognition that the sample comes from a very small proportion of the customer base, and that it is self-selected and highly unrepresentative of the majority of an organisation's customers or clients.
• It may be useful as an early warning of developing problems.
• It is unlikely to be a particularly valid source of significant insights into future, successful products.
• It is unlikely to provide any significant evidence of good service.

Therefore, from an Information and Corporate Governance perspective, it is important to ask the following questions:
• What proportion of our customer base is actively involved in social media?
• What is the distribution of activity amongst the posting sources (individuals)?
• How representative are the various sentiments in terms of the overall customer base?
• What can be learned from both positive and negative sentiment postings?
• How quickly do the sentiments change?
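
The first questions in this list can be given rough numeric answers once postings have been attributed to individual customers. The following is a minimal sketch under stated assumptions: the function name `participation_summary`, the `(author_id, label)` record layout, the "pos"/"neg" labels and all of the figures are hypothetical, and the size of the customer base is assumed to be known from some other source.

```python
def participation_summary(customer_count, posts):
    """Summarise who is posting and how negative the postings are.

    `posts` is a list of (author_id, label) tuples, where label is
    "pos" or "neg" (a hypothetical labelling scheme).
    """
    if customer_count <= 0:
        raise ValueError("customer_count must be positive")
    authors = {author for author, _ in posts}
    negative_posts = sum(1 for _, label in posts if label == "neg")
    return {
        # Share of the whole customer base that posted at least once.
        "participation_rate": len(authors) / customer_count,
        # Share of postings (not of customers) that are negative.
        "negative_post_share": negative_posts / len(posts) if posts else 0.0,
    }


# Invented example: 1,000 customers, 4 postings from 3 distinct authors.
summary = participation_summary(
    customer_count=1000,
    posts=[("u1", "neg"), ("u1", "neg"), ("u2", "neg"), ("u3", "pos")],
)
print(summary)
```

A high negative share alongside a very low participation rate is exactly the situation in which the postings should not be read as the view of the customer base as a whole.
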

These questions connect to the following V’s of Big Data analytics:
• Validity
• Variability
• Volatility
• Value
