WHY DATA SCIENCE NEEDS STORY TELLING
By Steve Lohr, technology reporter for the New York Times
It often seems a waste that so many of the finest minds in computer science have dedicated themselves to improving the odds of making a sale, applying the tools of data science to tailored marketing, targeted advertising and personalized product recommendations.
But another way to look at things is that we’re seeing one stage in the life cycle of an emerging science. Marketing is a low-risk – and, yes, lucrative – petri dish in which to hone skills.
That point was made most succinctly to me a while back by Claudia Perlich, who emigrated from then-East Germany for advanced degrees in America, joined IBM’s Watson labs, and has won several professional data-mining competitions. She left IBM and is now chief scientist at an ad-targeting start-up in New York, Dstillery.
When I asked her why advertising, Perlich replied that marketing is an ideal arena to conduct real-world experiments in data science.
“What happens if my algorithm is wrong? Someone sees the wrong ad,” she said. “What’s the harm? It’s not a false positive for breast cancer.”
But the stakes are rising as the methods and mind-set of data science spread across the economy and society. Big companies and start-ups are beginning to use the technology in decisions like medical diagnosis, crime prevention and loan approvals.
Take consumer lending, a market with several big data start-ups. Its methods amount to a digital-age twist on the most basic tenet of banking: Know your customer. By harvesting data sources like social network connections, or even by looking at how an applicant fills out online forms, the new data lenders say they can know borrowers as never before.
The promise is more efficient loan underwriting and pricing, saving millions of people billions of dollars. But big data lending depends on software algorithms poring over mountains of data, learning as they go. It is a highly complex, automated system, and even enthusiasts have qualms.
“A decision is made about you, and you have no idea why it was done,” said Rajeev Date, an investor in data-science lenders and a former senior official at the Consumer Financial Protection Bureau, a watchdog agency in the United States. “That is disquieting.”
Clearly, some mechanism of machine-to-man translation is needed as data science advances. Danny Hillis, an artificial intelligence expert, observed, “The key thing that will make it work and make it acceptable to society is story telling.”
Not so much literal story telling, but more an understandable audit trail that explains how an automated decision was made.
Indeed, there is a lot of talk of “story telling” in the field these days. It’s an appealing metaphor, with its implicit promise of giving voice to the algorithmic automatons. It will take a lot of hard work and real science to make that vision of story telling more than a comforting illusion.
But some fine minds are working on that problem too. One of them, Kris Hammond, a computer scientist at Northwestern University, emphasized, “Technologists have to bring what they know into the world in a way people can understand. It’s a sin not to.”