Beyond the Molecule and Beyond the Device: Machine Learning and the Future of Healthcare

By Shomit Ghose, Onset Ventures

– August 2017

“The best minds of my generation are thinking about how to make people click ads. That sucks.” — Jeffrey Hammerbacher, Facebook


Hammerbacher’s Lament

Are we fated to suffer Hammerbacher’s Lament, with our best minds, and our best data, solely serving the gods of commerce?  What of serving mankind’s existential needs in an area as fundamental as healthcare?  With populations increasing and resources decreasing, universal access to impactful healthcare will be one of society’s greatest future challenges, both in the developed world and in the developing world.  But broad healthcare access can be given only through the provision of simple, low-cost means of care.

The combination of Big Data and machine learning, frequently alloyed with behavioral economics, has enabled business model efficiencies that have revolutionized traditional industries from transportation, to retail, to financial services.  Big Data’s disruptive impact on the healthcare industry promises to be just as profound, and its impact on human well-being far greater.

Until now, the provisioning of healthcare has been physically limited: by the reach of our physicians, our pharmaceuticals, and our medical devices. But what if diagnoses could be performed remotely, scalably and cheaply? What if therapies or preventive treatments could be delivered in the same way? To the extent that portions of our healthcare system can be virtualized, medical diagnosis, prevention, and treatment can be delivered cost-effectively and universally via digital means. Data-driven healthcare will not only positively impact a large segment of the human population, but also disrupt the business models of major corporations trading in the medical technology sector today.

Why Now?

Just as in other industries, healthcare has long been awash in data: EMR data, clinical data, pharmaceutical data, genomic data, and claims/cost data. Delivering on the promise of this data has been a challenge. But just as in other industries, two forces have emerged that are now enabling the evolution of healthcare into the new data-driven age. The first of these is machine learning. By harnessing the essentially limitless amounts of computing and data storage in the Cloud, machine learning can correlate and find insights in even the largest and most intractable sets of unrelated data. Indeed, Internet juggernauts such as Google, Amazon and Facebook have built their businesses on top of ever larger collections of data that are, to outward appearances, completely opaque.

The second force is the ubiquity of the mobile phone, with some 5 billion [1] now in use worldwide, of which nearly half [2] are smartphones. Not only do mobile phones throw off a stream of real-time healthcare-related data – data related to everything from physical activity, to sleep, to demographics, location, and social communication – but they also provide the best platform for connecting with the individual healthcare consumer. And as mobile-driven data sources and volumes multiply, machine learning becomes the sole means by which value can be unlocked from those oceans of data.

The real-time behavioral data spilling from billions of mobile phones gives us, for the first time, high-volume phenotypic data sets that can be correlated with the traditional, more static stores of healthcare data. Machine learning gives us the means to tie together and understand that data, and mobile phones the means to act on it. And because they’re data-driven, these diagnoses and therapies can be applied at population scale, independent of geography, and at scant cost.

From Leeches, to Scalpels… to Data

The potential of data within healthcare is already being demonstrated in academic settings. In the area of population health, the digital markers for neighborhood-level well-being [3] and heart disease mortality [4] have been found via analysis of Twitter feeds. Within pharmaceuticals, machine learning and statistical analytics have also shown their utility in drug targeting [5], monitoring drug safety [6], and discovering drug-to-drug interactions [7].

On an individual basis, statistical analyses of Web search logs have helped predict the diagnosis of pancreatic cancer [8]. Machine learning has also been used to model the progression of chronic diseases [9], classify skin cancer [10], predict cardiac arrest [11] and outcomes in pulmonary hypertension [12], and even estimate the pain experienced by sheep [13]. Data-driven analytics have also been used to gauge physical aging [14] based on facial images, and in the diagnosis of Down Syndrome [15]. The advent of portable biosensors [16] now promises even greater volumes of fine-grained, individualized information that can be leveraged for real-time diagnosis [17] and ongoing therapy.

In addition to diagnosing physical conditions, particular promise for data-driven diagnoses may lie in all areas related to mental health. Machine learning and statistical analytics have been applied to the detection of depression [18, 19], suicide risk [20], and the cognitive impairment of Alzheimer’s disease [21].

Digital healthcare promises a revolution not just by bringing scalability to medical diagnosis. The availability of smartphones, combined with commodity virtual reality (VR) headsets, is bringing scalability to medical treatment as well. Virtual reality and digital therapies have proven effective in treating pain [22], brain damage [23], hospitalization anxiety [24], vertigo [25], autism [26], PTSD [27], and phobias [28], and in preventing intrusive memories [29] following psychological trauma. Further, advances in data-driven chatbot [30] technology now enable the provisioning of automated conversational agents [31] to assist people suffering mental health issues [32] such as anxiety and depression.

Ripe for Disruption

Companies that appreciate the power of Big Data and machine learning, as best exemplified by Google, Amazon and Facebook, have disrupted and decimated legacy industries across the US economy.  The healthcare market, representing almost 18% of US GDP in 2015 [33], can hardly have escaped their attention. (Amazon and Google have massive back-end machine learning power. What happens when you connect this with the massive front-end data collection power of Alexa-enabled Echo or Google Home devices in every room?) Incumbent players in the healthcare industry ignore the coming assault from data-driven competitors at their peril.

Disrupting the healthcare industry through data is first and foremost a business issue and not a technology issue.  A pharmaceutical manufacturer of insulin, for example, can continue to view its business entirely through the lens of its diabetes drug.  Alternatively, the company could also bring a complementary virtual drug to market to help those at risk of developing Type 2 diabetes.  Such a drug would consist entirely of digital interventions in modifiable environmental risk factors such as diet [34], drink [35], sleep [36] and exercise [37]. Unlike the physical drug, the virtual drug can be administered to patients anywhere and adds little cost to potentially under-resourced healthcare systems, yet yields meaningful impact [38] to society.

The adoption of data-driven healthcare solutions allows pharmaceutical companies to “think beyond the molecule”.  In addition to broadening market footprint through virtual drugs, pharmaceutical companies will be able to, for the first time, engage directly with the patients they serve, and not just with the physicians. In the same vein(!), medical device companies can “think beyond the device” and complement traditional electro-mechanical device products with virtual devices that are assembled from the cloud of data that surrounds us all. The United States Food and Drug Administration has already taken steps [39] to help realize this future.

The combination of Big Data, machine learning and mobile phones promises to finally bring cost-effective healthcare to the broadest segment of the human population; foundational research in the area has already revealed the possibilities. Our best minds can indeed be deployed to solve our biggest problems. Our best data can indeed be used to improve the quality of human lives, and not just impel us to click the “Buy Now!” button. Our challenge is to discover the applications that will make our health data actionable. Our opportunity is to better human existence through broad delivery of healthcare that is predictive, preventive, personalized and participatory.

I predict that a future Nobel Prize in Medicine will be won by a data scientist for helping solve one of mankind’s greatest healthcare challenges. You heard it here first, folks.


  1. Kemp, Simon. “The global state of the internet in April 2017”. The Next Web. April 2017.
  2. Murphy, David. “2.4BN Smartphone Users in 2017, Says eMarketer”. MobileMarketing. April 2017.
  3. Nguyen QC, et al. “Building a National Dataset From Geotagged Twitter Data for Indicators of Happiness, Diet, and Physical Activity”. October 2016.
  4. Eichstaedt, Johannes, et al. “Psychological Language on Twitter Predicts County-Level Heart Disease Mortality”. Psychological Science. 2015.
  5. Chekroud A., Gueorguieva R., Krumholz H. “Reevaluating the Efficacy and Predictability of Antidepressant Treatments: A Symptom Clustering Approach”. JAMA Psychiatry. April 2017.
  6. Zhao J., Henriksson A., Asker L., Boström H. “Predictive modeling of structured electronic health records for adverse drug event detection”. 2015.
  7. White RW, Tatonetti NP, Shah NH, Altman RB, Horvitz E. “Web-scale pharmacovigilance: listening to signals from the crowd”. May 2013.
  8. Paparrizos J., White RW, Horvitz E. “Screening for Pancreatic Adenocarcinoma Using Signals from Web Search Logs: Feasibility Study and Results”. Journal of Oncology Practice. August 2016.
  9. Wang X., Sontag D., Wang F. “Unsupervised Learning of Disease Progression Models”. 2014.
  10. Esteva A., Kuprel B., Novoa RA, et al. “Dermatologist-level classification of skin cancer with deep neural networks”. Nature. January 2017.
  11. Somanchi S., Adhikari S., Lin A., Eneva E., Ghani R. “Early Prediction of Cardiac Arrest (Code Blue) using Electronic Medical Records”.
  12. Dawes TJW, de Marvao A., Shi W., et al. “Machine Learning of Three-dimensional Right Ventricular Motion Enables Outcome Prediction in Pulmonary Hypertension: A Cardiac MR Imaging Study”. Radiology. May 2017.
  13. Lu Y., Mahmoud M., Robinson P. “Estimating Sheep Pain Level Using Facial Action Unit Detection”. Cambridge University. 2017.
  14. Chen W., Qian W., Wu G., et al. “Three-dimensional human facial morphologies as robust aging markers”. Cell Research. March 2015.
  15. Kruszka P., Porras AR, Sobering AK, et al. “Down syndrome in diverse populations”. 2017.
  16. Miyamoto A., Lee S., Cooray NF, et al. “Inflammation-free, gas-permeable, lightweight, stretchable on-skin electronics with nanomeshes”. Nature. July 2017.
  17. Li X., Dunn J., Salins D., et al. “Digital Health: Tracking Physiomes and Activity Using Wearable Biosensors Reveals Useful Health-Related Information”. PLOS. January 2017.
  18. Kotikalapudi R., Chellappan S., Montgomery F., Wunsch D., Lutzen K. “Associating Internet Usage with Depressive Behavior Among College Students”. IEEE Technology and Society Magazine. Winter 2012.
  19. Reece AG, Danforth CM. “Instagram photos reveal predictive markers of depression”.
  20. Kelion, Leo. “Facebook artificial intelligence spots suicidal users”. BBC. March 1, 2017. Online.
  21. Fiore, Kristina. “Speech Changes May Signal Cognitive Impairment”. MedPage Today. July 17, 2017.
  22. McCarthy, Michael. “Virtual-reality games reduce pain. Now to answer why.” University of Washington NewsBeat, May 28, 2015.
  23. Rose FD, Brooks BM, Rizzo AA. “Virtual Reality in Brain Damage Rehabilitation: Review.” CyberPsychology & Behavior. Volume 8, Number 3, 2005.
  24. Mosadeghi S., Reid MW, Martinez B., Rosen BT, Spiegel BM. “Feasibility of an Immersive Reality Intervention for Hospitalized Patients: An Observational Cohort Study.” 2016.
  25. Pavlou M., Kanegaonkar RG, Swapp D., et al. “The effect of virtual reality on visual vertigo symptoms in patients with peripheral vestibular dysfunction: a pilot study.” 2012.
  26. Kandalaft MR, Didehbani N., Krawczyk DC, Allen TT, Chapman SB. “Virtual Reality Social Cognition for Young Adults with High-Functioning Autism”. Journal of Autism and Developmental Disorders. 2012.
  27. Rizzo A., Hartholt A., Grimani M. “Virtual Reality Exposure Therapy for Combat-Related Posttraumatic Stress Disorder”. IEEE Computer. July 2014.
  28. Virtual Reality Treatment Program, Duke University. Online. Accessed August 17, 2017.
  29. Iyadurai L., Blackwell SE, Meiser-Stedman R., et al. “Preventing intrusive memories after trauma via a brief intervention involving Tetris computer game play in the emergency department: a proof-of-concept randomized controlled trial.” Molecular Psychiatry. March 2017.
  30. Fitzpatrick KK, Darcy A., Vierhile M. “Delivering Cognitive Behavior Therapy to Young Adults With Symptoms of Depression and Anxiety Using a Fully Automated Conversational Agent (Woebot): A Randomized Controlled Trial.” 2017.
  31. Online. Accessed August 17, 2017.
  32. D’Alfonso S., Santesteban-Echarri O., Rice S., et al. “Artificial Intelligence-Assisted Online Social Therapy for Youth Mental Health”. Frontiers in Psychology. June 2017.
  33. United States Centers for Medicare & Medicaid Services web site. Accessed August 17, 2017.
  34. Klonoff DC. “The Beneficial Effects of a Paleolithic Diet on Type 2 Diabetes and Other Risk Factors for Cardiovascular Disease”. Journal of Diabetes Science and Technology. November 2009.
  35. Jørgensen ME, Grønbæk M., Tolstrup JS. “Alcohol drinking patterns and risk of diabetes: a cohort study of 70,551 men and women from the general Danish population”. Diabetologia. July 2017.
  36. Knutson KL. “Impact of sleep and sleep loss on glucose homeostasis and appetite regulation”. June 2007.
  37. Qiu S., Cai X., Schumann U., et al. “Impact of Walking on Glycemic Control and Other Cardiovascular Risk Factors in Type 2 Diabetes: A Meta-Analysis.” PLOS One. October 17, 2014.
  38. Global Report on Diabetes. World Health Organization. 2016.
  39. Gottlieb S. “Fostering Medical Innovation: A Plan for Digital Health Devices.” U.S. Food & Drug Administration. June 15, 2017.
