Predictive Analytics in Healthcare. Interview with Steve Nathan
“Analysis of big data can identify the subtle differences that explain why similar-seeming patients have different outcomes, and predictive decision support can help physicians guide more patients on the path to recovery”–Steve Nathan.
Why using predictive analytics in healthcare? On this topic, I have interviewed Steve Nathan, CEO at Amara Health Analytics.
Q1. Amara provides real-time predictive analytics to support clinicians in the early detection of critical disease states. Can you tell us a bit more of what are these predictive analytics and which data sets do you use?
Steve Nathan: Our system runs on hospital data, including labs, pharmacy, real-time vitals, ADT, EMR, and clinical narrative text. We have developed domain-specific natural language processing (NLP) and machine learning to find predictive signal that is beyond the reach of traditional approaches to clinical analytics.
Q2. How large are the data sets you analyse?
Steve Nathan: For an average 500 bed hospital we analyse about 100 million data items in real-time annually. We expect this volume of data to increase dramatically as more centers deploy hospital-wide automated real-time streaming of data from patient monitors into the EMR.
Q3. Why early detection is important?
Steve Nathan: Let’s look at the example of sepsis, which is the body’s systemic toxic response to an infection. Clinicians know how to treat sepsis once identified, but it’s often difficult to identify early because it can mimic other conditions and there are a multitude of clinical variables involved, some of which are subjective in nature.
At the later stages of sepsis, mortality increases almost 8% with every hour of delayed treatment. So technology that can assist clinicians in early identification is important.
Q4. What is your back-end platform for real-time data processing and analytics?
Steve Nathan: The major components of the backend are Mirth (open source HL7 parsing/transform), a data aggregator, a real-time data streaming engine, Jess rules engine, the core analytics engine (including NLP pipeline), and Cassandra NoSQL database from DataStax.
Q5 What are the main technical challenges you encounter when you analyse big data sets that are composed of previous patient histories and medical records?
Steve Nathan: Input data comes from a wide variety of centers and is wildly heterogeneous. Issues range from proprietary health IT systems and incompatible usage of available standards like HL7, to wide variations in clinical language used by physicians and nurses in their notes, to varying patterns of diagnostic testing by different providers. And of course every patient is unique. So much of our system is dedicated to producing a standardized timeline for each patient in which every variable has a consistent interpretation, no matter where it originally came from. We then use these patient timelines as input for machine learning, and there are some unique challenges there in developing predictive models that are appropriate for use in real-time decision support.
Q6. Why did you choose a NoSQL database for this task?
Steve Nathan: The primary motivation was flexibility of data representation. We wanted to be able to change schemas dynamically. Also, DataStax’s integration of Cassandra with Solr was very important for us because we do a lot text mining as part of our overall approach.
Q7. What are the lessons learned so far?
Steve Nathan: Processing huge amounts of historical patient data in simulations to refine predictive models is very challenging to do with good performance. We must be able to run many years worth of archived data through the system in a small number of hours, so the system architecture must be designed with this processing mode in mind.
It is important to consider early on how to upgrade the live system with no down time while avoiding the possibility of losing any data.
Q8. What about data protection issues?
Steve Nathan: We comply with HIPAA regulations and go to great lengths to protect data. This includes work processes and policies for all employees, contractors, and business associates. And, importantly, it includes the architecture and deployment of our systems – for example all data is transferred via VPN, and every VPN connection is terminated in a VM that contains a single protected dataset, rather than terminating at a DC router.
Q9. Which relationships exist between Amara and UC San Diego?
Steve Nathan: There is no formal relationship between Amara and UCSD. Amara’s co-founder and chairman, Dr. Ramamohan Paturi, is a professor of Computer Science at UCSD.
Qx. Anything else you wish to add?
Steve Nathan: Analysis of big data can identify the subtle differences that explain why similar-seeming patients have different outcomes, and predictive decision support can help physicians guide more patients on the path to recovery.
Steve Nathan has over 25 years in the enterprise software industry. As CEO of Parity Computing beginning in 2008, Steve led the company’s product expansion with a pioneering analytics platform for enhancing biomedical research productivity. He then led the formation of Amara Health Analytics, the concept and strategy for the Clinical Vigilance™ product line, and the spinout of Amara as an independent company.
Steve has held key leadership positions at recognized technology innovators Sun Microsystems and Cray Research, as well as at start-ups Celerity Computing, Alignent Software, and Exist Global. As General Manager at Sun Microsystems, he had P&L responsibility for messaging, portal, and web infrastructure in the iPlanet business unit, where he grew the annual revenue of this product line from $15M to $150M in three years.
Steve holds B.S. degrees from the University of California at Riverside in Computer Science, Mathematics, and Psychology; and he is a 1999 graduate of the Stanford University Business Executive Program.
Follow ODBMS.org and ODBMS Industry Watch on Twitter: