This article is based on a 2020 Healthcare Analytics Summit (HAS 20 Virtual) presentation titled “Machine Learning, Social Determinants, and Data Selection for Population Health” by Terri H. Steinberg, MD, MBA, FACP, Chief Health Information Officer, Vice President for Population Health Informatics, ChristianaCare, and Jason Jones, PhD, Chief Data Scientist, Health Catalyst.
The use of artificial intelligence (AI) and, specifically, predictive models to identify the most vulnerable patient populations is a strategic approach to managing population health initiatives. By sorting patients based on risk level and identifying clusters of need, health system team members can perform outreach and interventions to maximize the quality of patient care and the predictive model’s effectiveness.
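As a rough illustration of sorting patients by risk level, the sketch below groups patients into tiers from a model's predicted risk score. The `Patient` class, the tier names, and the cut points are all hypothetical and chosen only for this example; a real program would set thresholds with clinical input.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    risk_score: float  # model-predicted probability of a poor outcome, 0.0 to 1.0


# Hypothetical cut points, used here only for illustration.
HIGH_RISK = 0.60
MODERATE_RISK = 0.30


def risk_tier(score: float) -> str:
    """Map a predicted risk score to a named tier."""
    if score >= HIGH_RISK:
        return "high"
    if score >= MODERATE_RISK:
        return "moderate"
    return "low"


def stratify(patients):
    """Group patient IDs by tier, highest-risk patients first within each tier."""
    tiers = {"high": [], "moderate": [], "low": []}
    for p in sorted(patients, key=lambda p: p.risk_score, reverse=True):
        tiers[risk_tier(p.risk_score)].append(p.patient_id)
    return tiers


# Made-up scores for four fictional patients.
patients = [
    Patient("A", 0.72),
    Patient("B", 0.15),
    Patient("C", 0.41),
    Patient("D", 0.88),
]
tiers = stratify(patients)
print(tiers)
```

Ordering patients within each tier gives outreach teams a worklist that starts with the highest-risk patients, though what happens next for each tier remains a human decision.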
Building a successful model requires the right technology stack, human oversight and intervention, and most importantly, quality data to fuel the machine. Disparities in data collection, the type of data sources available, and limited interoperability between systems can make the data component the most challenging and time-consuming piece of the AI puzzle. Putting the upfront time and resources into managing and understanding available data allows organizations to more easily build predictive AI models to support their population health efforts.
Organizations have access to hundreds, if not thousands, of data points from various sources—EMRs, Health Information Exchanges (HIEs), claims, social determinants of health, etc.—and these sources often have overlapping data. Thoroughly understanding the desired model and outcome allows team members to zero in on the best data sources to utilize.
It’s important to note that more is not always better: loading the model with unnecessary data points can make it harder to interpret and to maintain. The goal should be to build the simplest model possible with the maximum predictive power.
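One way to make "simplest model with the maximum predictive power" concrete is to compare candidate models and keep the one with the fewest inputs whose validation score is within a small tolerance of the best. The sketch below assumes each candidate has already been evaluated; the candidate names, feature counts, scores, and the `tolerance` value are made up for illustration and are not results from the presentation.

```python
def pick_simplest(candidates, tolerance=0.01):
    """Return the candidate with the fewest features among those whose
    validation score is within `tolerance` of the best score.

    candidates: list of (name, n_features, validation_score) tuples.
    """
    best_score = max(score for _, _, score in candidates)
    good_enough = [c for c in candidates if c[2] >= best_score - tolerance]
    return min(good_enough, key=lambda c: c[1])


# Illustrative numbers only.
candidates = [
    ("claims only", 12, 0.710),
    ("claims + EMR", 35, 0.780),
    ("claims + EMR + SDOH", 60, 0.785),
    ("everything available", 400, 0.787),
]

chosen = pick_simplest(candidates)
print(chosen)
```

In this made-up comparison, the 35-feature model is chosen because the larger models improve the score by less than the tolerance while adding many more inputs to maintain.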
Building the model itself is fairly simple compared to the data management component, and that data work is specific to each organization. Data scientists should therefore avoid forcing a model built and used for one organization to work elsewhere. Patient populations, data collection practices, data sources, and more can vary widely from one health system to another, rendering a “recycled” model useless in a new environment. Data users at each organization must determine their own data types, data sources, and desired outcomes.
Both the input and output of data for a predictive AI model require certain technology. In addition to platforms used internally by health systems, integrated patient tools, such as member portals and biometric devices, can provide meaningful data to help segment populations that may be at risk for poor outcomes. Population health EMRs optimized to provide AI-generated recommendations directly into existing workflows create a seamless experience and allow team members to provide more timely care.
On the backend, a data and analytics platform that can aggregate data from all sources is another must-have in the technology stack. A robust data platform can apply business logic to the data and make the results actionable, provide clear dashboards for understanding financial and clinical performance, and generate predictions for population health management.
Lastly, an interoperability platform that allows information to flow freely between systems is essential. For example, an open platform like the Health Catalyst Data Operating System (DOS™) enables health systems to extract data from various source systems and aggregate disparate data sets using healthcare-specific terminology. The result is powerful analytics and insights that support predictive models.
While machines are great at recognizing patterns and running calculations, they don’t take the place of reasoning, logic, and interventions that team members must perform. Human oversight and input are critical in determining data sets and resources available and defining the appropriate outreach or intervention for each outcome or patient population. The machine’s output doesn’t necessarily determine the next steps—it merely allows for more informed risk stratification and identifies opportunities for patient engagement.
Surprisingly, a successful predictive AI model will appear to degrade over time. A newly launched model may identify large patient populations as being at risk. Over time, however, when paired with an effective intervention, that model will appear to be incorrect.
For example, a model may say that a patient is at risk for hospital readmission. In response, the care team puts the appropriate intervention in place and prevents the readmission. Suddenly, it looks as if the model is wrong because the patient wasn’t actually readmitted. When this disconnect occurs, predictive model users need to capture what the interventions are and understand why performance is going down.
Machines can help distinguish whether the low performance results from genuinely poor predictions, caused by changes in the data or in patients’ risk levels, or from a correctly implemented intervention. If the latter is true, then the team has created a successful model.
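A simple first check, assuming interventions are recorded alongside predictions, is to compare outcomes among flagged patients who received an intervention with outcomes among flagged patients who did not. The record fields (`flagged`, `intervention`, `readmitted`) and the data below are hypothetical and serve only to show the idea.

```python
def event_rate(records):
    """Fraction of records with a readmission, or None if there are no records."""
    if not records:
        return None
    return sum(1 for r in records if r["readmitted"]) / len(records)


def compare_flagged(records):
    """Readmission rates among model-flagged patients, split by whether
    an intervention was recorded for them."""
    flagged = [r for r in records if r["flagged"]]
    treated = [r for r in flagged if r["intervention"]]
    untreated = [r for r in flagged if not r["intervention"]]
    return {
        "with_intervention": event_rate(treated),
        "without_intervention": event_rate(untreated),
    }


def make(flagged, intervention, readmitted):
    """Build one illustrative patient record."""
    return {"flagged": flagged, "intervention": intervention, "readmitted": readmitted}


# Made-up records: 4 flagged with intervention, 4 flagged without, 2 not flagged.
records = (
    [make(True, True, False)] * 3 + [make(True, True, True)]
    + [make(True, False, True)] * 3 + [make(True, False, False)]
    + [make(False, False, False)] * 2
)

rates = compare_flagged(records)
print(rates)
```

If flagged patients without an intervention are still readmitted at a high rate while those with one are not, the apparent drop in accuracy is more plausibly the intervention working than the model failing. Because patients are not assigned to interventions at random, this is a descriptive check rather than a causal estimate.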
Building successful predictive models for population health requires the right combination of data, technology, and human intervention. The journey requires continual learning, understanding the data fueling the outcomes, and optimizing models and interventions for the most predictive performance and best quality of care.