What Is Data Mining in Healthcare?
patients with certain conditions who were readmitted within 30 or 90 days—you can mine that data to create an accurate predictive algorithm. The following is a high-level description of steps to learn from a historical cohort and create an algorithm:
- Define a time period (the parameters of the historical data).
- Identify all of the patients flagged for readmission in that time period.
- Find everything those patients have in common (lab values, demographic characteristics, etc.).
- Determine which of these variables has the most impact on readmissions. You can do this mathematically using a variety of statistical models.
An Introduction to Training Predictive Algorithms
The process of building and refining an algorithm based on historical data is called training the algorithm. We typically use about two-thirds of the historical cohort to train the algorithm. The other third is used as a test set to assess the accuracy of the algorithm and ensure that it isn’t generating false positives or negatives.
One important aspect of creating a predictive algorithm is getting feedback from clinical experts. Using an algorithm to make an impact in today’s care—which is our goal—requires buy-in from the clinicians delivering care on the frontlines. For them to own the algorithm, trust the data, and incorporate new processes into their workflow, incorporating their feedback is critical.
The More Specific the Algorithm, the Better
Rather than train an algorithm specific to cases like heart transplant or heart failure, many organizations rely on all-cause or general readmissions data to predict readmissions. However, most of these generic algorithms are only about 75 percent accurate. It’s a start, but it isn’t enough.
The extra effort to train an algorithm based on a specific population—say, a cardiac population—will jump the algorithm above 90 percent accuracy. If you can define a very specific problem or population—and identify the characteristics unique to that population—the algorithm will always be better. You can use a generic algorithm as a starting place, but to be truly successful you will need to add factors specific to defined populations.
Readmissions in the Real World: A Health System’s Improvement Initiative
In an ideal situation, health systems would have all of the historical data they needed, would train the algorithm, and would quickly start using predictive analytics to reduce readmissions. In the real world, things can be a little bit messier. Health systems don’t always have the historical data they need at the outset. Sometimes the health system has to improve documentation first and build up the necessary data before launching predictive analytics.
That was the case with one of our health system clients. Rather than starting to implement predictive algorithms immediately, they used an EDW and advanced analytics applications to begin a readmissions initiative with only general readmissions baselines to guide them.
This client decided to begin by focusing on a specific cohort: heart failure (HF) patients. We worked with them to create a data mart for their HF population so they could track readmissions rates and assess how the quality interventions they implemented affected those rates.
The health system’s team gathered best practices from the medical literature and decided to use interventions that included:
- Medication review. Clinicians are required to review medications with HF patients at discharge.
- Follow-up phone calls. A nurse calls to check that the patient is following the health regimen appropriately (within seven days for high-risk cases and 14 days for other cases).
- Follow-up appointment scheduled at discharge.
As they implemented these best practices, the data flowed into the EDW, and the team was able to see:
- How compliant clinicians were in using the best practices.
- How these best-practice interventions affected 30- and 90-day readmissions.
They also began to add other best practices to their set of interventions.
At that point, the client hadn’t yet set up an algorithm to predict risk. Rather, they relied on physicians to flag patients as high risk. Because the health system needs to refine its processes and ensure that the right amount of resources are being devoted to high-risk and rising-risk patients, the team is now turning its attention to predictive algorithms as a method for streamlining its processes and making more effective interventions.
Data mining can also help this health system streamline its efforts by evaluating the relative efficacy of each best practice. For example, if a case manager only has time to apply some of the interventions to a patient, which intervention or combination of interventions will have the most impact?
This brief case study is illustrative of what applying data mining in the real world is all about. If the health system had waited until its stars were perfectly aligned before getting started on its initiative, it might still be waiting today. Perhaps that’s why data mining so often doesn’t make it out of the academic lab and into everyday clinical practice. But this is the type of effort that is required—the determination to iterate step by step in a process of continuous quality improvement.
Data Mining in Healthcare Holds Great Potential
As stated earlier, today’s healthcare data mining takes place primarily in an academic setting. Getting it out into health systems and making real improvements requires three systems: analytics, best practice, and adoption, along with a culture of improvement. We hope that showing these real-world examples inspires your team to think about what is possible when data mining is done right.
Has your organization attempted any data mining initiatives? What predictive models have you implemented successfully? What challenges have you faced juggling fee-for-service and shared-risk contracts? How has analytics helped you tackle those challenges.
Would you like to use or share these concepts? Download this data mining in healthcare presentation highlighting the key main points