Why Predictive Modeling in Healthcare Requires a Data Warehouse

getting patients involved. For example, patients might be asked to fill out online functional status surveys at regular intervals. In select healthcare settings, remote monitoring data may also be routinely available. By feeding this kind of data into the predictive model for the target patient population over time, and analyzing it by age, gender, medication, geographical location, and other variables, researchers can develop much more specific predictive models than they could with a general hospital or ambulatory care population.

It is impossible to aggregate and normalize this information for analysis without an advanced data warehouse that allows continuous updating and flexible report generation. To deliver actionable insights, moreover, this data warehouse must be able to integrate all of the available information on a patient in the context of what clinicians want to know. In a rich EDW environment where patient details are available in this context and can be fed into a predictive tool, the interventions driven by that predictor are more likely to be successful than they would be when a single-purpose, point solution is applied to data in an information silo.

Population Health Management

One of the fastest ways to derive value from predictive modeling is to apply it to population health management. This involves a form of predictive analytics known as risk stratification, which classifies patients by their risk of getting sick or sicker within the next year or some other time period.

In population health management, this ability is critical, because only 30 percent of patients who are high-risk today were in that category a year ago.16 By accurately predicting who is most likely to get sick, organizations can set priorities and focus their care management and patient engagement activities on the people who need them most.

In a 2013 Issue Brief published by the Colorado Beacon Consortium, Asaf Bitton, M.D., MPH, FACP, noted, “Risk stratification is an intentional, planned and proactive process carried out at the practice level to effectively target clinic services to patients.” In that same Brief, the Consortium’s Executive Director, Patrick Gordon, identified three goals for risk stratification:

  • Predict risks
  • Prioritize interventions
  • Prevent negative outcomes (e.g., disability and death, as well as unnecessary costs)17

Risk stratification requires sophisticated algorithms, robust registries or data warehouses, and the ability to integrate multiple sources of data. “The more data you have, the better able you are to predict outcomes,” Gordon noted. “Access to more actionable data within a process driven by clinical judgment and shared decision-making improves the ability of a practice team to proactively align resources with patient needs.”18
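As a rough illustration of the predict-and-prioritize steps described above, the following Python sketch assigns each patient a score and sorts patients into risk tiers. The fields, weights, and cutoffs are hypothetical and chosen only for the example. They are not taken from the Issue Brief or from any validated model, and a production algorithm would be fitted to historical outcomes and reviewed clinically.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    age: int
    chronic_conditions: int    # count of diagnosed chronic conditions
    admissions_last_year: int  # inpatient admissions in the past 12 months


def risk_score(p: Patient) -> float:
    """Toy additive score. The weights are assumptions for illustration only."""
    score = 0.0
    if p.age >= 65:
        score += 1.0
    score += 1.5 * p.chronic_conditions
    score += 2.0 * p.admissions_last_year
    return score


def stratify(patients, high=5.0, medium=2.0):
    """Group patient IDs into tiers, highest score first within each tier."""
    tiers = {"high": [], "medium": [], "low": []}
    for p in sorted(patients, key=risk_score, reverse=True):
        s = risk_score(p)
        tier = "high" if s >= high else "medium" if s >= medium else "low"
        tiers[tier].append(p.patient_id)
    return tiers


# Made-up sample patients, not real data.
patients = [
    Patient("A", 72, 3, 1),  # 1.0 + 4.5 + 2.0 = 7.5
    Patient("B", 45, 1, 0),  # 1.5
    Patient("C", 67, 1, 0),  # 1.0 + 1.5 = 2.5
]
tiers = stratify(patients)
```

A care management team could then work through the "high" tier first. Because risk changes from year to year, the scores would need to be recomputed as new data arrives in the warehouse.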

Administrative Applications

An organization can also use predictive analytics to increase the efficiency of certain operations. For example, if there are recurring disease patterns – such as biannual outbreaks of upper respiratory infections in children or a spike in asthma attacks triggered by poor air quality in certain seasons – algorithms can be devised to help healthcare systems better manage their supply chain and staffing.19 If an organization can predict changes in demand for services, it can ensure that sufficient supplies are on hand or that nurse staffing is adequate to take care of patients on a particular shift.
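One simple way to picture this kind of demand prediction is a seasonal average: past visit volumes are averaged by calendar month, and the result is converted into a staffing estimate. The Python sketch below assumes a hypothetical history of average daily visits and a hypothetical capacity of 20 visits per nurse per shift. All figures are invented for the example, and a real forecast would likely use a richer model.

```python
from collections import defaultdict
from math import ceil


def seasonal_forecast(history):
    """Average past visit counts by calendar month.

    history: iterable of (year, month, average_daily_visits) tuples.
    Returns {month: mean of that month's values across all years}.
    """
    by_month = defaultdict(list)
    for _year, month, visits in history:
        by_month[month].append(visits)
    return {m: sum(v) / len(v) for m, v in by_month.items()}


def nurses_needed(expected_visits, visits_per_nurse=20):
    """Round up so the expected volume is always covered."""
    return ceil(expected_visits / visits_per_nurse)


# Made-up history: winter months run well above summer months.
history = [
    (2012, 1, 410), (2012, 7, 180),
    (2013, 1, 450), (2013, 7, 200),
]
forecast = seasonal_forecast(history)
staffing = {month: nurses_needed(v) for month, v in forecast.items()}
```

With these sample numbers, the January estimate is more than double the July estimate, which is the kind of signal a scheduler or supply manager could act on in advance.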


Healthcare Solutions

Healthcare organizations can approach the use of analytics solutions, including predictive analytics, in several different ways. One option is to outsource their business and clinical intelligence work to analytics service providers. This approach, which doesn’t require any investment in hardware, software, or internal expertise, can help providers improve their internal and external reporting and can enable them to benchmark their performance. But the providers who outsource these functions are limited in the kind of analyses they can perform and are unable to adapt the analytic tools to meet their specific needs.

Second, organizations can adopt “best of breed” point solutions. These standalone applications provide detailed analytics for a specific domain, such as readmissions, but don’t supply an overall perspective on patient care and costs. They also can’t be easily integrated into a broader infrastructure that would increase their usefulness.

A standalone predictive tool cannot be used to analyze the health risks of subpopulations because the data is not readily available. For example, unless an organization leverages a data warehouse for predictive analytics, it can’t produce comprehensive reports on all patients over 65, all women who recently gave birth, or all people who went to the ER because they recently overdosed on drugs. The warehouse environment allows pertinent but disparate data sources to be mapped and combined. This is the kind of complete information that a predictor needs to distinguish the signal from the “noise” in the data and make accurate forecasts.
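To make the idea of mapping and combining sources concrete, the sketch below uses Python's built-in sqlite3 module as a stand-in for a warehouse. The two tables, their columns, the sample rows, and the reference year are all hypothetical. A real EDW would hold far more sources and a governed data model, but the principle of joining patient-level data and then filtering by subpopulation is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients   (patient_id TEXT PRIMARY KEY, birth_year INTEGER, sex TEXT);
    CREATE TABLE encounters (patient_id TEXT, setting TEXT, reason TEXT);
    -- Made-up rows standing in for demographic and encounter feeds.
    INSERT INTO patients VALUES ('A', 1940, 'F'), ('B', 1985, 'F'), ('C', 1990, 'M');
    INSERT INTO encounters VALUES
        ('A', 'inpatient', 'pneumonia'),
        ('B', 'inpatient', 'delivery'),
        ('C', 'ER', 'overdose');
""")


def subpopulation(where_clause, params=()):
    """Return patient IDs matching a filter over the joined tables.

    where_clause must be a trusted literal written by the analyst;
    user-supplied values go through params, never into the SQL text.
    """
    sql = f"""
        SELECT DISTINCT p.patient_id
        FROM patients p
        LEFT JOIN encounters e ON e.patient_id = p.patient_id
        WHERE {where_clause}
        ORDER BY p.patient_id
    """
    return [row[0] for row in conn.execute(sql, params)]


REFERENCE_YEAR = 2014  # assumed reporting year for the age calculation

over_65 = subpopulation("? - p.birth_year > 65", (REFERENCE_YEAR,))
recent_births = subpopulation("e.reason = ?", ("delivery",))
er_overdoses = subpopulation("e.setting = ? AND e.reason = ?", ("ER", "overdose"))
```

Each list could then be handed to a predictive tool as its target population. A standalone tool that sees only one of the two tables could not answer all three questions.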

Some healthcare organizations look to their electronic health record (EHR) vendors for analytics capabilities. According to a recent survey, more than half of the hospitals that use clinical and business intelligence employ the analytic modules embedded in their EHR or hospital information system. Such tools can be used to generate reports related to Meaningful Use objectives.20 But these analytics often lack robustness and flexibility. In addition, the data they use comes solely from the EHR. As a result, the analytics lack an integrated view of clinical, financial, administrative, and patient satisfaction data.

Finally, organizations could build an optimal infrastructure for generating analytic insights before deploying predictive analytics. Such an approach, based on the use of an advanced enterprise data warehouse (EDW), provides the highest degree of analytics flexibility and adaptability. It can drive an analytics strategy that will enable an organization to adapt to both short-term and long-term changes in healthcare. Most importantly, this rich EDW environment enables meaningful intervention if the organization connects its analytics to care management. More information is available at /choosing-the-best-healthcare-analytics-solutions-html.


Organizations that take the road to predictive analytics described above should study the Analytics Adoption Model, which was developed by a group of healthcare industry veterans, including hospital CIOs and healthcare consultants. This eight-level model provides a road map that organizations can use to measure their own progress toward analytics adoption.

Level 1 of this schema consists of fragmented point solutions that are not integrated with a data repository or with each other. In Level 2, organizations build an enterprise data warehouse (EDW) for clinical and administrative data with a master vocabulary, a patient registry, and basic data governance.

In Levels 3 and 4, providers begin to use the warehouse for automated internal and external reporting. Key performance indicators are visible to both frontline managers and executives. Analytics are used to produce reports required for regulatory and accreditation purposes, specialty society databases, and payer incentives (e.g., the Meaningful Use EHR incentive program, the Physician Quality Reporting System, and the Medicare value-based purchasing program).

The goal of analytics in Level 5 is to measure clinical effectiveness that maximizes quality and minimizes waste and variability. Data governance supports care management teams involved in population health management. The EDW is expanded to include clinical data from labs and pharmacies, as well as claims data.

Level 6 is designed for organizations that take bundled payments and accountable care organizations that share financial risk and reward. Analytics are available at the point of care to help organizations achieve the Triple Aim of improving quality, efficiency, and the patient experience.

In Level 7, analytics are further expanded to address fixed-fee reimbursement models (i.e., risk contracts). Predictive modeling and
