Why Predictive Modeling in Healthcare Requires a Data Warehouse

risk stratification are deployed to support population health management. Data sources include home-monitoring, long-term-care, and patient-reported outcomes data.

Level 8 expands the role of analytics to include wellness management, physical and behavioral functional health, and mass customization of care. Prescriptive analytics – a combination of insights from predictive analytics with clinical decision support – are available at the point of care to help clinicians determine which interventions are appropriate for each patient. In the future, the data content at this level will include continuous biometric data, genomic data, and familial data.

Healthcare Analytics

Note that predictive analytics don’t emerge in this model until Level 7 (although they could arguably be used in Level 6, as well). Organizations that attempt to leapfrog the earlier levels in order to apply predictive tools will find their efforts hampered by an inadequate infrastructure. For example, an organization cannot do predictive modeling before it has even automated the reporting process in its EDW.


To use predictive analytics effectively for improving patient outcomes and managing population health, an organization must have an EDW. Yet by 2011, only approximately 30 percent of U.S. hospitals and healthcare systems had an EDW; a 2013 report suggests that number hasn’t changed much [21-22]. Moreover, the vast majority of those EDWs use an antiquated architecture that isn’t flexible enough to make the insights of predictive analytics actionable.

To understand why, one must know the difference between “early-binding” and “Late-Binding™” models for data warehouses.

Data can be “bound” to business rules that are implemented as algorithms, calculations, and inferences acting upon that data. In healthcare, this data binding may be done to calculate length of stay, to attribute a primary care provider to a particular patient with a chronic disease, or to define disease states for patient registries, among other things. In addition, data can be bound to vocabulary terms such as patient identifier, provider identifier, location of service, gender, diagnosis code, and procedure code.
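To make the idea of binding concrete, the following minimal Python sketch applies one business rule (length of stay) and one vocabulary mapping (gender) to a single encounter record. The field names, the code table, and the definition of length of stay as whole days between admission and discharge are all assumptions made for illustration; real source systems and organizational rules will differ.

```python
from datetime import date

# Hypothetical raw encounter row as it might arrive from a source system.
encounter = {
    "patient_id": "P001",
    "admit_date": date(2013, 3, 1),
    "discharge_date": date(2013, 3, 5),
    "sex_code": "F",
}

# Vocabulary binding (assumed table): source-specific code -> shared term.
GENDER_VOCABULARY = {"F": "female", "M": "male", "U": "unknown"}


def length_of_stay(row):
    """Business rule (assumed): whole days between admission and discharge."""
    return (row["discharge_date"] - row["admit_date"]).days


def bind(row):
    """Return a copy of the row with the rule and the vocabulary applied."""
    bound = dict(row)
    bound["length_of_stay_days"] = length_of_stay(row)
    bound["gender"] = GENDER_VOCABULARY.get(row["sex_code"], "unknown")
    return bound


bound_encounter = bind(encounter)
print(bound_encounter["length_of_stay_days"], bound_encounter["gender"])
```

The raw row is left untouched and the bound values are added to a copy, so a different rule or vocabulary could be applied to the same source data later.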

Early-binding models, which characterize most legacy EDWs, are based on large software programs that bind data to business rules or vocabularies before the programs are compiled. By definition, these are static models. Modifying the software to accommodate new business rules or requirements is a very time- and labor-intensive process that can take 12 to 18 months to complete in a large organization. By the time the changes have been made for a specific kind of predictive modeling, the use cases may have changed, requiring a whole new set of modifications in the program.

In a Late-Binding™ model, programs are broken down into modules or objects that support particular business services and processes. These modules are assembled as needed at run time, rather than being compiled beforehand. By using this kind of architecture, an EDW can provide analytic value in days or weeks rather than months or years. Such an approach allows organizations to adapt easily to changing requirements. More information is available at /choosing-the-best-healthcare-analytics-solutions-html.
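The contrast can be sketched in a few lines of Python. This is only an illustration of the general late-binding idea, not the architecture of any particular product: raw rows are stored unchanged, and a named rule is looked up and applied at query time. The `RULES` registry, the function names, and the lab thresholds are hypothetical and are not clinical guidance.

```python
# Raw source rows are stored unchanged; no rule is baked in at load time.
RAW_LABS = [
    {"patient_id": "P001", "test": "hba1c", "value": 6.7},
    {"patient_id": "P002", "test": "hba1c", "value": 6.2},
    {"patient_id": "P003", "test": "hba1c", "value": 8.1},
]

# Rule registry: each named rule is a small, replaceable module.
RULES = {}


def register_rule(name, predicate):
    """Add or replace the rule stored under the given name."""
    RULES[name] = predicate


def build_registry(rule_name, rows):
    """Bind rows to whichever rule is registered under rule_name right now."""
    rule = RULES[rule_name]
    return sorted({row["patient_id"] for row in rows if rule(row)})


# Initial (illustrative) registry definition.
register_rule("diabetes", lambda r: r["test"] == "hba1c" and r["value"] >= 7.0)
first = build_registry("diabetes", RAW_LABS)

# The definition changes; only the rule is swapped, the data is not reloaded.
register_rule("diabetes", lambda r: r["test"] == "hba1c" and r["value"] >= 6.5)
second = build_registry("diabetes", RAW_LABS)

print(first, second)
```

In an early-binding design, the first threshold would have been applied when the data was loaded, so adopting the second definition would mean changing the load program and reprocessing the data.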

The following fundamental principles apply for all data modeling, especially when used in predictive analytics:

  • The key to success in data warehouses is relating data, not modeling data. Data should be modeled only to the extent necessary.
  • Data from various source systems should be leveraged directly to minimize the amount of data normalization required.
  • Data models should be applied to mapped subsets of data, such as EHR, claims, prescription, cost, and patient satisfaction data.
  • Some core data elements are fundamental to nearly all analytic use cases in healthcare. Those elements can be bound early, but remaining data should be bound to other terms and business rules later and only when required by use cases.
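As a rough illustration of relating data rather than modeling it, the sketch below links rows from two source subsets through one early-bound core element, a shared patient identifier, while leaving every row in its native shape. The source names, fields, and values are invented for the example, and it assumes the patient identifier has already been reconciled across sources.

```python
from collections import defaultdict

# Hypothetical rows from two source systems, kept in their native shapes.
EHR_ROWS = [
    {"patient_id": "P001", "diagnosis_code": "DX1"},
    {"patient_id": "P002", "diagnosis_code": "DX2"},
]
CLAIMS_ROWS = [
    {"patient_id": "P001", "paid_amount": 1200.0},
    {"patient_id": "P001", "paid_amount": 300.0},
    {"patient_id": "P002", "paid_amount": 80.0},
]


def relate(sources):
    """Group rows from each source by patient_id without reshaping them."""
    linked = defaultdict(lambda: defaultdict(list))
    for source_name, rows in sources.items():
        for row in rows:
            linked[row["patient_id"]][source_name].append(row)
    return linked


patients = relate({"ehr": EHR_ROWS, "claims": CLAIMS_ROWS})

# A use case binds further rules only when it needs them, e.g. total paid.
total_paid = sum(row["paid_amount"] for row in patients["P001"]["claims"])
print(total_paid)
```

Only the patient identifier is bound up front; the cost calculation is applied later, when a use case calls for it.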


Predictive analytics are rapidly emerging as a “must-have” class of analytics tools that healthcare organizations can use to manage population health, reduce readmissions, and improve patient outcomes. But providers should not have unrealistic expectations of what these analytics can do. The type and quality of available data, including outcomes data, limit the usefulness of predictive analytics. In addition, organizations must couple these analytics with other tools, such as outreach and care management applications, to realize the full potential of predictors in patient care.

Before deploying predictive modeling tools, healthcare systems should develop sophisticated data warehouses. Studies over the past few years show that most organizations still have work to be done in this area, although the path to achieving high-functioning data warehouses is clear. While point solutions and outsourcing options are available, and some EHR vendors offer analytics packages as well, building an EDW that allows the rapid assembly of patient data in context offers the most flexibility and the greatest range of possibilities for using predictive analytics.

The type of data warehouse that an organization chooses is critically important to its eventual success in using predictive analytics. A late-binding EDW enables the healthcare system to move nimbly from one use case to another, providing the timely insights that clinicians can translate into effective action. In contrast, early-binding models are static monoliths that are time-consuming and difficult to modify as the requirements of clinicians, administrators, and regulators change.

Ultimately, the organization that makes the effort to build an advanced, late-binding enterprise data warehouse will be able to implement predictive analytics successfully for a wide variety of purposes. More importantly, the organization will also be able to gauge the effectiveness of the resulting interventions. Such organizations will be well prepared to meet the manifold challenges they face as healthcare transforms.


  1. Ken Terry, “Futuristic Clinical Decision Support Tool Catches On,” InformationWeek Healthcare, Jan. 27, 2012, accessed at https://www.informationweek.com/healthcare/clinical-information-systems/futuristic-clinical-decision-support-tool-catches-on/d/d-id/1102497.
  2. Ben-Chetrit E, Chen-Shuali C, Zimran E, Munter G, Nesher G. “A simplified scoring tool for prediction of readmission in elderly patients hospitalized in internal medicine departments.” Isr Med Assoc J. 2012 Dec;14(12):752-6.
  3. Hasan O, Meltzer DO, Shaykevich SA, Bell CM, Kaboli PJ, Auerbach AD, Wetterneck TB, Arora VM, Zhang J, Schnipper JL. “Hospital readmission in general medicine patients: a prediction model.” J Gen Intern Med. 2010 Mar;25(3):211-9. doi: 10.1007/s11606-009-1196-1. Epub 2009 Dec 15.
  4. Donze J, Aujesky D, Williams D, Schnipper JL. “Potentially avoidable 30-day hospital readmissions in medical patients: derivation and validation of a prediction model.” JAMA Intern Med. 2013 Apr 22;173(8):632-8. doi: 10.1001/jamainternmed.2013.3023.
  5. J. Frank Wharam and Jonathan P. Weiner, “The Promise and Peril of Healthcare Forecasting.” Am J Manag Care. 2012;18(3):e82-e85.
  6. Ibid.
  7. HIMSS Analytics, “Clinical Analytics: Can Organizations Maximize Clinical Data?” June 7, 2010. Accessed at http://www.himss.org/files/HIMSSorg/content/files/AnvitaWhitePaper.pdf.
  8. HIMSS Analytics, “Clinical Analytics in the World of Meaningful Use,” Feb. 2011. Accessed at http://www.himss.org/files/himssorg/content/files/20110221_Anvita.pdf.
  9. Institute for Healthcare Technology Transformation, Analytics: The Nervous System of IT-Enabled Healthcare.
  10. Terry, “EHR Data Not Ready for Prime Time, Studies Show,” iHealthBeat, Feb. 9, 2012, accessed at http://www.ihealthbeat.org/insight/2012/ehr-data-not-ready-for-prime-time-studies-show.
  11. Terry, “ACOs Need Claims Data for Analytics, Expert Says.” InformationWeek Healthcare, Sept. 16, 2013, accessed at https://www.informationweek.com/healthcare/electronic-health-records/acos-need-claims-data-for-analytics-expert-says/d/d-id/1111552.
  12. Kogon B, Jain A, Oster M, Woodall K, Kanter K, Kirshbom P, “Risk factors associated with readmission after pediatric cardiothoracic surgery.” Ann Thorac Surg. 2012 Sep;94(3):865-73. doi: 10.1016/j.athoracsur.2012.04.025. Epub 2012 Jun 8.
  13. Sunil Kripalani, Amy T. Jackson, Jeffrey L. Schnipper, and Eric A. Coleman, “Promoting Effective Transitions of Care at Hospital Discharge,” Journal of Hospital Medicine 2007;2:314–323.
  14. Stephen F. Jencks, Mark V. Williams, and Eric A. Coleman, “Rehospitalizations Among Patients in the Medicare Fee-for-service Program,” N Engl J Med 2009; 360:1418-1428.
  15. HealthDay News, “Rothman Index Helps ID Patients at Risk for Readmission,” Aug. 27, 2013, accessed at https://www.nursingcenter.com/healthdayarticle?Article_id=679464.
  16. Ian Duncan, Healthcare Risk Adjustment and Predictive Modeling (Winsted, CT: ACTEX Publications, 2011).
  17. Colorado Beacon Consortium, Issue Brief, Vol. 2, Issue 2, accessed at http://files.ctctcdn.com/082eb2a6001/4023d0d5-6532-4471-a017-8a30c71c0acc.pdf.
  18. Ibid.
  19. Natalie Burg, “How Tech Tools Make Supply Chains Less Risky,” Forbes, Sept. 10, 2013, accessed at https://www.forbes.com/sites/ups/2013/09/10/how-tech-tools-make-supply-chains-less-risky/#79b3ab23d170.
  20. HIMSS Analytics press release, “HIMSS Analytics Releases 2013 U.S. Clinical & Business Intelligence Survey,” Sept. 9, 2013.
  21. HIMSS Analytics, “Clinical Analytics in the World of Meaningful Use,” Feb. 2011. Accessed at http://www.himss.org/files/himssorg/content/files/20110221_Anvita.pdf.
  22. Ken Terry, “Hospitals in Early Stage of Analytics Usage,” InformationWeek Healthcare, Sept. 11, 2013, accessed at https://www.informationweek.com/healthcare/clinical-information-systems/hospitals-in-early-stage-of-analytics-usage/d/d-id/1111502.

