The Healthcare Data Warehouse: Evolved for Today’s Analytics Demands


Far from the first generation of reporting systems, today’s enterprise data warehouse (EDW) is a robust, agile tool that’s becoming indispensable in the contemporary healthcare environment. The healthcare data warehouse is evolving in the way it captures, analyzes, and consumes data, and so is the industry’s insatiable need for immediate data. The most effective, comprehensive EDW can manage the increasing demands for timely, quality data.

Enabling Multiple Data Sources and More Timely Reporting

Before the EDW, data teams used a decision support server (DSS). This system was typically a copy of a transactional source (e.g., EMR, claims, or billing). Before the DSS, transaction systems were subject to slowdowns when users made large report requests during busy transaction times. By regularly copying the transaction system’s data to a DSS during the least busy times, teams could offload reports to the DSS without causing slowdowns or lockups on the source transaction system. This allowed deeper analysis of a system’s data without affecting that system’s other users.

The constraint of the DSS, however, was that it typically represented just one source system, and the data remained structured like the source, which meant it was optimized for transactions (inserts, updates) and not reporting. So, while a DSS wasn’t slowing down the transaction system, it wasn’t necessarily fast at creating reports.

As data stores grew, the need for something larger (with more data capacity) and more versatile emerged: an EDW. The EDW has key data across an entire organization, and, unlike the single-source DSS, it represents many transactional systems and/or data sets. The EDW combines data into one area for more convenient analysis and reporting. Additionally, the EDW’s ability to do analysis across multiple data sets adds tremendous value: instead of separately reporting patient satisfaction and length of stay (LOS), users can look at this information together to help draw conclusions about the cause and effect of healthcare practices.
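As a minimal sketch of what looking at two data sets together can mean in practice, the Python below joins length of stay and patient satisfaction on a shared encounter ID and compares average scores for short and long stays. The encounter IDs, values, five-day cutoff, and function names are all made up for illustration; a real EDW would do this with its own tables and keys.

```python
from statistics import mean

# Hypothetical extracts from two separate source systems, keyed by encounter ID.
length_of_stay = {"E1": 2, "E2": 7, "E3": 3, "E4": 9}         # days, e.g., from the EMR
satisfaction = {"E1": 9, "E2": 5, "E3": 8, "E4": 4, "E5": 7}  # survey score, 0-10


def combine(los_by_encounter, score_by_encounter):
    """Join the two data sets on encounter ID, keeping encounters present in both."""
    shared = sorted(los_by_encounter.keys() & score_by_encounter.keys())
    return [(eid, los_by_encounter[eid], score_by_encounter[eid]) for eid in shared]


def mean_score_by_stay(rows, long_stay_days=5):
    """Average satisfaction for short stays versus long stays (cutoff is an assumption)."""
    short = [score for _, los, score in rows if los < long_stay_days]
    long_ = [score for _, los, score in rows if los >= long_stay_days]
    return {"short": mean(short), "long": mean(long_)}


rows = combine(length_of_stay, satisfaction)
summary = mean_score_by_stay(rows)
print(summary)
```

A comparison like this shows only that two measures move together. Drawing conclusions about cause and effect still takes clinical context and further analysis.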

The EDW can also transform data from organizational sources to optimize it for reading and reporting. Instead of using a copy of data (as with the typical DSS), the EDW modifies the data into a format that’s easier to report on—which adds efficiency to the EDW’s benefits as a reporting system.

Though more organizations prefer an EDW for reporting, some health systems are holding onto outdated EDW perspectives. They’re missing the EDW’s real potential, from storing massive amounts of diverse data (e.g., clinical, claims, financial, patient satisfaction, and pharmacy data) to generating actionable, near real-time (NRT) reports that drive process and quality improvement. Organizations that haven’t embraced the evolved functions of today’s EDWs, or backed them with the investment and technology they warrant, risk missing out on this highly valuable and agile tool.

Four Misconceptions that Sell the EDW Short

Health systems that hold onto an older EDW perspective make it a lower priority among their technology. This means they don’t provide the EDW with adequate technology investment, ongoing support, and necessary upgrades. There are four common misconceptions about the EDW that significantly limit its capabilities and cause organizations to miss out on important potential:

Misconception #1: A data lag of 24 hours or more is always acceptable in an EDW

EDW users have become accustomed to warehouse data sources loading on a nightly, or sometimes weekly, basis. For some types of reporting, a nightly or weekly load is sufficient, but the thinking about how often data is loaded must adapt to the business need. Many assume less frequent loads are due to constraints in the warehouse’s ability. That’s not necessarily true for many technologies that support an EDW today, however.

Many of today’s transactional sources have advanced data locking mechanisms that allow more frequent access to data—and thus more frequent loading to an EDW—without interrupting transactional processing. This allows data loading more often throughout the day (versus only at night), distributing EDW processing over a wider period. There may be opportunities to load data three times a day, every hour, or in some instances, approach NRT and load data every 15 minutes or less. By rethinking the EDW loading process, health systems can load smaller amounts of data more frequently and distribute resources needed for data loading more efficiently. As a result, they benefit from more timely access to data.
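One common way to load smaller amounts of data more frequently is to track a "watermark" (the latest change time already loaded) and copy only rows changed since then. The sketch below assumes each source row carries an `updated_at` timestamp and an `id` key; those field names, the in-memory `warehouse` dictionary, and the `incremental_load` function are hypothetical stand-ins, not any particular vendor's mechanism.

```python
from datetime import datetime, timedelta


def incremental_load(source_rows, warehouse, last_loaded_at):
    """Copy only rows updated after the previous load; return the new watermark."""
    changed = [r for r in source_rows if r["updated_at"] > last_loaded_at]
    for row in changed:
        warehouse[row["id"]] = row  # insert or overwrite by key
    new_watermark = max((r["updated_at"] for r in changed), default=last_loaded_at)
    return new_watermark, len(changed)


start = datetime(2024, 1, 1, 8, 0)
source = [
    {"id": 1, "updated_at": start + timedelta(minutes=5), "status": "admitted"},
    {"id": 2, "updated_at": start + timedelta(minutes=10), "status": "admitted"},
]
warehouse = {}

# The first load picks up both rows.
watermark, loaded_first = incremental_load(source, warehouse, start)

# Row 1 changes in the source; the next load moves only that row.
source[0] = {"id": 1, "updated_at": start + timedelta(minutes=25), "status": "discharged"}
watermark, loaded_second = incremental_load(source, warehouse, watermark)

print(loaded_first, loaded_second, warehouse[1]["status"])
```

Because each cycle touches only what changed, the work per load stays small, which is what makes hourly or 15-minute schedules practical.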

Staffing systems, such as nursing station systems, run optimally when data is loaded very frequently (in real time). These systems match nurses’ names with the patients they’re assigned to care for and show critical patient data, including why they’re hospitalized, medications, procedures, and risks. If this information is not up to date, a nurse won’t know of important changes to the patient’s status or changes in medication or treatment.

Misconception #2: Important decisions should never be made based on EDW data

There is often hesitancy for healthcare organizations to make decisions based on EDW data. This is especially true for decisions that have large financial implications. This hesitancy comes from the idea that data extracted from its source to the EDW automatically puts the data quality into question. It is wise and appropriate to have data validation processes in place to help ensure data quality in all systems to make informed decisions from accurate data. It is unwise, however, to disregard EDW data for important organizational decisions simply because it is not the origin or source of the data. With good data validation processes in place, the EDW can be and should be providing healthcare organizations with information to help make important business, clinical, and process decisions.

The hesitancy to trust data that has been moved from the source to the EDW is frequently accompanied by a dangerous assumption: “source data is always correct.” Source data is just that: it is where the data came from, but that doesn’t mean it is accurate. Ultimately, all data that goes into a source comes from some input, process, or algorithm designed by a person, and people make mistakes. Despite the data quality challenges everyone faces, with a culture of data transparency and sound validation techniques, healthcare organizations can significantly improve their decisions using the deep and broad data sets of an EDW. (Read The Surprising Benefits of Bad Healthcare Data to learn more about creating a culture of data transparency and improvement.)

By applying the same data quality validation methods used in other systems, the EDW can be a verified source for making important and impactful organizational decisions, even ones with significant financial impact. As with any system, it is important to consider how the data is being used and to apply the appropriate rigor to validation processes. Some data is used to determine trends and needs only to be directionally correct. The validation process for this type of data would differ from that for data used to make physician bonus decisions or submit financial audit reports, which clearly require more stringent accuracy. With the implementation of suitable validation steps and the appropriate level of effort for the data and its use case, the EDW can become a crucial tool in organizational decision making.
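To illustrate matching validation rigor to the use case, here is a small sketch that reconciles an EDW total against its source total using a tolerance that depends on how the data will be used. The tolerance values, use-case names, and `validate_total` function are assumptions chosen for the example, not recommended thresholds.

```python
# Hypothetical tolerances: how far an EDW total may drift from the source total.
TOLERANCE_BY_USE = {
    "trend": 0.02,     # directionally correct is enough
    "financial": 0.0,  # audit reports and bonus decisions must match exactly
}


def validate_total(source_total, edw_total, use_case):
    """Return True when the EDW total is within the tolerance for this use case."""
    allowed = TOLERANCE_BY_USE[use_case]
    if source_total == 0:
        return edw_total == 0
    relative_diff = abs(edw_total - source_total) / abs(source_total)
    return relative_diff <= allowed


# The same 1 percent discrepancy passes for trend reporting but not for financial use.
print(validate_total(1000, 1010, "trend"))
print(validate_total(1000, 1010, "financial"))
```

The point of the design is that one check serves both purposes. Only the acceptable margin changes with the stakes of the decision.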

A mature EDW can have years of historical data and be representative of many source systems. By using EDW data to help in decision making processes, healthcare organizations will have richer data sets for providing better context and support for decisions being made. For example, decisions about investing in specific areas of care could be informed using a combination of procedure, cost, and revenue data that may be spread out in separate source systems, yet tied together in an EDW. The opportunities for analysis and improvement are nearly endless and constantly increasing as more data and data sets are added.

Healthcare organizations should embrace the idea that an EDW can help make more informed decisions. When not dismissed for fear of inaccuracy, the EDW adds context for decision making. With good data quality practices in place and validation techniques commensurate with the data being scrutinized, healthcare organizations can leverage an EDW for deeper analysis and make more informed decisions with significant positive impact.

Misconception #3: EDW infrastructure and support are low priorities

In large healthcare organizations with a host of applications and supporting systems, the EDW often ends up shortchanged in terms of supporting infrastructure. This infrastructure is the equipment (computers and servers) that comprises the EDW; its size determines the size of the EDW’s memory, storage, and network. While it’s clear that each system must be prioritized, the misconception that the EDW is at the bottom of the priority list is antiquated and potentially costly thinking.

A system’s infrastructure and support priority shows in several ways. The power and size of the machines it runs on, its backup processes and failover capabilities, and its place in the recovery plan (meaning where it falls in the order of systems to be restored in the event of a disaster) all indicate its priority relative to other systems.

No organization can avoid system prioritization (in terms of upgrades, failover, and investment) and some systems will be given far more resources than others. For a health system, a down EMR has massive implications, as do other systems related to emergency and operating rooms—these must always run. These systems will undoubtedly have resources and redundancies to account for inevitable failures.

An EDW, on the other hand, is often given a lower priority because it is not thought of as a critical system. Recovery plans for an EDW may be as simple as reloading data from sources to essentially recreate the EDW with its data. With organizations now approaching decades’ worth of data storage, it’s time to rethink the priority of an EDW and what resources it is given. Many organizations aren’t just capturing copies of data but are also taking snapshots (read-only copies) that exist only in the EDW. Without an EDW, the organization will have no other place to view these snapshots and use them to make data-driven decisions. Measuring patient satisfaction, for example, relies on historical data for metrics such as length of stay.

An EDW for large organizations can have thousands of unique users doing actionable reporting and analysis every day. If adequate resources aren’t given to the EDW in the form of powerful servers, top tier storage, sound redundancies, and efficient recovery plans to restore an EDW’s capabilities, then it could have far reaching effects on an organization’s ability to function. (Remember that good IT people never talk about “if” there are system failures; they plan for “when.”) With the ability to impact and help make daily critical decisions across an organization, the EDW has enormous value and widespread implications if it is too slow or becomes unavailable. Health systems that have previously considered their EDWs as lower priorities should re-evaluate their real value and prioritize them with sufficient infrastructure and resources.

Misconception #4: An EDW can only analyze the past

Even healthcare organizations that have embraced the EDW concept and use it daily may be missing some key capabilities. Much of the analysis and reporting done in a standard healthcare EDW focuses on looking at data from the past to make informed decisions. Now, however, with larger data sets and the growing applications of machine learning, an EDW can provide predictive insights for taking a course of action.

One way to think about machine learning and predictive analytics is to imagine someone very knowledgeable about a business. If this individual could comb through massive amounts of information over an extended time, they would eventually make logical predictions—based on past patterns—about what will likely happen in the future.

Machine learning accelerates this process with the computational power of a computer. While it’s impossible to predict the future precisely, if a machine can help predict events with 80 percent or better accuracy over the next few months or even a year, then this would yield incredible value for planning and resourcing.
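To make the accuracy figure concrete, the sketch below compares a set of predictions with what actually happened and checks the result against an 80 percent target. The data is made up for illustration, and the `accuracy` function is a plain definition (the share of predictions that matched), not the output of any particular model.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that matched what actually happened."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    if not actual:
        raise ValueError("need at least one observation")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up example: did monthly census exceed capacity (True) or not (False)?
predicted = [True, False, False, True, True, False, True, False, False, True]
actual = [True, False, True, True, True, False, True, False, False, False]

score = accuracy(predicted, actual)
meets_target = score >= 0.80
print(score, meets_target)
```

Overall accuracy is only one measure. For rare events, it usually helps to also look at how many actual events the model caught and how many alerts were false alarms.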

Looking Forward with the EDW

When Indiana University Health (IU Health) identified central line-associated bloodstream infections (CLABSI) as a top priority in its efforts to reduce healthcare-associated infections, it built a risk-prediction model. By adopting an EDW, IU Health gained a database that, when paired with advanced analytics apps, could predict a patient’s risk for CLABSI (as well as other adverse events). Early reports showed that by leveraging the EDW for risk prediction, IU Health predicted 85 to 90 percent of CLABSI incidence.

The EDW Does Best when It’s Given Appropriate Priority

When a health system dedicates the right technology and resources to its EDW, it has a tool that not only debunks the above misconceptions but also becomes one of its most important resources.

The evolved EDW provides critical functions:

  1. Organizations can load data far more often than just once a day.
  2. The EDW is an integrated source of data because it combines data from many systems—enabling better context for making good decisions.
  3. High availability makes the EDW accessible for constant and broad use across the healthcare system.
  4. The growing EDW datasets create many opportunities to apply machine learning and predictive analytics for future planning.

Health systems can add far more capabilities to their EDWs by investing more in the physical EDW infrastructure. They also need to prioritize infrastructure with reliable equipment that has failover capabilities and dedicate premier human resources to the EDW in the form of knowledgeable staff and top-tier vendor support.

The bottom line is that EDWs don’t have to be stuck in the past. A well-thought-out and well-maintained EDW with a reliable support infrastructure can be more than just a historical look at data. With the right support and technology to enable NRT reporting, the EDW can serve as a decision-making tool today and a predictive decision-making tool for tomorrow.

Additional Reading

Would you like to learn more about this topic? Here are some articles we suggest:

  1. 6 Surprising Benefits of Healthcare Data Warehouses: Getting More Than You Expected
  2. DKA Risk Prediction Tool Helps Reduce Hospitalizations
  3. Data-Driven Approach to Improving Cardiovascular Care and Operations Leads to $75M in Improvements
  4. 10 Trends in Healthcare Data Warehousing That Every Health System Needs to Know
  5. Early- or Late-binding Approaches to Healthcare Data Warehousing: Which Is Better for You?