BI Tools: 5 Reasons Why They Can’t Replace Your Healthcare Data Warehouse
At Health Catalyst, we believe an enterprise data warehouse (EDW) is the only viable solution for health systems and physician groups looking to use analytics to drive sustainable quality and cost improvements across an entire organization. It’s so integral to success that Level 1 of the Healthcare Analytics Adoption Model is implementing an EDW.
As a result, a number of business intelligence (BI) tools and visualization solutions are being marketed as cloud data warehouses that offer quick analysis and flexible visualizations in user-friendly packaging. Promises of virtual data warehouses have been around for some time and, on paper, they sound great. The premise is that these systems forgo the steps of building and loading data into an EDW by extracting data directly from the source transactional systems. Users query the BI tool, which quickly extracts the information from finance systems, EHRs, lab systems, and other sources. Then, at the last minute, the tool mashes the information together in the visualization layer to provide the answers the user was looking for.
But physical constraints – CPU, memory, and disk reads – mean that these visualization tools can’t yet live up to their promises or replace an EDW. Here’s why.
What Healthcare Business Intelligence Tools Do Very Well
The core strength of front-end BI tools is visualizing data and exposing it to end users. First and foremost, these are reporting tools that capture a snapshot of information at a particular point in time and provide access to that information via easily digestible charts, graphs, and similar visualizations. BI tools may also offer a certain level of drilldown to the data itself. Additionally, because many BI vendors offer a cloud-based option, these tools make the visualizations available – securely – whenever users want to access them.
Why You Still Need a Healthcare Enterprise Data Warehouse
However, no matter how slick or accessible a visualization is, its effectiveness is limited if it’s not based on a centralized, robust data foundation. Unlike an EDW, which sends all of the data down a single pipeline into a repository built specifically for analytics, BI tools pull the information they need directly from the sources. While this may sound efficient initially, on closer inspection it leads to a number of shortcomings.
Shortcoming #1: BI tools don’t optimize healthcare data. Anyone who has undertaken a data-warehousing project knows that aggregating data from source systems uncovers a slew of data-quality issues. In fact, optimizing data and exposing data-quality issues represents a significant chunk of the effort in the initial stages of an EDW project.
Say, for example, your query includes gestational age. Gestational age may be stored in 12 different locations within the same EHR, in a number of different formats. With an EDW, information from all of these source systems is copied directly into a repository, so you can set up an agreed-upon and consistent definition for gestational age that everyone in the organization can use. This approach also exposes problems with the data so you can isolate them and work to correct the data at its source.
BI tools don’t make it quite so simple. Your query may pull details from three separate transactional systems, each of which includes gestational age. You, therefore, have to define which gestational age you’d like to use for the purpose of this query. And now you run the risk of the definition you chose being different from another department’s choice. The end result can be similar queries that aren’t based on the same data points, which leads to reporting discrepancies.
Additionally, with data being pulled directly from multiple source systems, isolating problems with the data and then fixing them can be a challenge. The ideal is to have everyone working with the same set of definitions, regardless of query. That way all reports and analyses are made using consistent data.
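To make the idea of a single agreed-upon definition concrete, here is a minimal Python sketch of one shared function that converts gestational age to completed days. The input formats ("39w2d", "39 weeks 2 days", "39+2", "275 days") and the function name gestational_age_days are assumptions made for illustration, not formats taken from any particular EHR.

```python
import re
from typing import Optional

# Hypothetical source formats; a real EHR would need its own list.
_WEEKS_DAYS = re.compile(r"(\d+)\s*(?:w|wks?|weeks?)\s*(?:(\d)\s*(?:d|days?))?")
_PLUS = re.compile(r"(\d+)\s*\+\s*(\d)")
_DAYS_ONLY = re.compile(r"(\d+)\s*(?:d|days?)")


def gestational_age_days(raw: str) -> Optional[int]:
    """Return gestational age in completed days, or None if unparseable."""
    text = raw.strip().lower()

    match = _WEEKS_DAYS.fullmatch(text)  # "39w2d", "39 weeks 2 days", "39 weeks"
    if match:
        return int(match.group(1)) * 7 + int(match.group(2) or 0)

    match = _PLUS.fullmatch(text)  # "39+2"
    if match:
        return int(match.group(1)) * 7 + int(match.group(2))

    match = _DAYS_ONLY.fullmatch(text)  # "275 days"
    if match:
        return int(match.group(1))

    # Unparseable values are returned as None so they can be traced to the source.
    return None


# The same clinical fact, as three hypothetical source systems might store it.
source_values = ["39w2d", "39+2", "275 days"]
normalized = [gestational_age_days(value) for value in source_values]
```

Because every report calls the same function, the three values above resolve to the same number of days, and anything that returns None points to a data-quality issue that can be corrected at its source.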
Shortcoming #2: BI tools can’t handle large amounts of healthcare data. BI tools rely on other systems to store the data, so every time a BI tool wants to access data, it goes straight to the transactional system that is the source of that data. But conducting analysis in a transactional system leads to its own problems: these systems are built to store transaction details, one person at a time. When a BI tool tries to conduct analysis in that same environment, it can result in a huge drain on the system, which is felt by everyone, particularly the people on the front lines trying to use the transactional system.
Consider this: one patient encounter can generate hundreds of rows of data associated with billing systems, EHRs, and a variety of other sources. Now, say you have a query that addresses thousands of patient visits. With a BI tool, your query will be accessing millions or even billions of rows of data in various source systems. For a small, independent hospital, this may be possible. But running this type of query directly against the transactional systems in a medium-sized hospital or even a health system with two small hospitals doesn’t scale. The sheer volume of information a BI tool is sorting through makes the process inefficient. Reports are slow to generate, and, worse, people who need to input information into the transactional systems are also faced with delays and inefficiencies. Multiple queries can generate more problems: each query goes straight to the sources, which results in a web of redundant data streams – another drain on the system.
An EDW, on the other hand, is optimized for analytics. It is designed to facilitate analyses across entire populations of patients or events. Apps are created that call on the information in the data warehouse rather than going directly to the transactional system. Queries on the EDW are run against the information stored there, which eliminates redundant feeds and ensures the efficiency of transactional systems isn’t adversely affected.
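The difference in load can be sketched as a simple counting exercise. The Python below is a rough model only: the report names, the source names, and the figures of 5,000 encounters and 300 rows per encounter are illustrative assumptions, not measurements.

```python
from typing import Dict, Set


def rows_touched(encounters: int, rows_per_encounter: int) -> int:
    """Rough row count a query has to scan in the source systems."""
    return encounters * rows_per_encounter


def reads_direct(queries: Dict[str, Set[str]]) -> int:
    """Direct querying: every query hits every source it needs."""
    return sum(len(sources) for sources in queries.values())


def reads_via_warehouse(queries: Dict[str, Set[str]]) -> int:
    """Warehouse: each source is extracted once per load, however many queries run."""
    return len(set().union(*queries.values()))


# Hypothetical reports and the source systems each one depends on.
queries = {
    "readmission_report": {"ehr", "finance"},
    "cost_report": {"ehr", "finance"},
    "lab_turnaround_report": {"ehr", "lab"},
}

scanned = rows_touched(encounters=5_000, rows_per_encounter=300)
direct = reads_direct(queries)
shared = reads_via_warehouse(queries)
```

In this toy model, three reports produce six separate reads against the transactional systems when queried directly, but only three extracts (one per source) when they share a warehouse. The gap widens as more reports are added, because new reports against existing sources add no new load on the transactional systems.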
Shortcoming #3: BI tools don’t work well with healthcare data at different levels of granularity. Many BI tools cannot handle data at different levels of granularity. In data warehousing, granularity refers to the level of detail stored in a database and how that level relates to other data. For example, one database list might store patients; another list might store individual patient encounters, which is a finer grain. The lists have a one-to-many relationship (since one patient can have many encounters).
Aggregating and digesting one-to-many and many-to-many relationships in the data requires a sophisticated system, especially when millions of rows of data are involved. How well your data warehouse handles and displays granularity is very important for analysis and system performance. While some BI tools are adept at displaying data with different grains in intuitive and easy-to-use ways, other BI tools have a very difficult time, or simply can’t display data with varying levels of granularity at all.
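A small example shows why grain matters. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, patient and encounter, in a one-to-many relationship; the table names, columns, and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE encounter (
        encounter_id INTEGER PRIMARY KEY,
        patient_id   INTEGER REFERENCES patient(patient_id),
        charge       REAL
    );
    INSERT INTO patient VALUES (1, 'A'), (2, 'B');
    INSERT INTO encounter VALUES (10, 1, 100.0), (11, 1, 250.0), (12, 2, 75.0);
    """
)

join_clause = "FROM patient p JOIN encounter e ON e.patient_id = p.patient_id"

# Naive: after the join there is one row per encounter, so patient 1 is counted twice.
naive_patient_count = conn.execute(
    f"SELECT COUNT(p.patient_id) {join_clause}"
).fetchone()[0]

# Grain-aware: count each patient once, whatever the number of encounters.
true_patient_count = conn.execute(
    f"SELECT COUNT(DISTINCT p.patient_id) {join_clause}"
).fetchone()[0]

# Roll the finer grain (encounters) up to the coarser grain (patients).
per_patient = conn.execute(
    f"SELECT p.patient_id, COUNT(e.encounter_id), SUM(e.charge) {join_clause} "
    "GROUP BY p.patient_id ORDER BY p.patient_id"
).fetchall()

conn.close()
```

The naive count returns 3 even though there are only 2 patients, because the join produces one row per encounter. A tool that cannot account for the grain of each table will report the inflated figure without any warning.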
Shortcoming #4: BI tools can’t optimize healthcare data for multiple user types. How many departments and people could benefit from data if it were readily available to them? Data architects, executives, nursing managers, clinicians, the person responsible for regulatory reporting … the list goes on and on. And each person is looking for different insights from the same data.
Applying logic against the data so it is understandable at multiple levels for different audiences is something BI tools simply cannot do. BI tools focus on providing answers to specified sets of questions; an EDW can allow those questions to be customized by the audience. With a true EDW, you can build subject-specific data marts to answer specific questions – questions that originate with the audience and that are customized and standardized to meet the audience’s unique needs. In the process of creating these data marts, you engage the different audiences, which helps them understand the data – and even their own processes – better, too.
Shortcoming #5: BI tools don’t provide for modularity, understandability, and code reuse. A BI tool limits the ability to reuse code and logic. Often, in reporting and analysis, one data architect might maintain the code/logic for six months, and then another might need to learn it and further maintain it for the next six months. If the first data architect uses one preferred tool, and the second data architect uses a different preferred tool, the code will be specific to each particular tool, and significant effort may be needed to move the code to the new tool.
A well-designed data warehouse includes sets of logic that work well in stand-alone sets (for example, two or three files of code/logic that, combined, show a population with hospital-acquired pneumonia and specific metrics for that population, such as readmission rate).
When the logic and code are stand-alone (and not bound to the particular BI tool), they can be reused elsewhere in the organization more easily and at a relatively low incremental cost, without having to keep the logic in distributed desktop files. The data warehouse not only stores a central repository of data, but it also stores centralized logic. A further benefit of this approach is understandability of the code and transfer of knowledge. Data architects need to understand why the code was written a certain way to understand how to maintain a particular data mart or to reuse the logic in another data mart. Centralized logic, code, and architect comments within the code support this goal.
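As a rough illustration of stand-alone, reusable logic, the Python sketch below keeps a population definition and a metric in two small functions that are not tied to any BI tool. The Encounter fields, including the two flags and the 30-day window, are hypothetical placeholders, not a clinical definition of hospital-acquired pneumonia or of readmission.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Encounter:
    encounter_id: int
    patient_id: int
    hospital_acquired_pneumonia: bool  # hypothetical flag, assumed set upstream
    readmitted_within_30_days: bool    # hypothetical flag, assumed set upstream


def hap_population(encounters: Iterable[Encounter]) -> List[Encounter]:
    """Population definition, kept in one place so every report shares it."""
    return [e for e in encounters if e.hospital_acquired_pneumonia]


def readmission_rate(encounters: Iterable[Encounter]) -> float:
    """Metric that works on any population, not only the one above."""
    population = list(encounters)
    if not population:
        return 0.0
    readmitted = sum(e.readmitted_within_30_days for e in population)
    return readmitted / len(population)


# Invented sample data for illustration.
sample = [
    Encounter(1, 100, True, True),
    Encounter(2, 101, True, False),
    Encounter(3, 102, False, False),
    Encounter(4, 103, False, True),
]

hap_rate = readmission_rate(hap_population(sample))
overall_rate = readmission_rate(sample)
```

Because the metric is separate from the population definition, the same readmission_rate function can be pointed at a different population in another data mart, and the comments travel with the code when a new data architect takes over.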
In short, data-driven healthcare transformation requires an EDW. While BI tools remove the steps of building and loading data into a data warehouse, this shortcut comes at a very big price. BI tools cannot provide consistency or efficiency, and they cannot meet the analytics demands of today’s healthcare industry.
What limitations have you found in your BI tool or visualization solution? What do they do well?