Database vs Data Warehouse: A Comparative Review
A question I often hear out in the field is: I already have a database, so why do I need a data warehouse for healthcare analytics? What is the difference between a database vs. a data warehouse? These questions are fair ones.
For years, I’ve worked with databases in healthcare and in other industries, so I’m very familiar with the technical ins and outs of this topic. In this post, I’ll do my best to introduce these technical concepts in a way that everyone can understand.
But, before we discuss the difference, could I ask one big favor? This will only take 10 seconds. Could you click below and take a quick poll? I’d like to find out if your organization has a data warehouse, data base(s), or if you don’t know? This would really help me better understand how prevalent data warehouses really are.
Before diving in to the topic, I want to quickly highlight the importance of analytics in healthcare. If you don’t understand the importance of analytics, discussing the distinction between a database and a data warehouse won’t be relevant to you. Here it is in a nutshell. The future of healthcare depends on our ability to use the massive amounts of data now available to drive better quality at a lower cost. If you can’t perform analytics to make sense of your data, you’ll have trouble improving quality and costs, and you won’t succeed in the new healthcare environment.
The High-level Distinction Between Databases and Data Warehouses
What I will refer to as a “database” in this post is one designed to make transactional systems run efficiently. Typically, this type of database is an OLTP (online transaction processing) database. An electronic health record (EHR) system is a great example of an application that runs on an OLTP database. In fact, an OLTP database is typically constrained to a single application.
The important fact is that a transactional database doesn’t lend itself to analytics. To effectively perform analytics, you need a data warehouse. A data warehouse is a database of a different kind: an OLAP (online analytical processing) database. A data warehouse exists as a layer on top of another database or databases (usually OLTP databases). The data warehouse takes the data from all these databases and creates a layer optimized for and dedicated to analytics.
So the short answer to the question I posed above is this: A database designed to handle transactions isn’t designed to handle analytics. It isn’t structured to do analytics well. A data warehouse, on the other hand, is structured to make analytics fast and easy.
In healthcare today, there has been a lot of money and time spent on transactional systems like EHRs. The industry is now ready to pull the data out of all these systems and use it to drive quality and cost improvements. And that’s where a data warehouse comes into play.
Databases versus Data Warehouses: The Details
Now that you have the overall idea, I want to go into more detail about some of the main distinctions between a database and a data warehouse. Because I’m a visual person (and a database guy who likes rows and columns), I’ll compare and contrast the two in following table format: