The Value of Claims Versus EHR Data in Care Management and Population Health Analytics Strategies

You’ve heard the phrase “only the tip of the iceberg” often, referencing the tiny portion (roughly 10%) that floats above water. Similarly, many care management and population health analytics software solutions are based on tip-of-the-iceberg data, i.e., what is obtained from claims. But the depth and breadth of data (the submerged 90%) obtained from electronic health records (EHRs) is potentially far greater. If the adage is true that more is better, then the use of EHR data to drive disease registries, population health, and care management efforts should be extremely prevalent in most provider organizations. But in many ways, this shift toward broad use of EHR data hasn’t yet happened.

Why do so many provider organizations go for the lowest-hanging fruit in the form of claims-based analytics? How can similar, and often greater, ROI be found by more effectively leveraging the data in your EHR? How do you get started deriving the greatest possible benefit from an integrated repository of claims and EHR data?

About Claims and EHR Data

Most of us who work in healthcare IT are familiar with the typical sources of data we encounter on a regular basis. But let’s create some working definitions that will apply to how we use these terms.

Let’s refer to claims data as the structured (coded) data that a healthcare provider may transmit to, or receive from, a payer or clearinghouse, and which are intended to justify payment for services rendered on behalf of a specific patient of the provider organization. There are other relevant subsets or categories of claims data, such as pre-adjudicated, adjudicated, hospital vs. professional, and more, that we’ll omit here, but we strongly recommend that you be familiar with them.

In an increasing number of cases, healthcare providers may also be able to receive claims data from a payer (or clearinghouse) that pertain to “their” patients, but which describe services rendered at different (and often competing) healthcare providers. CMS’ Medicare Shared Savings Program is a great example of a data sharing arrangement like this, which is intended to allow healthcare providers who participate in these Accountable Care Organizations to better understand the complete picture of each patient’s utilization of healthcare services.

Let’s refer to EHR data as any and all data captured and/or stored in an electronic health record system. Generally, we are going to be focused on discrete data, such as diagnoses, allergies, encounters, lab values, flowsheets, etc., rather than textual data, like physician documentation, nursing notes, and scanned documents. The depth and breadth of potential data elements stored in an EHR can be vast (the 90% of the iceberg that’s submerged) and, if quantified, would be orders of magnitude greater than those that are transmitted via claims.

EHR Data Challenges

But if the EHR is such a rich source of data, why is it not utilized to a greater extent by providers looking to improve the delivery of care within their organization?

There are many reasons why providers don’t derive as much analytical benefit from their EHRs as they would like. And provider dissatisfaction with EHR-based reporting tools has helped to drive the product roadmaps of the most popular EHR systems toward standard reporting functionality that has been lacking.

First, providers are often challenged by some of the logistics of accessing their own data. EHR data often feels “locked away” within complex EHR data models that require analysts to spend hundreds of hours in certification programs to gain an understanding of the vendor’s data structure. When these analysts return home from their training programs, they often realize that their organization has implemented the EHR in such a way as to negate what they just learned about where certain key data elements should reside. For example, the EHR vendor may offer a standard workflow for capturing immunizations, but if the provider organization has decided not to use that functionality, the analyst will have a more difficult time finding where those important data live.

And while claims data follows a few, relatively strict, standards for both the structure and meaning of the data contained within a claim, trying to interpret EHR data can feel a little like the Wild West. There are few specific standards that relate to the way EHR data is stored within an EHR’s database, and flexible EHR implementation functionality can sometimes lead to critical data being stored in non-machine-readable formats (like text or notes). From the analytic perspective, some of this flexibility can be its own worst enemy when it comes to the value of the resulting data stored in the EHR for understanding the care of populations of patients.

Claims Data to the Rescue?

Healthcare payers and providers alike have been using claims data to power reporting solutions for decades. Claims data is intended to follow a standardized format, although adherence to the standards can be inconsistent. Claims utilize mostly discrete data, and because nearly every healthcare provider has to submit electronic claims in the same format to their payers or clearinghouses, claims provide an easy way to measure a lot of important aspects of healthcare delivery.

One of the potential downsides of a reliance on claims data is the difference in value between billed and paid claims. Billed claims don’t offer the out-of-network view that comes with paid claims data, which is so critical to managing an ACO. On the other hand, paid claims data can be difficult or impossible to obtain for claims outside of your organization. CMS, for example, requires that an organization participate in a bundled payment program or be part of an ACO in order to get access to the data. And commercial payers will sometimes share claims data with provider organizations, but generally under very specific terms and conditions.

Nonetheless, a lot of people like the claims-based approach because they already have access to the data, and because claims have many of the key data elements needed to complete some of the necessary first steps of population health analytics. For example, claims can be used to approximate the number of diabetic patients based on a coded billing diagnosis, as well as determine high-cost patients based on an analysis of the services described in the claims data.
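As a rough illustration of those first steps, here is a minimal Python sketch that flags patients with a diabetes-coded claim and totals paid amounts per patient. The record layout, field names, and the $10,000 threshold are hypothetical, and the diagnosis code prefixes (ICD-9 250.x, ICD-10 E10 and E11) are only an illustrative subset, not a complete value set.

```python
from collections import defaultdict

# Hypothetical, simplified claim records; real claims carry many more fields.
claims = [
    {"patient_id": "P1", "dx_codes": ["250.00", "401.9"], "paid_amount": 1200.0},
    {"patient_id": "P1", "dx_codes": ["250.00"], "paid_amount": 300.0},
    {"patient_id": "P2", "dx_codes": ["401.9"], "paid_amount": 45000.0},
    {"patient_id": "P3", "dx_codes": ["E11.9"], "paid_amount": 800.0},
]

# Illustrative prefixes only (ICD-9 250.x, ICD-10 E10/E11); not a full value set.
DIABETES_PREFIXES = ("250", "E10", "E11")


def diabetic_patients(claims):
    """Return patient IDs with at least one diabetes-coded billing diagnosis."""
    return {
        c["patient_id"]
        for c in claims
        if any(code.startswith(DIABETES_PREFIXES) for code in c["dx_codes"])
    }


def high_cost_patients(claims, threshold):
    """Return {patient_id: total_paid} for patients at or above the threshold."""
    totals = defaultdict(float)
    for c in claims:
        totals[c["patient_id"]] += c["paid_amount"]
    return {pid: total for pid, total in totals.items() if total >= threshold}


registry = diabetic_patients(claims)
costly = high_cost_patients(claims, threshold=10_000)
print(sorted(registry))
print(costly)
```

Note that this only approximates the diabetic population: a billing diagnosis on a claim is not the same thing as a confirmed clinical diagnosis in the chart.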

But claims data also has a key advantage over EHR data from the perspective of completeness: it can be used to more accurately understand the total cost of care of a population, and to identify patients who aren’t getting a lot of their care from your facilities. In an ACO environment, you will get a claim from CMS or maybe a commercial payer, and it will show you what happened to the patient in many cases, wherever they went for treatment. Your EHR is installed within only one provider health system, but those patients may be seeing doctors all over the country or region because they have needs they want treated elsewhere. There are lots of reasons why patients go different places, and with claims data being shared through some type of at-risk payment model, you can see what type of financial impact that patient has on the larger community.
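To make that idea concrete, the sketch below estimates, for each patient, the share of paid dollars that went to providers outside your organization. It assumes shared paid claims with a provider identifier on each record; the field names and the `OUR_PROVIDER_IDS` set are hypothetical.

```python
from collections import defaultdict

# Hypothetical identifiers for facilities that belong to our organization.
OUR_PROVIDER_IDS = {"ORG-A"}

# Hypothetical paid claims shared by a payer under an at-risk arrangement.
paid_claims = [
    {"patient_id": "P1", "provider_id": "ORG-A", "paid_amount": 500.0},
    {"patient_id": "P1", "provider_id": "ORG-B", "paid_amount": 1500.0},
    {"patient_id": "P2", "provider_id": "ORG-A", "paid_amount": 900.0},
]


def outside_share(paid_claims, our_ids):
    """Return {patient_id: fraction of paid dollars spent outside our_ids}."""
    total = defaultdict(float)
    outside = defaultdict(float)
    for c in paid_claims:
        total[c["patient_id"]] += c["paid_amount"]
        if c["provider_id"] not in our_ids:
            outside[c["patient_id"]] += c["paid_amount"]
    # Skip patients with no paid dollars to avoid dividing by zero.
    return {pid: outside[pid] / total[pid] for pid in total if total[pid] > 0}


shares = outside_share(paid_claims, OUR_PROVIDER_IDS)
print(shares)
```

A patient with a high outside share is one your EHR alone would badly under-represent, which is exactly the gap that shared paid claims help close.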

Claims Data Doesn’t Tell the Whole Story

As helpful as claims data can be, it has some significant shortcomings that prevent it from being the ideal source of data for population-based reporting and analytics. Probably the most significant downside to basing your reporting solution on claims data is that claims lack the clinical detail needed to calculate many standard measures of care quality. For example, the numerator criteria for quality measures recommended by organizations like AHRQ and NQF generally look something like this (courtesy of MN Community Measurement, “Optimal Diabetes Care 2015”):

The number of diabetes patients who met ALL of the following targets:

  • The most recent HbA1C in the measurement period has a value of <8.0
  • The most recent blood pressure in the measurement period has a systolic value of <140 and a diastolic value of <90 (both values must be less than)
  • Patient is currently a non-tobacco user
  • If the patient has a co-morbidity of Ischemic Vascular Disease, the patient is on daily aspirin OR an accepted contraindication (any date). Diagnosis of Ischemic Vascular Disease; ICD-9 diagnosis codes include: 410.00 to 410.92, 411.0 to 411.89, 412, 413.0 to 413.9, 414.00 to 414.07, 414.2, 414.3, 414.8, 414.9, 429.2, 433.00 to 433.91, 434.00 to 434.91, 440.0, 440.1, 440.20 to 440.29, 440.30 to 440.32, 440.4, 440.8, 440.9, 444.01 to 444.9, 445.01 to 445.89.

With the exception of the ICD codes specified to determine the co-morbidity of Ischemic Vascular Disease, none of the other criteria can be found on a claim. Most EHR implementations do contain these data elements (A1C values, BP, tobacco status, and prescribed medications), and many organizations with the reporting and analytic infrastructure in place can calculate and submit quality measures like this without relying on manual chart abstraction.
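The numerator criteria listed above can be sketched as a simple check over discrete EHR data. This is a minimal illustration, not the official measure logic: the `DiabetesPatient` fields are hypothetical, each value is assumed to already be the most recent one within the measurement period, and missing values, exclusions, and the denominator definition are not handled.

```python
from dataclasses import dataclass


@dataclass
class DiabetesPatient:
    """Hypothetical per-patient summary extracted from discrete EHR data."""
    patient_id: str
    latest_hba1c: float            # most recent HbA1c in the measurement period
    latest_systolic: int           # most recent blood pressure, systolic
    latest_diastolic: int          # most recent blood pressure, diastolic
    tobacco_user: bool
    has_ivd: bool                  # co-morbidity of Ischemic Vascular Disease
    on_daily_aspirin: bool
    aspirin_contraindicated: bool


def meets_all_targets(p):
    """Return True only if the patient meets every numerator target."""
    # The aspirin target applies only to patients with IVD.
    aspirin_ok = (not p.has_ivd) or p.on_daily_aspirin or p.aspirin_contraindicated
    return (
        p.latest_hba1c < 8.0
        and p.latest_systolic < 140
        and p.latest_diastolic < 90
        and not p.tobacco_user
        and aspirin_ok
    )


def numerator_count(patients):
    """Count the patients who meet all targets."""
    return sum(1 for p in patients if meets_all_targets(p))


patients = [
    DiabetesPatient("P1", 7.2, 128, 82, False, True, True, False),
    DiabetesPatient("P2", 7.2, 140, 82, False, False, False, False),
    DiabetesPatient("P3", 7.9, 130, 85, False, True, False, False),
]
print(numerator_count(patients))
```

Only the `has_ivd` flag could plausibly be derived from claims; every other field in this sketch depends on data that lives in the EHR.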

Another issue with claims-based reporting solutions relates to the typical latency of data refreshes. Most providers report refresh intervals of 30, 60, or even 90 days – far too long to affect care delivery at the patient level. Contrast this with EHR data, which is generally available in the EDW between 12 and 24 hours after it is saved to the EHR database.

To summarize, claims data can tell you who your patients are and roughly what happened to them, but in many cases it can’t tell you how they are doing. It’s not a good indicator of health, or of whether the things you are doing are leading to better or worse outcomes.

The Best of Both Worlds

One of the best practices for deploying a robust healthcare reporting and analytic infrastructure is to ensure that your solution can handle both claims and EHR data. To accomplish this, the recommended architecture will likely consist of an enterprise data warehouse (EDW) that you can use to simplify the reporting needs for common analytic use cases across the provider organization. You also need good data governance processes, with engaged data stewards, to help ensure the right tool (claims vs. EHR data) is being used for the right job.

Providers need a solution for working with both sources of data. One of the hardest things to do is to take those two sources of data and use them together. If you invest a lot of time in a solution for one or the other, and that solution has no way of bringing both together strategically, then you’re going to duplicate efforts and set yourself up for a scenario where two leaders approach a senior leader with different numbers from different systems. We recommend a platform approach where you bring all systems (claims, EHR, patient experience) together quickly and use that platform for individual analyses, such as a claims-only analysis or a combined claims and EHR analysis, to look across the continuum of care.

But this is harder than people think. Some parts of the organization want to move really fast based on the data they have, but they find themselves needing to support a lot of siloed systems separated primarily by the type of data or the part of the organization that manages the data. If the claims group doesn’t consider the organization’s other needs for EHR data alongside claims data, it might make decisions that carry a higher cost of ownership.
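As a small illustration of what a combined view looks like, the sketch below joins per-patient cost from claims with a clinical value from the EHR. It assumes both sources already share a reliable patient identifier, which in practice usually requires patient matching work first; all names and values are hypothetical.

```python
# Hypothetical per-patient summaries, one from each source.
claims_cost = {"P1": 2000.0, "P2": 900.0, "P4": 12000.0}  # total paid, from claims
ehr_hba1c = {"P1": 9.1, "P2": 6.8, "P3": 7.4}             # latest HbA1c, from the EHR


def combine(claims_cost, ehr_hba1c):
    """Build one row per patient seen in either source, flagging coverage gaps."""
    rows = []
    for pid in sorted(claims_cost.keys() | ehr_hba1c.keys()):
        rows.append({
            "patient_id": pid,
            "total_paid": claims_cost.get(pid),     # None if no claims on file
            "latest_hba1c": ehr_hba1c.get(pid),     # None if no EHR result
            "in_claims": pid in claims_cost,
            "in_ehr": pid in ehr_hba1c,
        })
    return rows


for row in combine(claims_cost, ehr_hba1c):
    print(row)
```

Keeping the patients that appear in only one source, rather than dropping them, is a deliberate choice here: those rows are where the two systems would otherwise produce different numbers for the same question.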
