This article is based on a 2019 Healthcare Analytics Summit presentation by Brent James, MD, MStat, titled “Designing Effective Clinical Measurement: Recognizing and Correcting Common Problems.”
Dr. W. Edwards Deming, the father of quality improvement theory, famously noted that “the aim defines the system.” This is especially true for clinical measurement systems. Despite the proliferation of quality improvement programs that providers and hospitals must navigate, clinical quality measurement does not always translate to outcomes improvement. When designing a clinical measurement system, there are only two potential aims: measurement for selection or measurement for improvement. Understanding the difference between these two aims, as well as the connection between clinical measurement and improvement, is crucial to designing an effective clinical measurement system.
While delivering high-quality clinical care to patients is a common goal among healthcare professionals and health systems alike, these two approaches to quality improvement (measurement for selection vs. measurement for improvement) are fundamentally different and produce highly uneven results.
Measurement for selection is a judgement-based system that typically generates comparative data showing how providers, departments, or hospitals rank within a healthcare system. Data associated with measurement for selection is often captured through unfunded data mandates, such as national CMS ranking measures, or any of the more than 150 groups that evaluate healthcare delivery in judgement-based systems. A measurement for selection system has an internal aim to motivate (or shame) care providers into providing better care. Additionally, a measurement for selection system assumes:
The reality is that none of these assumptions are true. Most clinical measures can’t accurately rank performance due to fundamental problems with the underlying science, assessment methods, and data identification and extraction. For example, David Eddy, a physician, mathematician, and healthcare analyst, was helping the National Committee for Quality Assurance (NCQA) use prenatal care as a measure of clinical competence when evaluating care delivery systems. The outcomes statistic measured in this evaluation was perinatal mortality.
Evaluating and ranking practices on the basis of perinatal mortality makes the implicit assumption that the factors that determine perinatal mortality are equal among these practices when, in fact, Eddy found that 75 percent of the determining factors were unknown. These underlying problems result in clinical rankings with very wide confidence intervals, meaning that the data is consistent with a wide range of possible true rankings.
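To make the point about wide confidence intervals concrete, here is a minimal sketch that computes an approximate 95 percent Wilson score interval for a mortality rate. The practice names and counts are invented for illustration and do not come from the presentation or from Eddy's analysis; the choice of the Wilson interval is also an assumption, since the source does not name a method.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Approximate Wilson score interval for the proportion events / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts (deaths, births), invented for illustration only.
practices = {"Practice A": (4, 1000), "Practice B": (8, 1000)}
intervals = {name: wilson_interval(e, n) for name, (e, n) in practices.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: {low * 1000:.1f} to {high * 1000:.1f} per 1,000 births")

overlap = intervals_overlap(intervals["Practice A"], intervals["Practice B"])
print("Intervals overlap:", overlap)
```

With these invented counts, Practice B's observed rate is double Practice A's, yet the two intervals overlap, so the data alone would not support ranking one practice above the other.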
There are roughly 150 different groups that produce hospital rankings; many of them use the same nationally available data sets, yet they report radically different results. According to Paul Keckley, then at Deloitte, 1,428 of the 4,655 acute care hospitals in the U.S. (roughly 30 percent) were rated a “Top 100” (top two percent) hospital by at least one external ranking group in 2013. The bottom line is that clinical ranking data is often unreliable.
Knowing this, hospital systems performing comparative analysis to help guide improvements in care delivery need to ask if they’re making the same mistakes as national ranking systems. Research shows that comparative outcomes do motivate people, but not in the way healthcare systems want or need. Deming noted that when people are pressured to meet an external target, they can work to improve the system, suboptimize the system by working harder, or game the data. As pressure increases, reliance on suboptimizing the system and gaming the data grows disproportionately compared with genuine system improvement. Increased pressure also produces goal displacement: instead of working to produce excellent patient outcomes, care providers start focusing on ranking highly on external profiles. Recent examples like the Department of Veterans Affairs (VA) waiting list scandal and the Wells Fargo credit card scandal show that it’s almost always easier to improve rankings by manipulating the measurement system than by improving performance.
The fundamental difference between measurement for selection and measurement for improvement is that the latter focuses on process rather than on person-level performance. A measurement for improvement system also uses internal operational data, integrates data capture into care providers’ workflow, and has an internal aim of making it easy to perform correctly, rather than motivating care providers.
This system generates very different data sets than measurement for selection. These data sets are evidence-based, optimized for process management and improvement, and more clinically focused. The primary aim of measurement for improvement is creating a transparent system that’s embedded into clinical workflows in ways that do not damage clinical productivity. The key to making this system effective is tracking the right data elements and organizing data for improvement.
If the primary aim of the clinical measurement system is to improve healthcare delivery, then data systems must be designed around frontline work. Organizational structure and data flow must follow the core work processes, allowing data and reporting to roll up from the patient level to the care team, from individual departments to hospitals, and finally to healthcare systems and health plans. If hospital systems can do this effectively, they end up with more accurate, complete, and timely data for ranking than if they explicitly pursue ranking data.
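The roll-up described above can be pictured with a small aggregation sketch: the same patient-level records feed every reporting level, so team, department, and hospital reports always reconcile. The record fields (`hospital`, `department`, `team`, `met_protocol`) and all values are hypothetical, chosen only to illustrate the idea; a real system would draw these from its own workflow data.

```python
from collections import defaultdict

# Hypothetical patient-level records; every field name and value is invented.
records = [
    {"hospital": "H1", "department": "Cardiology", "team": "Team 1", "met_protocol": True},
    {"hospital": "H1", "department": "Cardiology", "team": "Team 1", "met_protocol": False},
    {"hospital": "H1", "department": "Cardiology", "team": "Team 2", "met_protocol": True},
    {"hospital": "H1", "department": "Obstetrics", "team": "Team 3", "met_protocol": True},
    {"hospital": "H1", "department": "Obstetrics", "team": "Team 3", "met_protocol": True},
    {"hospital": "H2", "department": "Cardiology", "team": "Team 4", "met_protocol": False},
]

# Reporting hierarchy, from the broadest level to the narrowest.
LEVELS = ("hospital", "department", "team")


def roll_up(rows, depth):
    """Aggregate patient-level rows by the first `depth` fields in LEVELS."""
    counts = defaultdict(lambda: [0, 0])  # key -> [met, total]
    for row in rows:
        key = tuple(row[level] for level in LEVELS[:depth])
        counts[key][0] += int(row["met_protocol"])
        counts[key][1] += 1
    return {
        key: {"met": met, "total": total, "rate": met / total}
        for key, (met, total) in counts.items()
    }


by_team = roll_up(records, 3)
by_department = roll_up(records, 2)
by_hospital = roll_up(records, 1)

for key, summary in by_department.items():
    print(" / ".join(key), summary)
```

Because every level is computed from the same patient-level rows, the totals at each level sum to the same number of patients, which is the property that makes the higher-level reports trustworthy.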
Researchers often believe that obtaining good data depends on asking good questions. The same is true for care delivery performance: a focus on delivering healthcare outcomes generates the right data by identifying useful and necessary data and then embedding and capturing that data within the care workflow.
There are three different methods healthcare systems use to identify what data to track:
The best method for identifying data for clinical measurement is using structured expert opinion. To do this, health systems should follow this seven-step process:
Gauge theory, which underlies quality improvement, posits that measured performance is always a combination of actual performance plus the measurement system. A measurement evaluation system can tell healthcare systems how much noise is in the measurement system versus what’s actually happening at the front line of care delivery. When health systems encounter outliers in their data system, they need to be able to determine if the outlier arose from the measurement system (the gauge) or from the care process being measured. When the outlier is the result of the data system, the organization must fix the data system through continuous improvement. This translates into a robust, accurate, and complete data system over time.
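The idea that measured performance combines actual performance with the measurement system can be sketched as a simple variance split. This example assumes that each case is measured more than once by independent raters (for instance, two chart abstractors) and that measurement error is additive and independent of the true value. It uses a basic one-way random-effects estimate, which is one common way to separate the two sources and not necessarily the method used in the presentation. All case names and values are invented.

```python
import statistics

# Hypothetical repeat measurements: each case scored twice, independently.
repeated = {
    "case_1": [10.0, 12.0],
    "case_2": [20.0, 18.0],
    "case_3": [15.0, 15.0],
    "case_4": [30.0, 34.0],
}


def decompose_variance(measurements):
    """Split observed variation into measurement noise and process variation.

    Assumes measured = actual + independent measurement error, with the
    same number of repeat measurements for every case.
    """
    sizes = {len(values) for values in measurements.values()}
    if len(sizes) != 1:
        raise ValueError("every case needs the same number of repeat measurements")
    k = sizes.pop()
    if k < 2 or len(measurements) < 2:
        raise ValueError("need at least two cases and two measurements per case")

    # Disagreement between repeat measurements of the same case is gauge noise.
    within = [statistics.variance(values) for values in measurements.values()]
    measurement_var = statistics.mean(within)

    # Variation between case means still contains measurement_var / k of noise.
    case_means = [statistics.mean(values) for values in measurements.values()]
    process_var = max(0.0, statistics.variance(case_means) - measurement_var / k)

    total = measurement_var + process_var
    noise_share = measurement_var / total if total > 0 else 0.0
    return {"measurement": measurement_var, "process": process_var, "noise_share": noise_share}


result = decompose_variance(repeated)
print(result)
```

A large `noise_share` would suggest that apparent outliers are more likely to come from the gauge than from the care process, pointing to the data system as the first thing to fix.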
As healthcare organizations strive to provide better care for patients, it’s essential they have an effective clinical measurement system with which to monitor their progress. In designing and refining this system, organizations need to be careful not to fall into the common pitfalls of clinical measurement. A system that aims to measure outcomes for improvement, rather than for selection, is the key to effective measurement. Focusing on process, rather than person-level comparative data, is crucial to tracking the right data elements and organizing data for improvement. Given the complexity of clinical care, any outcomes measurement strategy should always include a mechanism for data system validation and a feedback mechanism for the continuous improvement of the data system itself. An effective clinical measurement system and strategy is necessary for truly improving healthcare outcomes and care delivery.