Data Mining: A Data Quality Approach

Summary

Health Information Exchanges (HIEs) have always been about the data. The very purpose of these organizations is to provide the data that supports better health care. Knowing as much as possible about that data brings clear benefits for daily operations, data governance, and business development.

The Cures Act Adds an Additional Data Governance Layer

The 21st Century Cures Act’s Information Blocking Rule is changing the narrative around health data: it aims to increase data sharing for better patient access and to foster innovation and competition in the industry. For health care organizations, it is an opportunity to review the information sharing policies they developed to comply with HIPAA and other regulations, and potentially to change how data is shared with other providers, health IT developers, and information networks. Accordingly, an easy place to start data quality efforts is by aligning with the latest data and interoperability standards, such as using USCDI as a data quality framework per the requirements of the Information Blocking Rule.

Given the nature of an HIE, it can be quite challenging for the organization to understand its own data, to ensure that its own compliance objectives are met, and to support its providers’ adherence. Data mining can help: it offers a deeper understanding of the data, not only to align with new requirements but also to increase the likelihood that the data is used to drive better health care.

Data Mining: Continuous Process, Enhanced Results for Evolving HIEs

The data stores of HIEs are vast and should be commended for their comprehensiveness. Yet because the data is so immense, a more automated approach is needed to understand, monitor, and improve an HIE’s data set. Here at KPI Ninja, we prefer to make this as easy as possible for HIEs. The key components of a data mining strategy are:

Start by profiling the data. Data profiling is a data assessment: the process of analyzing the data to better understand its structure, completeness, and quality. With the Information Blocking Rule’s requirements and USCDI as a baseline, the quality of the data can be measured at different levels to establish its current state.
Create more usable data by parsing the files. Now equipped with an understanding of the data set, HIEs can focus on improving the quality of the data for better use and application. One strategy is to use parsing technology. While there can be some value in exchanging files as is, most of the benefit of data exchange comes from parsing the files into discrete data elements. For example, we parse PDF versions of C-CDAs into discrete data elements so that the data can support analytics.
Transform the data into industry standards. Transforming data into industry standard formats, such as ICD-10, RxNorm, and SNOMED, can enhance data quality. To give one example, one organization may use local procedure codes while most other organizations use standard CPT codes. Without standardization, there is no way to unite this data. Applying standardization across the data ensures that each data element is addressed individually, which strengthens the data set as a whole.
Apply data cleansing strategies. In a similar vein of using technology to create more usable data, cleansing strategies take the data to the next level. This is a diligent process of understanding an organization’s workflows, the data feed, and how the data is represented, and then building the related table mappings and bindings. It can be as straightforward as mapping all “M” values to “Male”, or as complex as normalizing weights that are recorded in lbs, kgs, or oz across data sources.
Create data relationships for record matching. As HIEs serve as a community’s data network, it’s perhaps not surprising to see the need for robust data matching. A technology partner like KPI Ninja can apply a variety of attribution models to link patients to their providers, and can refine the relationships between data fields and elements to optimize data use. For example, we can connect laboratory data from HL7v2 data segments and CCD sections to create a more unified understanding of patients’ test results.
Apply sophisticated data enrichment. While some form of enrichment has already been applied in the previous steps of standardizing, cleansing, and matching, it’s in this step that we apply more sophisticated data enrichment strategies like Natural Language Processing or Machine Learning. We understand that not all organizations have the time or resources to deploy this type of technology, and that’s where we come in. We understand the obstacles HIEs are up against, and we seek to do the heavy lifting on the technical end to deliver the tools needed to enrich the data and support more data-driven health care.
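
To make a few of these steps more concrete, here is a minimal Python sketch of profiling, cleansing, and standardization on a handful of records. It is not KPI Ninja’s implementation. The field names, the sample records, the lookup tables, and the local code "LOC-CBC" are all assumptions made up for illustration, and a real pipeline would draw its mappings from maintained terminology services rather than hard-coded dictionaries.

```python
# Illustrative sketch only: field names, sample records, and lookup tables
# are hypothetical and not taken from any real HIE feed.

REQUIRED_FIELDS = ["patient_id", "sex", "weight", "weight_unit", "procedure_code"]

# Hypothetical cleansing tables.
SEX_MAP = {"M": "Male", "MALE": "Male", "F": "Female", "FEMALE": "Female"}
UNIT_ALIASES = {"kg": "kg", "kgs": "kg", "lb": "lb", "lbs": "lb", "oz": "oz"}
KG_PER_UNIT = {"kg": 1.0, "lb": 0.45359237, "oz": 0.028349523125}

# Hypothetical crosswalk from one source's local procedure codes to CPT.
LOCAL_TO_CPT = {"LOC-CBC": "85025"}


def profile_completeness(records, fields=REQUIRED_FIELDS):
    """Profiling: fraction of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in fields
    }


def cleanse(record):
    """Cleansing: normalize the sex label and express weight in kilograms."""
    out = dict(record)
    raw_sex = str(record.get("sex") or "").strip().upper()
    out["sex"] = SEX_MAP.get(raw_sex, "Unknown")

    raw_unit = str(record.get("weight_unit") or "").strip().lower()
    unit = UNIT_ALIASES.get(raw_unit)
    weight = record.get("weight")
    if unit is not None and weight not in (None, ""):
        out["weight_kg"] = round(float(weight) * KG_PER_UNIT[unit], 2)
    else:
        out["weight_kg"] = None  # leave unknown rather than guess
    return out


def standardize_procedure(record):
    """Standardization: map local procedure codes to CPT where a mapping exists."""
    out = dict(record)
    code = record.get("procedure_code")
    if record.get("code_system") == "CPT":
        out["cpt_code"] = code
    else:
        out["cpt_code"] = LOCAL_TO_CPT.get(code)  # None flags an unmapped code
    return out


records = [
    {"patient_id": "p1", "sex": "M", "weight": 150, "weight_unit": "lbs",
     "procedure_code": "LOC-CBC", "code_system": "LOCAL"},
    {"patient_id": "p2", "sex": "female", "weight": 70, "weight_unit": "kg",
     "procedure_code": "85025", "code_system": "CPT"},
    {"patient_id": "p3", "sex": "", "weight": None, "weight_unit": None,
     "procedure_code": "LOC-XYZ", "code_system": "LOCAL"},
]

report = profile_completeness(records)
cleaned = [standardize_procedure(cleanse(r)) for r in records]

for field, share in report.items():
    print(f"{field}: {share:.0%} populated")
for row in cleaned:
    print(row["patient_id"], row["sex"], row["weight_kg"], row["cpt_code"])
```

In this sketch, unmapped codes and unparseable weights are kept as `None` rather than dropped, so the gaps stay visible the next time the data is profiled. That fits the idea that data mining is a continuous process rather than a one-time cleanup.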

Data mining is a continuous process, and there is plenty that can be done in a data quality strategy to advance the positioning of HIEs as population health enablers. I hope this blog gave you a small glimpse of what is possible when data quality efforts, like data mining, are aligned with clinicians’ data use cases. Check out the full recording of the presentation that we gave at SHIEC’s 2021 Annual Conference to see how we deployed this exact strategy with one of our HIE clients.