Patient Identification and Matching—An Essential Element of Using an Enterprise Data Warehouse to Manage Population Health


“Solving the patient-to-patient and provider-to-provider matching problem is one of the top accomplishments of the EDW initiative. The main reason we implemented the EDW in the first place was to better connect disparate sources to enable analytics. This matching, all on a single platform, makes that possible.”

– Bill Alberghene
Data Architecture Manager
Enterprise Data Warehouse
Partners HealthCare

In a healthcare industry transitioning to value-based reimbursement and population health management (PHM), matching patients accurately to their care events across multiple sites of care and sources of information is becoming ever more important [1]. Similarly, as providers find the need to interact with more insurance carriers and provide care in multiple institutional settings, tracking utilization by provider becomes equally problematic. Being able to accurately track utilization of services for a particular patient, patient population, or provider is fundamental to the strategies underlying effective population health management. Unknown costs cannot be managed, and effectiveness of interventions and treatments cannot be determined if they can’t be seen in the context of other services rendered.

Not surprisingly, matching records to the correct individual is more complicated when patients receive care in multiple settings and when organizations and providers use different systems to share records electronically [2]. Even within a single healthcare organization, the wide range of systems for clinical and administrative services can create obstacles to matching records.

In keeping with its commitment to continuous improvement, Partners HealthCare set out to improve the accuracy of its patient-to-patient and provider-to-provider record matching. A large, non-profit healthcare system founded by Brigham and Women’s Hospital and Massachusetts General Hospital, Partners includes community and specialty hospitals, a managed care organization, a physician network, community health centers, home care, and more.


With so many entities under the Partners umbrella, the health system had trouble tracking resource utilization because patients and providers carried different identifiers across, and even within, its many internal data sources. It was also wrestling with the need to identify utilization outside the organization using claims data. To get an accurate picture of the benefits and costs of population health management, Partners needed to understand total resource utilization by patient and by provider. It began looking for a way to improve patient-to-patient and provider-to-provider identification and matching that could be implemented quickly and deliver near-term improvements in matching rates.

As part of its overall analytics strategy, Partners developed a single repository of clinical, operational, financial, and claims data. This repository, an enterprise data warehouse (EDW) platform, aggregates data from different source systems to create a consistent view of data collected across the system. Partners also saw the need to launch an advanced population health management strategy and to implement an analytic structure to support it. An accurate patient-matching foundation was critical to this overall strategy. Accurate patient-to-patient and provider-to-provider matching was essential for relating data within the EDW and for matching claims data with clinical data. It was also important for matching claims data from multiple carriers that are associated with the same patient or provider.

An example of the complexity of developing a patient-matching solution can be found in attempting to refine the reporting for risk-contracted populations. Partners’ payer data warehouse included patients with a Partners primary care provider and only those payers with which the health system had a risk-based contract. At the same time, the health system’s billing data warehouse contained information on utilization regardless of payer (for example, it might include information regarding an out-of-state patient ED visit). Without a method to link these warehouses using a common patient identifier, Partners couldn’t accurately identify and link patient encounters.

Without algorithms that could sufficiently handle varying identifiers, identifying and matching records across Partners’ source systems and existing data warehouses required a great deal of individual expertise and effort. The process was also heavily manual, which consumed staff time and introduced the possibility of human error. This approach was neither scalable nor sustainable.


It was clear that Partners needed to develop an automated solution to meet its patient and provider identification and matching challenges. Partners decided that it needed a single EDW identification (EDW ID) for each patient and a single EDW ID for each provider, usable across the EDW and in its analytics applications. It was essential that users not be required to know a specific combination of identification numbers to make their systems work and tie information together. A common identifier would increase the adoption rate of analytics tools and insulate users from the specific identifiers used by each data set.

In support of its overall clinical transformation strategy, Partners started down the path of building a robust, data-rich analytic environment based on four components: an EDW, user-friendly analytic tools, services and education, and a strong governance structure. Each component had an element that was dependent on, or impacted by, the accurate matching of records to build a reliable source of information. As Partners designed and implemented its EDW with integrated views of clinical, financial, operational, and claims data, the team established a common patient/provider identifier across the EDW to accurately correlate data from these various sources. Algorithms were developed using logic from the multiple identifiers in use across the disparate systems so that accurate matching occurs behind the scenes, and without the need for manual intervention.

This solution required a complex mapping scheme and process to accurately identify and match patient to patient and provider to provider across systems. The patient master data flow map is shown in Figure 1. A similar data flow map was developed for provider matching.


Figure 1: Patient master data flow
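The idea of a common identifier assigned behind the scenes can be illustrated with a minimal sketch. It assumes a simple deterministic rule: two source records belong to the same person when they share any (system, identifier) pair, and links are followed transitively. The case study does not describe Partners' actual algorithms, so the record layout, the function name `assign_edw_ids`, and the sample identifiers below are all hypothetical.

```python
def assign_edw_ids(records):
    """Group records that share any identifier; give each group one EDW ID.

    records: list of dicts such as
        {"record": "billing:1", "ids": {("hospital_a_mrn", "A100")}}
    Returns a dict mapping record name -> integer EDW ID.
    (Hypothetical layout, for illustration only.)
    """
    parent = list(range(len(records)))  # union-find forest over record indexes

    def find(i):
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lowest index as root so numbering is stable.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # identifier -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for ident in rec["ids"]:
            if ident in first_seen:
                union(idx, first_seen[ident])
            else:
                first_seen[ident] = idx

    root_to_edw = {}
    result = {}
    for idx, rec in enumerate(records):
        root = find(idx)
        if root not in root_to_edw:
            root_to_edw[root] = len(root_to_edw) + 1
        result[rec["record"]] = root_to_edw[root]
    return result


# Made-up sample data: one person seen in billing and under two payers,
# plus an unrelated billing record.
records = [
    {"record": "billing:1", "ids": {("hospital_a_mrn", "A100")}},
    {"record": "claims:7", "ids": {("payer_x_member", "X55"), ("hospital_a_mrn", "A100")}},
    {"record": "claims:9", "ids": {("payer_y_member", "Y77"), ("payer_x_member", "X55")}},
    {"record": "billing:2", "ids": {("hospital_a_mrn", "A200")}},
]
edw = assign_edw_ids(records)
print(edw)
```

In this sample, the first three records end up under one EDW ID even though `billing:1` and `claims:9` share no identifier directly; they are linked through `claims:7`. Users downstream would then query by the EDW ID alone.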

The single identifier paved the way for analytics tools that are easy to use and that link applicable information to give accurate and reliable answers to queries. Partners proactively communicated the common identifier solution to create awareness of the value of correlated data, and educated staff not to rely on familiar manual processes. Accurate identification and matching increased trust in the data among a wide range of users. A strong governance structure reinforced that trust and established confidence in the data matching, helping to support data-driven decisions.


Developed an effective patient and provider identification and mapping solution for more than 10.5 million patients

Partners considers patient and provider matching to be a top accomplishment of the EDW initiative. Accurate matching allowed Partners to connect more than 10.5 million patients across sources and facilities and enabled the use of advanced analytics. Accountable Care Organization (ACO)/shared risk and population health analytics across source systems is supporting better care coordination and care management. For example, in one of Partners’ advanced analytics applications, staff can identify a group of patients that merit further investigation in order to ensure better management of care bundles. With the new patient-matching capabilities, staff can pull data from other sources to better understand the utilization patterns of this targeted group of patients.

The new EDW matching identifier also supports better management of patients as they move from one payer to another. Historically, Partners’ data systems would treat a patient as two different patients if he or she changed health insurers (for example, switched from Blue Cross/Blue Shield to the Medicare ACO). Partners is in the process of revising reporting using the EDW patient ID so that the team can capture all of the patient’s utilization regardless of migration from one payer to another. This functionality will bring in additional history on a patient that was not previously available. Because utilization of emergency department and inpatient services is often a key predictor of the likelihood of readmission, having this additional information will enable Partners to identify more patients who meet the criteria for interventions like care management.
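The effect of payer migration on utilization counts can be sketched as follows. This is an illustration under stated assumptions, not Partners' reporting logic: the crosswalk from payer member IDs to EDW patient IDs, the payer names, the claim layout, and the helper `acute_events` are all hypothetical.

```python
from collections import Counter

# Hypothetical crosswalk: (payer, member ID) -> EDW patient ID.
crosswalk = {
    ("commercial_plan", "M-001"): "EDW-1",
    ("medicare_aco", "Z-900"): "EDW-1",  # same person after changing payers
    ("commercial_plan", "M-002"): "EDW-2",
}

# Made-up claim lines for illustration.
claims = [
    {"payer": "commercial_plan", "member_id": "M-001", "service": "ED"},
    {"payer": "commercial_plan", "member_id": "M-001", "service": "inpatient"},
    {"payer": "medicare_aco", "member_id": "Z-900", "service": "ED"},
    {"payer": "commercial_plan", "member_id": "M-002", "service": "office"},
]


def acute_events(claims, key_fn):
    """Count ED and inpatient events per key returned by key_fn."""
    counts = Counter()
    for claim in claims:
        if claim["service"] in {"ED", "inpatient"}:
            counts[key_fn(claim)] += 1
    return counts


# Keyed by payer member ID, the person's history is split across two "patients".
by_member = acute_events(claims, lambda c: (c["payer"], c["member_id"]))

# Keyed by EDW patient ID, the full history appears under one patient.
by_patient = acute_events(
    claims, lambda c: crosswalk[(c["payer"], c["member_id"])]
)

print(by_member)
print(by_patient)
```

Keyed by member ID, the sample patient shows two acute events under one payer and one under the other; keyed by the EDW ID, all three fall under a single patient, which is the kind of fuller history that could change whether someone meets a care management threshold.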

Realized up to 20 percent improvement in patient matching accuracy

Partners has achieved a much better patient-to-patient match rate today than in the past, especially for sicker patients who have more encounters. In fact, Partners estimates a 10-20 percent improvement in the accuracy of patient identification and matching. Not only has accuracy improved, but valuable staff spend less time performing manual matches—which frees them to apply their expertise to tasks that directly improve quality and cost.

Achieved 96-99 percent high-risk patient matching rate

For high-risk patients, Partners achieved patient identification and matching rates as high as 96-99 percent. This accuracy was important because these patients are part of the care management program and represent a population that must be managed closely.

Enabled EDW integration of high-risk flag in support of care management

A very high match success rate has allowed Partners to confidently add flags in its care management systems and EHRs to identify patients based on criteria and data from many information systems. These flags make it easy to identify high-risk patients, which helps the care team effectively manage care and improve outcomes, especially for patients with complex, chronic diseases.


Partners will refine the patient-to-patient and provider-to-provider identification and matching algorithms to continuously improve identification and matching rates in the future. The team will also further expand the utilization of the common identifiers underlying patient matching to improve integration and insight across the advanced analytic applications.


  1. American Health Information Management Association. (2009). Managing the integrity of patient identity in health information exchange.
  2. (2014). Patient identification and matching: Final report.


Health Catalyst is a mission-driven data warehousing and analytics company that helps healthcare organizations of all sizes perform the clinical, financial, and operational reporting and analysis needed for population health and accountable care. Our proven enterprise data warehouse (EDW) and analytics platform helps improve quality, add efficiency and lower costs in support of more than 50 million patients for organizations ranging from the largest US health system to forward-thinking physician practices.

