How to Integrate an EHR into a Healthcare Enterprise Data Warehouse in Just 77 Days

Article Summary

Integrating EHR data into a healthcare enterprise data warehouse (EDW) can take years, depending on the EDW platform and data model. Crystal Run — a physician-owned medical group in New York with more than 300 physicians in 40 medical specialties — couldn’t wait that long.
They need a solution that could integrate their EHR data in a matter of months, not years. Using a late-binding model, Crystal Run was able to integrate their EHR data in just 77 days, with easy-to-use tools for data acquisition and storage and metadata management. 


Four Crystal Run Physicians

“Health Catalyst’s tools make the process of moving EHR system data into an EDW so much more efficient than I’ve seen in the past. I’ve been involved many times in adding data sources to an EDW, and it can be a challenge — particularly in a traditional data model. Catalyst’s tools and its late-binding data model introduce a level of flexibility and scalability that has really made our process run smoothly.”

 – Lou Cervone, Director of Business Intelligence

Featured Outcomes
  • Deployed scalable, flexible EDW platform
  • Developed EHR system data mart within 77 days
  • Integrated clinical data into Essentials Applications

Crystal Run — a physician-owned medical group in New York with more than 300 physicians in 40 medical specialties — has a vision of improving performance by delivering data-driven insight to its clinicians. In fact, Crystal Run’s success as an accountable care organization (ACO) participating in the Medicare Shared Savings Program will hinge on its ability to integrate clinical, financial and operations data and get it in the hands of personnel on the frontlines of care who make decisions that affect quality and cost.


Crystal Run’s journey to create their technology infrastructure began over a decade ago when they installed an electronic health record (EHR) system across the enterprise. Although the EHR improved processes and documentation, Crystal Run quickly came to the realization that this transactional system was not architected to perform the sophisticated analytics the organization needed to deliver cost-effective and quality patient care. Furthermore, it was only one source of collected data. The EHR could not integrate disparate data from a wide variety of sources, including billing, financial and patient satisfaction systems, all of which are required to deliver optimal patient care.

The medical group’s leaders determined that to perform analytics successfully they needed a healthcare enterprise data warehouse (EDW). An EDW would integrate data from the EHR system as well as other sources and make it available for analytics purposes. They decided to build their own EDW, and for several years this system met the group’s analytics needs.

But as regulatory and reporting requirements matured — particularly after the organization became an ACO — Crystal Run found that this homegrown EDW was too difficult to maintain and couldn’t keep up with user demand for data, insight and reports. Crystal Run decided, instead, to transition to an EDW platform solution from Health Catalyst.

The group practice’s decision to buy rather than build a new EDW platform is outlined in another paper, “To Build or To Buy a Healthcare Enterprise Data Warehouse? Why Buying Makes Sense.” This story details how the Health Catalyst solution enabled Crystal Run to effectively integrate their EHR system’s data into an EDW within 77 days.


A basic function of any EDW is moving data from source systems, such as an EHR, and integrating it into the EDW architecture — a process known as extract, transform and load (ETL). This step, although foundational, is easier said than done. Having maintained its own EDW for almost a decade, Crystal Run was acutely aware of the complexities of bringing source system data into an EDW.

The Business Intelligence (BI) team that built and maintained Crystal Run’s homegrown EDW relied on mostly manual processes to bring data into their EDW. Data loads required hand-customized SQL scripting. Moreover, the team was unable to bring transactional files incrementally into the EDW. Each update required a resource-intensive full data load.

The amount of manual effort continued beyond the ETL process to the reporting process itself. Each time a clinician requested even a slight modification of an existing report, the team had to create a new or modify an existing stored procedure to obtain and massage the data in the format they needed and then finally create the front-end report. Over many years of changes and updates to code — without structure and standards in place — the maintenance cost became burdensome. New or updated information requests were also time consuming. The clinicians were also not able to refine the report to accommodate any subsequent information needs due to the static nature of the reports.

Barraged with expanding ACO reporting requirements and information requests from clinicians, the BI team simply didn’t have the bandwidth to continue with these manual processes. They needed a scalable solution that could more easily add and update data sources, and they needed self-serve analytics functionality that would enable clinicians to interact with the data and run reports themselves.


One of the reasons that Crystal Run selected the Health Catalyst EDW platform was Health Catalyst’s deep healthcare data warehousing expertise. Crystal Run knew that Health Catalyst was managed by a group of healthcare veterans who had spent decades developing data warehousing and quality improvement models that had proved successful in real-world implementations. One characteristic that particularly appealed to Crystal Run was Health Catalyst’s proven record of integrating data from several of the top EHR vendors into its EDW platform. The medical group’s leaders were confident that Health Catalyst could quickly integrate data from their EHR and resolve their ETL challenges.

Another important factor in Crystal Run’s selection of Health Catalyst was its unique late-binding data model. The late-binding approach gives Crystal Run maximum flexibility for using data to tackle a variety of use cases as the need arises. This means that rather than having to develop the entire data model up front before knowing what all the use cases for the data would be, Crystal Run could bind the data late in the process — just in time to solve an actual clinical or business problem.

A key tenet of the late-binding model is to move all source system data first into “source marts,” which are data marts for each source system. Health Catalyst’s architecture performs minimal transformation of the data in this process — just enough to link it to  a data bus of core, stable data elements like patient ID, provider ID, encounter date and time, facility ID, etc. The source marts then serve as a foundation for analytic data. Crystal Run can combine data from multiple source marts into other data marts to perform sophisticated analytics on the effectiveness of its clinical care and operations.

Crystal Run’s first priority was creating a source mart for its EHR data. Health Catalyst delivered automated processes and easy-to-use tools for data acquisition and storage and metadata management. These processes and tools solved Crystal Run’s problem of integrating EHR data into the EDW manually.

Data acquisition and storage

Health Catalyst streamlined and simplified Crystal Run’s process for bringing data into the EDW. Unlike the labor-intensive, manual processes Crystal Run had relied on, Health Catalyst’s data acquisition and storage processes were largely automated. One of the most important factors in simplifying the process was Health Catalyst’s Source Mart Designer tool. This tool significantly automates the process of integrating data from source systems into the EDW’s source marts. The application:

  • Analyzes the structure of data sources and recommends data types and column names that conform to the EDW’s technical standards
  • Captures the mapping of source system tables and columns to source mart tables and columns as metadata, which maintains continuity and traceability between the source system and the destination source mart
  • Converts data types from the native source system type to the native types of the EDW
  • Suggests descriptive column names that can be easily modified, enabling the clinicians and the BI team to easily understand the source of the data when performing analytics, as shown in Figure 1
Sample visualization of Source Mart Designer column descriptors
Figure 1: Sample Source Mart Designer column descriptors
  • Enables the team to select whether the data load is full or incremental. The BI team can select if the data load is full or incremental based on the volume of data and speed requirements. If they choose an incremental load, they can
  • identify which fields they want to validate (as shown in Figure 2 in the row/validation section in the right column).
Sample of Source Mart Designer ETL
Figure 2: Sample Source Mart Designer ETL

Not only has Source Mart Designer enabled Crystal Run to efficiently create a new source mart, but it will also enable them to maintain well- structured, consistent data fields within the EDW — regardless of how many staff members are involved in the process of moving the data.

Metadata management and metadata engine

Health Catalyst’s solution also delivered tools to facilitate and simplify the metadata management tasks of Crystal Run’s BI team, as illustrated in Figure 3. Many enterprise data models — including that of Crystal Run’s homegrown EDW — contain no metadata. This lack makes it difficult, if not impossible, to know where data in the EDW originated in the source system. Such insight is essential for effectively managing an EDW, and querying the data. The metadata engine also enables analytics to be performed more quickly by providing clinicians with descriptive names.

Sample of Source Mart Designer query
Figure 3: Sample Source Mart Designer query

Health Catalyst’s web-based tool for metadata management maintains continuity and traceability between the source system and the destination source mart. As such, it delivered critical functionality for Crystal Run to easily develop and maintain source marts in its new EDW. The metadata management tool stores, searches and displays metadata, including data mart, table and column descriptions for both the source systems and source marts. It also manages the metadata for the mapping of tables and columns from source systems to source marts in the EDW.

In addition, Health Catalyst delivered a Source Mart Engine that leveraged the extensive metadata in the metadata management system to automate and parallelize the nightly ETL processes that move data from the originating source systems into source marts. This technology maximizes the throughput of the nightly ETL process and the efficiency of the underlying hardware and system while also minimizing the amount of human interaction and intervention required on the part of the Crystal Run BI team.


Deployed scalable, flexible EDW platform

Health Catalyst and Crystal Run successfully deployed a scalable, adaptive EDW to replace the homegrown EDW solution. The new EDW’s late-binding architecture gives Crystal Run maximum flexibility for using data to tackle a wide variety of use cases as the need arises. Rather than having to establish an enterprise-wide data model up front before knowing what all the use cases for the data will be, Crystal Run can bind the data late in the process to solve actual clinical or business problems as they arise. The EDW adapts to rapidly changing vocabularies, standards and new healthcare analytics use cases. Importantly, Crystal Run’s new EDW doesn’t take long to implement and therefore delivers a rapid time-to-value.

The new platform also delivers self-service analytics functionality to Crystal Run clinicians. Rather than contacting the BI team for a report, clinicians can access robust data visualizations through the platform that enable them to drill down into and filter data based on their specific information needs.

Developed EHR system data mart within 77 days

Health Catalyst worked hand-in-hand with Crystal Run’s BI team to examine which of the EHR’s 7,000 tables needed to be included in the clinical data source mart. These tables included clinical, financial, lab, medication and accounting data from the EHR system. They then began the process of using Health Catalyst’s data acquisition and metadata tools to move this data into the EHR source mart. The team completed integration of this data into the EHR source mart in just 77 days — a process that could have taken several years using a manual process or a traditional, early-binding data architecture that requires modeling all data relationships up front.

Integrated clinical data into Essentials Applications

With EHR system data integrated into the EDW architecture, Crystal Run’s analysts are now able to view the data in Essentials Applications. These applications enable the team to perform foundational measurements to kick-start their improvement initiatives. These measurements include such things as exploring patient populations, creating cohorts and identifying which clinical programs may be performing better and which have room for improvement in cost or quality. The applications also provide data on which clinical areas and problems Crystal Run should address first for the most potential benefit.


The Source Mart Designer enables Crystal Run’s BI team to rapidly design, develop and deploy unlimited source marts. Using this tool, the team plans to integrate other data sources — such as patient satisfaction, billing and costing — into the EDW.


Health Catalyst is a mission-driven data warehousing and analytics company that helps healthcare organizations of all sizes perform the clinical, financial, and operational reporting and analysis needed for population health and accountable care. Our proven enterprise data warehouse (EDW) and analytics platform helps improve quality, add efficiency and lower costs in support of more than 30 million patients for organizations ranging from the largest US health system to forward-thinking physician practices. Faster and more agile than data warehouses from other industries, the Health Catalyst Late- Binding™ EDW has been heralded by KLAS as a “newer and more effective way to approach EDW.

For more information, visit, and follow us on TwitterLinkedIn and Facebook.

Read More About This Topic

  1. What Is the Best Healthcare Data Warehouse Model?
    Steve Barlow, Co-founder of Health Catalyst and Senior Vice President
  2. Additional Healthcare Customer Stories
How to Reduce Unnecessary Elective Deliveries: A Powerful Case Study

This website stores data such as cookies to enable essential site functionality, as well as marketing, personalization, and analytics. By remaining on this website you indicate your consent. For more information please visit our Privacy Policy.