Demystifying Healthcare Data Governance: An Executive Report
Collectively, the CMO, CNO, and CMIO represent the data origination and data analytics needs of the clinicians in the organization, particularly as it relates to the EMR. The CFO tends to own the financial and supply chain data and sometimes human resources. If the organization is an academic medical center, it is very important to include a chief clinical research officer in the data governance committee to represent the researchers’ data management and analytics needs.
Data Governance Committee Failure Modes
Data governance committees typically have four failure modes:
- Wandering
- Technical Overkill
- Politics
- Red Tape
Failure Mode: Wandering
Data governance committees that wander usually do so because they lack something tangible to govern, and they lack the experience to recognize their wandering. These committees have heard about data governance and have seen other organizations engaged in data governance but have a difficult time translating that into tangible value in their organization. As a consequence, the data governance committee meetings become a waste of time, and over time, the members stop participating. Eventually, the data governance committee ceases to exist, either formally or functionally. This is why the data governance committee should also serve as the steering committee for the EDW and analytics. Doing so provides the steering committee with a tangible sense of purpose, reducing the risk of wandering aimlessly. By initially focusing the mission of the data governance committee on the success of the EDW, the members of the committee will grow their data management and awareness skills, which will, in turn, enable the committee to grow its scope of influence beyond the EDW, over time.
Failure Mode: Technical Overkill
Technical overkill is common. Typically, in these scenarios, a well-intended and overly passionate CIO, who is inexperienced with data management, organizes and chairs the data governance committee. CIOs in this situation will often gear the agenda to very granular data management topics, such as data types, master data management, and naming conventions in the data warehouse. Unfortunately, these agendas do not play to the strengths and passions of the other executives and C-levels on the data governance committee. If allowed to continue with these agendas, members of the data governance committee will slowly start to pull away from participation, leaving the CIO as the lone attendee. Data governance committee meetings will get canceled because of “higher priorities,” and the data governance committee will slowly wilt away. To avoid this scenario, the agendas for the data governance committee should stay at the strategic level, and subcommittees should be formed to deal with the details mentioned above. For example, the committee should work on establishing an internal communication, marketing, and implementation plan for becoming a data-literate culture; knocking down barriers to data access; endorsing and supporting projects to improve data quality; assessing the organization’s progress in fully exploiting data for quality improvement and cost reduction; resolving high-level conflicts and priorities related to data analytics; and supporting the development of budgets and strategic plans that further the mission of becoming a data-driven organization.
Failure Mode: Politics
Nothing is more politically or culturally disturbing than the journey toward becoming a data-driven organization. As I frequently say, “Data takes courage.” The spotlight of data can be very uncomfortable for those organizations with a culture that has been operating by anecdote and the laurels of reputation. As a result, political infighting is very common in the early stages of becoming a data-driven culture because the spotlight of data reveals awkward truths.
One of the most common manifestations of political infighting appears in passive-aggressive participation in the data governance process. Members pretend to be data-driven and selfless during committee meetings, but when they leave the committee and go back into their departments and domains, they operate in territorial and defensive ways. Rather than advocating for greater access to data and data sharing from their areas of responsibility, they instead operate with mistrust and a sense of data hoarding. In the past, this sort of behavior was, if not openly tolerated, then at least allowed to occur. However, in today’s healthcare market, and even more so in the future, this sort of defensive and territorial behavior around data will not survive, nor will the employment of executives who behave this way. The healthcare industry is changing too quickly and too dramatically for anything less than a completely data-driven organization led by executives who are comfortable in that data-driven world.
Organizations should be prepared for this natural occurrence, not let it distract them from their long-term goals, and move as quickly and as transparently as possible to eliminate it. Fortunately, as the organization becomes more data-driven, political infighting becomes less common across the organization because the subjective and endless debates of the past are replaced by objective and precise discussions supported by data.
Failure Mode: Red Tape
In the bureaucratic red tape failure mode, which is commonly associated with authoritarian forms of data governance, committee members behave like bureaucrats of the data rather than governors and stewards who try to maximize the data’s value to the organization. They construct process barriers to data access, rather than knock them down. They centralize the approval and decision-making process for authorizing access to the data warehouse, rather than delegate it to data stewards. These committees spend too much time worrying about the improper use of data and analytics, and then impose burdensome processes to ease their worries. Instead, they should spend most of their time talking about the enormous potential benefits of data to the organization and developing processes to unlock that potential. Finally, they confuse the mission of the data governance committee with that of the data security committee; the data governance committee should always advocate for greater data access while the data security committee tends to push for less.
Data Governance and Data Security
Data governance and data security must be balanced. In healthy cultures, the data governance committee is constantly pulling for broader data access and more data transparency, while the data security committee constantly pulls for narrower data access and more data protection; there is a healthy tension. It is important to create a sense of risk tolerance when it comes to the utilization of data in an organization. Every penny spent on misplaced and unnecessary data security risk avoidance is a penny that could be spent on analytics to directly benefit patient care and the financial health of the organization. Protecting patient privacy is, and always will be, an enormously important part of the data governance committee agenda. But beyond that, the data governance committee should be a very risk-tolerant advocate for maximizing access to, and exploitation of, the organization’s data.
Ideally, to help balance this tension between the data security committee and the data governance committee, there should be overlapping membership between the two, thus forcing the members to balance the missions of both.
Tools for Data Governance
In order to achieve the Triple Aim of Data Governance, the data governance committee needs reports that expose the quality of the data that they are governing. As mentioned previously, Data Quality = Data Validity × Data Completeness. Data validity can be measured by joining the data that is collected in the primary source systems, such as the EMR and registration systems, to master reference data in the EDW, then identifying the gaps and mismatches between the source system and the master reference data. Data stewards use these reports in their efforts to close the gaps in data quality for the systems under their responsibility. Data completeness can be measured by counting the null values in a data set. If data is not being collected in areas where it should be, this, too, can be addressed by the data stewards associated with the origins of that data.
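The data quality formula above can be illustrated with a minimal Python sketch. It is not taken from the report: the function names, the sample gender codes, and the reference set are hypothetical, and it assumes that validity is measured only over the values that were actually collected, while completeness is the share of values that are not null or empty.

```python
def _filled(values):
    """Return the values that were actually collected (not None or empty)."""
    return [v for v in values if v is not None and v != ""]


def completeness(values):
    """Fraction of values that are not null. Returns 0.0 for an empty data set."""
    if not values:
        return 0.0
    return len(_filled(values)) / len(values)


def validity(values, reference):
    """Fraction of collected values that match the master reference data.

    Assumption: null values count against completeness, not validity.
    """
    filled = _filled(values)
    if not filled:
        return 0.0
    return sum(1 for v in filled if v in reference) / len(filled)


def mismatches(values, reference):
    """Distinct collected values with no match in the master reference data."""
    return sorted({v for v in _filled(values) if v not in reference})


def data_quality(values, reference):
    """Data Quality = Data Validity x Data Completeness."""
    return validity(values, reference) * completeness(values)


# Hypothetical gender codes from a registration system, and the
# hypothetical master reference codes held in the EDW.
source_codes = ["F", "M", None, "X", "F", "", "M", "U"]
master_codes = {"F", "M", "U"}

print("completeness:", completeness(source_codes))
print("validity:", validity(source_codes, master_codes))
print("quality:", data_quality(source_codes, master_codes))
print("mismatches for the data steward:", mismatches(source_codes, master_codes))
```

A report built this way gives the data steward two separate levers: the mismatch list points at invalid codes to correct in the source system, and the completeness figure points at fields that are not being filled in.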
As the executive sponsors of the EDW, the data governance committee will also need reports for understanding how the data warehouse is being used, from both an audit perspective and a utilization perspective—i.e., is this valuable and expensive data warehouse being used in a highly effective and meaningful way? These reports take on the flavor of a customer relationship management system, helping the data governance committee understand which data is being used most often in the EDW, who is using the EDW, during what time of the day and day of the week, and the volume of data being delivered to analysts. Audit reports for tracking access to patient identifiable data should also be provided and reviewed on a regular basis by the data governance committee.
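As a rough sketch of the utilization and audit reports described above, the following Python example summarizes a small access log. The log format, field names, and sample entries are all hypothetical; a real EDW would expose this information through its own audit tables or logging facilities.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access log entries; field names are assumptions for illustration.
ACCESS_LOG = [
    {"user": "analyst_a", "table": "encounters", "rows": 1200,
     "at": "2024-03-04T09:15:00", "patient_identifiable": True},
    {"user": "analyst_b", "table": "encounters", "rows": 300,
     "at": "2024-03-04T09:40:00", "patient_identifiable": False},
    {"user": "analyst_a", "table": "supply_costs", "rows": 50,
     "at": "2024-03-05T14:05:00", "patient_identifiable": False},
]


def utilization_report(log):
    """Summarize which data is used, by whom, when, and in what volume."""
    by_table = Counter()
    by_user = Counter()
    rows_by_user = Counter()
    by_weekday_hour = Counter()
    for entry in log:
        ts = datetime.fromisoformat(entry["at"])
        by_table[entry["table"]] += 1
        by_user[entry["user"]] += 1
        rows_by_user[entry["user"]] += entry["rows"]
        # weekday() returns 0 for Monday through 6 for Sunday.
        by_weekday_hour[(ts.weekday(), ts.hour)] += 1
    return {
        "by_table": by_table,
        "by_user": by_user,
        "rows_by_user": rows_by_user,
        "by_weekday_hour": by_weekday_hour,
    }


def patient_identifiable_audit(log):
    """Return the entries that touched patient identifiable data."""
    return [entry for entry in log if entry["patient_identifiable"]]


report = utilization_report(ACCESS_LOG)
print("most used tables:", report["by_table"].most_common())
print("rows delivered per user:", dict(report["rows_by_user"]))
print("entries for audit review:", len(patient_identifiable_audit(ACCESS_LOG)))
```

Keeping the audit view separate from the utilization view mirrors the two perspectives in the text: one answers whether the EDW is being used well, the other answers who accessed patient identifiable data.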
White-space Data Management
There is always data that an organization needs for analytic use cases that is not being captured in the primary transaction systems. The data governance committee will need tools to address this issue, which I call “white-space” data management. White-space data is the data that exists between the lines of the data normally captured in the primary source systems, or data that is captured in a way that is not easily computable, such as text in a clinical note. Quite often, this white-space data is manually abstracted and manually integrated on desktop computers using Excel and Access. White-space data management tools replace the need for these desktop spreadsheets and databases by providing an easy-to-use data entry tool that is tightly coupled with the EDW. The white-space data entered through this tool is then naturally integrated with other analytically valuable data in the data warehouse. It is also more secure than Excel spreadsheets or Access databases.
Finally, one of the most important tools for data governance is a metadata repository. The metadata repository serves as the “Yellow Pages” for the EDW, providing data analysts and members of the data governance committee with a tool for browsing the various types of data in the EDW and seeing the attributes of that data. This includes information such as how far back in history the data has been collected, the number of records in the data set, any known data quality problems, and the data stewards who can be contacted for more information. A metadata repository is critical to the democratization and full utilization of the data in an EDW.
Misguided metadata strategies place too much emphasis on the objective, computable metadata that can be collected automatically and too little emphasis on the more important subjective metadata that can only be provided by human beings who have been collecting and managing data for a number of years and understand its nuances in ways that a computerized tool cannot. The best metadata repositories contain information that is a combination of human-generated content and computer-generated content in a 50/50 split. The human-generated content should be collected and curated in a wiki-style contribution model. The computer-generated metadata should be collected through the database management system and ETL (extract, transform, and load) tools.
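One way to picture a repository entry that combines both kinds of metadata is the Python sketch below. It is an illustration under stated assumptions, not a description of any particular product: the class, its field names, and the sample data set are hypothetical. The computer-generated fields stand in for what a database management system or ETL tool could supply, and the human-generated fields stand in for wiki-style contributions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class MetadataEntry:
    """One hypothetical 'Yellow Pages' entry for a data set in the EDW."""
    dataset: str
    # Computer-generated metadata (assumed to come from the DBMS or ETL tools).
    record_count: int
    earliest_record: date
    # Human-generated metadata (assumed to be contributed wiki-style).
    stewards: List[str] = field(default_factory=list)
    known_quality_issues: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def add_note(self, author: str, text: str) -> None:
        """Record a contributor's observation about the data set."""
        self.notes.append(f"{author}: {text}")


def search(entries: List[MetadataEntry], keyword: str) -> List[str]:
    """Return names of data sets whose name, issues, or notes mention the keyword."""
    needle = keyword.lower()
    hits = []
    for entry in entries:
        haystack = [entry.dataset] + entry.known_quality_issues + entry.notes
        if any(needle in text.lower() for text in haystack):
            hits.append(entry.dataset)
    return hits


# Hypothetical entry for illustration only.
encounters = MetadataEntry(
    dataset="encounters",
    record_count=2_500_000,
    earliest_record=date(2009, 1, 1),
    stewards=["registration data steward"],
    known_quality_issues=["discharge disposition often blank before 2012"],
)
encounters.add_note("analyst_a", "Use admit date, not registration date, for length of stay.")

print(search([encounters], "discharge"))
print(search([encounters], "length of stay"))
```

The search covers the human-generated fields deliberately, since those notes are where the nuances described above would live.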
Healthcare Analytics Adoption Model
Modeled after the HIMSS Analytics EMR Adoption Model, the Healthcare Analytics Adoption Model provides a framework for evaluating an organization’s adoption of analytics. It also provides a roadmap for developing analytics strategies, both for vendors and for internal use by healthcare delivery organizations.
The progressive development of an analytics strategy in healthcare starts tribal-like at Level 0, where little or no data governance exists in the organization, and proceeds all the way up to Level 8, at which the data governance committee operates in a very deliberate and formal manner, driving the strategic acquisition and exploitation of data in the organization (e.g., genomics data initiatives).
Progression of Data Governance in the Model
Several patterns emerge as organizations progress through the analytics adoption model. First, data content expands with each progressive level as new sources of data are added to the EDW. Second, the timeliness of data refresh and data-driven decision-making increases, as the organization becomes more agile and comfortable operating as a data-driven culture. Third, the data literacy of the organization increases progressively, just as might be expected of students in the progressive completion of a college curriculum. Finally, the complexity of the analytics employed by the organization increases, and the mission of the data governance committee expands from the governance of data to the governance of algorithms and rules about binding their data.
There are six distinct phases that the data governance committee will pass through, as the organization progresses through the Analytics Adoption Model.
Six Progressive Phases of Data Governance
Phase 1 for the data governance committee is relatively simple, but critically important, as it sets the stage and foundation for all other phases and progression. In this phase, the executives on the data governance committee are setting the tone for becoming a data-driven organization. In staff meetings, in emails, in their decision-making behavior, in their priorities of projects, they constantly reinforce the importance of using data to make better decisions, faster. They communicate to their staff that all employees who play a part in data collection and origination are responsible for data quality—i.e., the collection of valid and complete data that they capture in the course of their duties.