The 3 Challenges of Translational and Clinical Research Data Management and a Strategy to Succeed
Editor’s Note: This is the first article in a series that looks at the problems clinical researchers face when dealing with healthcare data and explores ways to help solve those issues. In the next post in this series, we will share our vision for solving some of these complex problems.
Clinical research is a critical part of creating a learning health system that routinely incorporates the latest treatment guidelines into its clinical care. The efficacy of new treatments and guidelines is studied by researchers who recommend better, more personalized treatment protocols. Increasingly, researchers have been tasked not only with identifying new interventions to create better clinical outcomes, but also with partnering with healthcare delivery systems to implement those new discoveries. This is a fundamental aspect of translational research.
However, researchers are facing problems with their clinical research data management. These issues include accessing clinical data, the inability to efficiently use time and resources when dealing with that data, and translating that research data into everyday clinical practice.
Challenge One: Access to Clinical Data
One critical problem researchers face is an inability to access the data they need. The main barriers to access include technology barriers, regulatory barriers, and organizational barriers.
Researchers need access to rich data stores, but healthcare data is housed in a wide variety of siloed systems, which means there is no single source of truth researchers can rely on. This situation can make answering even the simplest research question a big challenge. For example, while creating a grant proposal, a researcher may need a straightforward count of patients with COPD in the health system. Unfortunately, in many health systems there is no straightforward way to answer that basic question. Instead, it requires a series of complex queries to pull data from a variety of sources, followed by significant time cobbling together reports. This is simply because the health system’s data isn’t aggregated and easy to access.
Furthermore, health researchers frequently require access to data beyond the health system. In order to gain enough insight into populations, especially those with rarer conditions, researchers need access to broader data sets composed of contributions from multiple health systems.
Once researchers receive funding, they want to get started digging into the data. But regulatory barriers can slow them down. These regulatory barriers (chiefly HIPAA and the institutional review board [IRB] approval that research use of patient data typically requires) are necessary and surmountable, but they add a layer of process and complexity to research.
For example, suppose a researcher creates an IRB application to study subjects with COPD. The researcher is interested in both diagnosed cases of COPD and potentially undiagnosed cases. The IRB application will require the researcher to outline the inclusion and exclusion criteria for the study. The criteria will include patients with a diagnosis of COPD as well as patients exhibiting potential signs of COPD that may qualify them for inclusion (chronic cough and wheezing, for example). Writing the IRB application with overly specific criteria may require the researcher to submit a revision if the criteria change during the study.
This also happens when specific diagnosis codes are used in the application. Often, specific codes do not leave enough flexibility for the researcher to iterate on different codes or combinations of codes and other criteria, which is often required to determine a true clinical cohort. This can cause delays in the study and extra administrative burden.
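To make the iteration problem concrete, here is a minimal sketch in Python of a cohort definition that keeps the code lists and symptom terms as parameters rather than hard-coding them. Everything in it is a hypothetical illustration: the record layout, the `CohortCriteria` class, the example ICD-10 prefix `J44`, and the symptom terms are assumptions made for this example, not criteria from an actual study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortCriteria:
    """Hypothetical inclusion criteria, kept as data so they are easy to revise."""
    diagnosis_prefixes: frozenset = frozenset()
    symptom_terms: frozenset = frozenset()


def in_cohort(patient: dict, criteria: CohortCriteria) -> bool:
    """Include a patient with a matching diagnosis code or a matching symptom."""
    has_diagnosis = any(
        code.startswith(prefix)
        for code in patient.get("diagnosis_codes", [])
        for prefix in criteria.diagnosis_prefixes
    )
    has_symptom = bool(criteria.symptom_terms & set(patient.get("symptoms", [])))
    return has_diagnosis or has_symptom


def select_cohort(patients, criteria):
    """Return the sorted patient IDs that satisfy the criteria."""
    return sorted(p["patient_id"] for p in patients if in_cohort(p, criteria))


# Made-up records for illustration only.
PATIENTS = [
    {"patient_id": "A", "diagnosis_codes": ["J44.9"], "symptoms": []},
    {"patient_id": "B", "diagnosis_codes": [], "symptoms": ["chronic cough"]},
    {"patient_id": "C", "diagnosis_codes": ["I10"], "symptoms": ["wheezing"]},
    {"patient_id": "D", "diagnosis_codes": ["E11.9"], "symptoms": []},
]

# Two candidate definitions of the same clinical cohort.
diagnosed_only = CohortCriteria(diagnosis_prefixes=frozenset({"J44"}))
diagnosed_or_symptomatic = CohortCriteria(
    diagnosis_prefixes=frozenset({"J44"}),
    symptom_terms=frozenset({"chronic cough", "wheezing"}),
)

print(select_cohort(PATIENTS, diagnosed_only))            # ['A']
print(select_cohort(PATIENTS, diagnosed_or_symptomatic))  # ['A', 'B', 'C']
```

Swapping one criteria object for another changes the cohort without touching the selection logic, which is the kind of flexibility that a narrowly worded application can rule out. Whether an application may describe criteria this loosely is a decision for the reviewing IRB.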
Maintaining the privacy and security of health data is absolutely essential, so health systems are rightly protective of their data, especially if it must leave their four walls and go to a researcher at an associated academic institution. Whenever data leaves the medical center for research purposes, the medical center must ensure the transfer is secure. It must also verify that the data request is consistent with federal and local regulations.
Using the COPD example from above, the researcher may request information from the health system on patients with chronic coughs, wheezing, or a COPD diagnosis. The researcher will often be asked to show the approved IRB paperwork (with the data request) to demonstrate approval to use this specific data. Usually, a health system will want to review the data request next to the IRB paperwork to ensure that the researcher’s request is within the approved data specification. In many cases, health systems may not have people and processes in place to approve and facilitate appropriate use of data by researchers. In these cases, an appropriate level of data protection can turn into overprotection, and researchers are unable to work efficiently. In the end, patients lose because new discoveries of better care are delayed, along with opportunities to participate in new forms of treatment.
Challenge Two: Inefficient Use of Time and Resources
Lack of appropriate infrastructure can lead to major inefficiencies of both time and resources when researchers attempt to access or use healthcare data. These inefficiencies come in the form of inefficient study recruitment, data cobbling, and materials waste.
Inefficient Study Recruitment
A quick glance at clinicaltrials.gov (“a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world”) shows how many research studies fail due to the inability to recruit enough participants. Recruitment is difficult because finding the right patients requires identifying who they are, where they are, and who their treating physician is (treating physicians act as important gatekeepers in the process). A good recruitment program needs data from multiple systems: EMRs and administrative data for cohort identification and treating physician information, and hospital admission, discharge, and transfer (ADT) data and ambulatory scheduling data to locate patients. When data from these systems are not reliably available and linked together, recruitment efforts can be long, arduous processes with poor results.
Data Cobbling
Data cobbling happens when researchers must piece together disparate data sources manually using Excel, Access, or other consumer database software programs. It is wasteful because the process is often time-consuming and error-prone.
Researchers often must rely on Excel spreadsheets and Access databases to enter study-specific data that may not be found in the medical record. The problem with collecting data using these tools is that the process creates yet another data silo. These silos are essentially shadow systems that can quickly get out of sync with the dynamic medical data in source systems.
Here is one example of the complications that can arise from this situation. During my tenure at an academic medical center, I was presented with a breast cancer database, a beautifully curated Excel spreadsheet filled with tumor marker data abstracted from pathology reports. The researchers asked me to upload their data into the data warehouse I managed so that we could run an outcomes study of tumor markers. We very quickly encountered a significant problem. The spreadsheet used medical record numbers (MRNs) as the main identifier, but many of those MRNs had changed in the years since the researchers had copied data into their spreadsheet. We were only able to match 65 percent of their patients to ones in the data warehouse. In the long run, we were able to make the matches, but it took a huge amount of manual effort.
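One way to reduce this kind of manual matching is to keep a crosswalk from retired MRNs to current ones and to resolve every study identifier through it before joining. The Python sketch below is an assumption about how such a lookup could work, not a description of the warehouse in the story above. The `MRN_HISTORY` mapping, the `resolve_mrn` and `match_to_warehouse` helpers, and all identifiers are hypothetical.

```python
def resolve_mrn(mrn: str, history: dict) -> str:
    """Follow old -> new MRN links until reaching an MRN with no successor."""
    seen = set()
    while mrn in history:
        if mrn in seen:  # guard against a cycle in bad crosswalk data
            raise ValueError(f"cycle in MRN history at {mrn}")
        seen.add(mrn)
        mrn = history[mrn]
    return mrn


def match_to_warehouse(study_mrns, warehouse_mrns, history):
    """Split study MRNs into matched (study MRN -> current MRN) and unmatched."""
    matched, unmatched = {}, []
    for mrn in study_mrns:
        current = resolve_mrn(mrn, history)
        if current in warehouse_mrns:
            matched[mrn] = current
        else:
            unmatched.append(mrn)
    return matched, unmatched


# Made-up identifiers: 1001 was reissued twice, 1002 once.
MRN_HISTORY = {"1001": "2001", "2001": "3001", "1002": "2002"}
WAREHOUSE_MRNS = {"3001", "2002", "1003"}
STUDY_MRNS = ["1001", "1002", "1003", "1004"]

matched, unmatched = match_to_warehouse(STUDY_MRNS, WAREHOUSE_MRNS, MRN_HISTORY)
print(matched)    # {'1001': '3001', '1002': '2002', '1003': '1003'}
print(unmatched)  # ['1004']
```

Only the identifiers left in `unmatched` would then need manual review. This assumes the health system actually records MRN changes somewhere; without that history, the crosswalk cannot be built.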
Manually entering data is an important part of research, but the data needs to be entered into the right type of data system so that it can automatically link patients into a broader record. The right data collection tool should also interface directly with an enterprise data warehouse so that when the researcher wants to analyze the data, no time is wasted integrating it with a larger repository.
Materials Waste
Shadow systems like Excel spreadsheets also contribute to materials waste. Consider the example of biobanking. Researchers may have a vast resource of tissues and samples to draw from. These various biobanks are often tracked in disparate systems, with limited clinical data associated with the samples. So, when a researcher wants to find all samples that exist for a clinically defined cohort, it is either difficult or impossible. Researchers don’t know what samples exist, so they may waste time and resources collecting additional samples. Researchers need a repository where all available samples are catalogued and available for querying along with detailed clinical data.
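As a rough illustration of what querying such a repository could look like, the Python sketch below looks up existing samples for a clinically defined cohort in a single combined catalog. The catalog layout, field names, and the `samples_for_cohort` and `patients_without_samples` helpers are hypothetical; a real repository would sit in a database alongside the clinical data.

```python
from collections import defaultdict

# Hypothetical catalog rows: one per stored sample, across all biobanks.
SAMPLE_CATALOG = [
    {"sample_id": "S1", "patient_id": "A", "biobank": "pulmonary", "type": "serum"},
    {"sample_id": "S2", "patient_id": "A", "biobank": "oncology", "type": "tissue"},
    {"sample_id": "S3", "patient_id": "C", "biobank": "pulmonary", "type": "serum"},
    {"sample_id": "S4", "patient_id": "D", "biobank": "oncology", "type": "tissue"},
]


def samples_for_cohort(catalog, cohort_ids, sample_type=None):
    """Group existing sample IDs by patient, limited to patients in the cohort."""
    cohort = set(cohort_ids)
    found = defaultdict(list)
    for row in catalog:
        if row["patient_id"] not in cohort:
            continue
        if sample_type is not None and row["type"] != sample_type:
            continue
        found[row["patient_id"]].append(row["sample_id"])
    return dict(found)


def patients_without_samples(catalog, cohort_ids, sample_type=None):
    """List cohort patients for whom no matching sample is catalogued."""
    have = samples_for_cohort(catalog, cohort_ids, sample_type)
    return sorted(set(cohort_ids) - set(have))


cohort = ["A", "B", "C"]
print(samples_for_cohort(SAMPLE_CATALOG, cohort))
# {'A': ['S1', 'S2'], 'C': ['S3']}
print(patients_without_samples(SAMPLE_CATALOG, cohort, sample_type="serum"))
# ['B']
```

With a lookup like this, new collection would be needed only for the patients returned by `patients_without_samples`, rather than for the whole cohort.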
Challenge Three: Translating Research Discovery into Clinical Practice
Another significant challenge that health systems face is making the transition between research and clinical practice, that is, moving from bench to bedside. It takes on average 17 years for discoveries about best practices to become part of everyday clinical care. If a health system has trouble deploying clinical best practices that have been around for many years across its enterprise, it stands to reason that it will also struggle to deploy more recently discovered guidelines.
Health systems that do not embrace data-driven improvement efforts will struggle to consistently and persistently deploy any clinical practice, whether the clinical practice is a new discovery or one that has been around for 15 years. Routinely integrating the latest research into everyday practice will require cooperation between research teams and care improvement teams.
What Does a Research Program Need to Be Successful?
Many health systems, not just academic medical centers, have a research mission. This type of mission is important because it drives the better care of tomorrow. Organizations that innovate ways to deliver healthcare will differentiate themselves in an increasingly competitive market. Sharing these innovations and discoveries with other providers helps the healthcare industry as a whole become more effective and improves the health of a global population. That is why removing roadblocks for research is so imperative.
Many of the problems outlined above are solvable at a strategic level by creating a synergy between research and care improvement teams, from both an infrastructure perspective and an operational perspective. Organizations that leverage their enterprise data warehouse for research with as much enthusiasm as they have been leveraging their data for clinical and financial analytics will find that, through their research discoveries, they can deliver even better care. Organizations that have mastered clinical quality improvement can use the same deployment techniques to deploy the latest research-driven guidelines. All of this requires an often-unprecedented level of cooperation between health systems and their academic affiliates.
Next in this series, we will expand on how to solve these problems and how to create a culture of cooperation that encourages quality improvement driven by data.
What are the biggest roadblocks you encounter as a researcher today? What problems are you facing when integrating research into your clinical setting?