Late-Binding Data Warehouse
Designing for Analytic Agility and Adaptability
The idea of late binding in data warehousing borrows from the lessons learned in the early years of software engineering.
In the late 1990s, after witnessing the failure of several multimillion-dollar data warehousing projects in the US military, Dale Sanders was sponsored by the Pentagon to study advanced decision support in time-critical, life-critical situations, specifically nuclear warfare decision support. He turned to the healthcare industry for what he expected to be role-model examples of computer-aided analytics driving better decisions in critical situations, but instead found almost none, with the notable exception of a scattered few at Intermountain Healthcare in Salt Lake City. Intermountain clearly possessed the culture and willingness to fully leverage data for improving care, but the industry at large was many years behind.
Sanders saw the same patterns in data engineering as those in software engineering prior to object-oriented programming.
Early and tight binding of data to rules, models, and vocabulary led to unnecessary complexity that delayed time-to-value. It also produced a fragile, inflexible data warehouse infrastructure that could not adapt to rapidly changing analytic use cases or new data content.
Later in that decade, Steve Jobs’s rise with NeXT Computing paved the way for commercial, large-scale adoption of Alan Kay’s object-oriented programming approach. By then, late-binding development had become the norm in software development. During that same decade, Dale Sanders began espousing late binding in his data warehousing work, first in the military and then in healthcare: as chief architect of Intermountain’s EDW in the 1990s, later at Northwestern University Medical Center, and most recently at the Cayman Islands Health Authority. The approach is now fundamental to the data warehouse that Health Catalyst implements for its clients. It has resulted in rapid 90-day implementations and substantial clinical and financial results in a matter of months, rather than the years required by alternative data warehousing approaches that are common outside of healthcare.
Sanders set out to develop the Late-Binding Data Warehouse architecture, which strikes a balance between the extreme of early binding in Inmon, Kimball, and I2B2 and the opposite extreme of the “no binding” environment of Hadoop.
Sanders’ “late-binding” data engineering concept is now fundamental to Health Catalyst’s data warehouse platform. The Late-Binding Data Warehouse enables time-to-value that is measured in days and weeks, not months and years, and has proven many times more scalable and adaptable to new analytic use cases and data content than methodologies that rely on early-binding, tightly coupled enterprise data models.
Knowing what to bind, and when, in the flow of data through a data warehouse requires more than technical skills; it requires a strategic understanding of the short- and long-term evolution of the entire industry. It requires an understanding of the historical volatility in vocabulary and business rules, as well as an ability to predict the rate and specifics of that volatility in the future. Healthcare is undergoing changes to business rules and vocabulary at an unprecedented rate.
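To make the idea of binding late concrete, here is a minimal sketch in Python. It is not Health Catalyst's implementation; the record layout, the rule functions, and the sample codes are all hypothetical and chosen only for illustration. The point it tries to show is that source data is stored exactly as it was recorded, and a volatile business rule (here, "which encounters count as diabetes") is applied only at query time.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Encounter:
    """Hypothetical raw record, kept exactly as the source system recorded it."""
    patient_id: str
    source_code: str  # illustrative diagnosis code, not remapped on load


# Raw data is loaded once, with no vocabulary or business rules applied.
RAW_ENCOUNTERS: List[Encounter] = [
    Encounter("p1", "250.00"),
    Encounter("p2", "401.9"),
    Encounter("p3", "250.02"),
]


# Two hypothetical versions of the same business rule. In an early-binding
# design, the chosen version would be baked into the stored data model.
def diabetes_rule_v1(e: Encounter) -> bool:
    return e.source_code == "250.00"


def diabetes_rule_v2(e: Encounter) -> bool:
    return e.source_code.startswith("250.")


def cohort(encounters: List[Encounter],
           rule: Callable[[Encounter], bool]) -> List[str]:
    """Bind the rule to the raw data at query time and return patient ids."""
    return sorted({e.patient_id for e in encounters if rule(e)})


print(cohort(RAW_ENCOUNTERS, diabetes_rule_v1))  # ['p1']
print(cohort(RAW_ENCOUNTERS, diabetes_rule_v2))  # ['p1', 'p3']
```

When the rule changes from v1 to v2, only the function passed to cohort changes; the stored encounters are untouched. In an early-binding design, the same change would typically mean remodeling and reloading the data.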