Machine Learning 101: 5 Easy Steps for Using it in Healthcare

Machine learning is a hot topic among healthcare digerati, but it’s still very much a black box for many executive clinical decision makers. It’s been described as the technology to replace physicians, a digital wunderkind for reading images, processing patient data, predicting likelihood of disease, and suggesting treatment options. But is machine learning really the technology revolution that will streamline healthcare? How does it work? Who can use it and how is it implemented? What’s inside the black box?

Previously published stories about machine learning in healthcare have strayed toward hype. While its potential is exciting, I want to give a basic overview of machine learning, clear some misconceptions, detail the steps required for implementation, and set realistic expectations.

Machine learning models typically used in healthcare today synthesize data to generate risk scores that supplement physician-intuited care solutions. Machine learning isn’t being used to detect cancer or treat the most prevalent diseases, but it can be used to make valuable predictions, like length of stay (LOS), readmission risk, or infection risk. By properly setting expectations, it’s easier to understand how to implement the technology.

What Can We Do with Machine Learning in Healthcare?

Broad use of machine learning for healthcare is still down the road, but there are dozens of machine learning models in production, development, and planning stages. Regardless, it’s very reasonable to implement machine learning immediately to start chipping away at some big healthcare issues.

The lowest-hanging fruit is to replace slower, outmoded risk prediction rulesets with machine learning models. For example, the LACE Index is used to predict 30-day all-cause readmissions. LOS, a primary variable in this index, cannot be factored in until the day of discharge, which limits real-time use and intervention while the patient is still in the hospital. The datasets used to establish the scoring system are also limited. A machine learning model can automatically predict a readmission using highly correlated data, so it can be considerably more accurate. Where a LACE score draws on just a handful of variables, a machine learning model can draw on hundreds. Using a machine learning model is more challenging up front because the model must be trained with data, though it is still easier than running a clinical trial to select LACE characteristics. The algorithm doesn’t need to be invented by hand, because the machine derives it from the data, and the approach is far more efficient in the long run.
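To make the contrast concrete, here is a minimal sketch of a LACE-style ruleset in Python. The point values follow the LACE index as it is commonly published (length of stay, acuity of admission, Charlson comorbidity score, emergency visits in the prior six months); treat them as an assumption to verify against the original publication before any real use. The function name and argument names are hypothetical.

```python
def lace_score(los_days, acute_admission, charlson_index, ed_visits_6mo):
    """Return a LACE-style readmission score between 0 and 19.

    Point values are an assumption based on the commonly published index.
    los_days is expected in whole days.
    """
    # L: length of stay, which is only known once the patient is discharged.
    if los_days < 1:
        l_points = 0
    elif los_days <= 3:
        l_points = int(los_days)
    elif los_days <= 6:
        l_points = 4
    elif los_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acuity of admission (emergent admissions score 3).
    a_points = 3 if acute_admission else 0

    # C: Charlson comorbidity index, capped at 5 points for a score of 4 or more.
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: emergency department visits in the previous six months, capped at 4.
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


example = lace_score(los_days=5, acute_admission=True, charlson_index=2, ed_visits_6mo=1)
```

The whole ruleset fits in four inputs, and the first of them is unavailable until discharge. A trained model is not bound to either limitation.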

Clinical Decision Support

All hospitals want to lower readmission rates. While there’s no magical algorithm that eliminates readmissions, it is quite possible to implement a machine learning model that takes a patient’s data and calculates a risk of readmission based on historical data from similar patients. When the risk score is very high, a physician can recognize the patient as an outlier and take the appropriate actions (e.g., review the patient’s record for a missed complication or medication issue). The physician can then apply the appropriate treatment and potentially avoid a readmission.

Financial Decision Support

Machine learning has direct applications to financial decision support. One model currently in production is propensity to pay, which takes all patients with outstanding debt and calculates their propensity to pay their bills or their risk of payment default. This allows financial services to avoid the lengthy and expensive process of unsuccessful collection efforts for patients who are determined to be unable to pay and, instead, to flag those patients for charity care. This machine learning model also helps to forecast demand on limited charity care resources. At the same time, the model highlights those who can pay, so financial services can focus collection efforts accordingly.

Machine Learning Misconceptions

The biggest misconception around machine learning is that it’s a cure-all. Netflix uses machine learning to recognize patterns and recommend movies, but this doesn’t guarantee the Netflix customer will like a recommended movie or will only ever see great movies. Ultimately, the customer, not the algorithm, still decides what to watch. Machine learning in the hospital is similar. The physician will get a little more information from a model that applies historical data to each situation. But the ultimate decision of how to treat each patient still belongs to the doctor.

The second misconception is that machine learning is here and now. While it can be started today, in reality, most of the implementation is five to 10 years away from a clinical setting. An image processing algorithm is not going to replace a radiologist for reading mammograms any time soon. An article might reveal that a machine learning algorithm outperformed a panel of board-certified radiologists, but there are many logistical obstacles blocking mainstream use in a hospital setting. It can take years to develop the code and algorithm, conduct research, publish results, and then move into actual implementation to the point where radiologists are reading a smaller percentage of images.

The electric car industry is a good example of this timeline. Years after their introduction, there are still only a few models available. None are below $30,000, and they average little more than a 200-mile range. Ten years from now, there will be many more models, they’ll be very affordable, and they’ll go five times the distance.

Roles and Resources for Machine Learning

The data scientist role is fairly common in finance, insurance, and IT industries, but hospitals are just starting to see this title emerge, especially as machine learning programs develop. Machine learning engineers are also common in the technology sector, but practically non-existent in healthcare. The ideal experience and education for these roles come from conducting data-centered research in medical school.

There is also a community with education and open source technology tools focused on increasing the national adoption of machine learning in healthcare. The site features healthcare-specific machine learning packages, as well as analysis, commentary, and advice on leveraging machine learning within any health system, regardless of size. The site delivers tools to build practical, accurate predictive models at health systems interested in improving operational, financial, and clinical efficiencies.

Data scientists and machine learning engineers have joined broadcasts to talk about risk models they are building. Some are trying to determine why one hospital department is performing lower than another. They are doing risk-adjusted comparisons based on data, and they are building rulesets that couldn’t possibly be designed by humans.

5 Steps to Building a Machine Learning Model

Here are five critical steps to building a machine learning model, some of which might seem so obvious that they are easy to overlook:

Step #1: Define the use case. The use case must have a broad impact and be actionable. For example, a readmission risk score is much more valuable while the patient is still in the hospital than after discharge. Predicting a risk score solves part of the problem, but it needs to get into the hands of somebody who can act on it. Define the who, what, how, and why of the problem you are trying to solve with machine learning. Who is going to deliver the intervention? When is it going to be delivered? Rather than jumping ahead to how machine learning will be used, get to the problem first and then figure out how machine learning will be used to solve that problem.

Step #2: Prepare the data. Machine learning models are built on datasets with many different patients and features (age, gender, diagnosis, LOS, financial class, health habits, etc.). The more information, the better. Features are presented as columns in a data table; rows are the individual patients or patient encounters. With a good use case and a dataset of 10 features and 5,000 rows (of course, more is better!), it’s possible to predict something.

It’s also necessary to label the dataset. Using the readmission risk prediction example, each row of data needs an outcome associated with it, either yes/readmitted or no/not readmitted. This is called labeling the dataset. When it’s time to train the model, the algorithm learns associations between features and labels. Over the course of learning, the algorithm recognizes patterns, for example, that all patients under a certain age were not readmitted, or that all female patients between 55 and 65 diagnosed with condition x who had a recent LOS of less than two days were readmitted. The learning can become very specific and the algorithm uses the learning to generate a risk score for the next new patient.
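As a rough illustration of labeling, the sketch below attaches a yes/no readmission outcome to each encounter. It assumes a 30-day window (borrowed from the LACE discussion earlier) and hypothetical field names (patient_id, admit_date, discharge_date); a real dataset will have its own schema and outcome definition.

```python
from datetime import date


def label_readmissions(encounters, window_days=30):
    """Attach a 1/0 'readmitted' label to each encounter.

    An encounter is labeled 1 when the same patient's next admission
    starts within window_days of this encounter's discharge.
    """
    by_patient = {}
    for enc in encounters:
        by_patient.setdefault(enc["patient_id"], []).append(enc)

    labeled = []
    for stays in by_patient.values():
        stays.sort(key=lambda e: e["admit_date"])  # chronological order per patient
        for i, stay in enumerate(stays):
            label = 0
            if i + 1 < len(stays):
                gap = (stays[i + 1]["admit_date"] - stay["discharge_date"]).days
                if 0 <= gap <= window_days:
                    label = 1
            labeled.append({**stay, "readmitted": label})
    return labeled


encounters = [
    {"patient_id": "A", "admit_date": date(2023, 1, 2), "discharge_date": date(2023, 1, 5)},
    {"patient_id": "A", "admit_date": date(2023, 1, 20), "discharge_date": date(2023, 1, 22)},
    {"patient_id": "B", "admit_date": date(2023, 2, 1), "discharge_date": date(2023, 2, 3)},
]
labeled = label_readmissions(encounters)
```

Patient A’s first stay is followed by a new admission 15 days after discharge, so it is labeled as readmitted; the other two encounters are not.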

Data needs to be arranged and prepared for machine learning. The dataset cannot have missing values. If a row is missing a blood pressure reading, then that feature must be filled in using, for example, the average blood pressure for that patient.
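One simple way to fill gaps is mean imputation. The sketch below follows the blood pressure example: a missing reading is replaced by that patient’s own average, with the overall average as a fallback when the patient has no readings at all. The field names and the fallback rule are assumptions for illustration, not a recommendation for every feature.

```python
from statistics import mean


def impute_systolic_bp(rows):
    """Return copies of rows with missing systolic_bp values (None) filled in."""
    observed = [r["systolic_bp"] for r in rows if r["systolic_bp"] is not None]
    overall_mean = mean(observed)  # raises StatisticsError if nothing was observed

    # Collect the readings that do exist for each patient.
    per_patient = {}
    for r in rows:
        if r["systolic_bp"] is not None:
            per_patient.setdefault(r["patient_id"], []).append(r["systolic_bp"])

    filled = []
    for r in rows:
        value = r["systolic_bp"]
        if value is None:
            readings = per_patient.get(r["patient_id"])
            value = mean(readings) if readings else overall_mean
        filled.append({**r, "systolic_bp": value})
    return filled


rows = [
    {"patient_id": "A", "systolic_bp": 120},
    {"patient_id": "A", "systolic_bp": 130},
    {"patient_id": "A", "systolic_bp": None},
    {"patient_id": "B", "systolic_bp": None},
    {"patient_id": "C", "systolic_bp": 150},
]
filled = impute_systolic_bp(rows)
```

Patient A’s missing reading becomes 125, the average of that patient’s two readings, while patient B falls back to the average across all observed readings.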

The feature engineering process takes existing variables and transforms them to make them more useful to the model. For example, a calendar date may need a day of the week associated with it to understand what day of the week has the highest rate of Catheter Associated Urinary Tract Infections (CAUTI) in the hospital.
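The day-of-week example can be sketched in a few lines. The code below derives a day_of_week feature from a date and then computes the share of rows with a CAUTI for each day. The field names (event_date, cauti) and the sample rows are made up for illustration.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def add_day_of_week(rows, date_field="event_date"):
    """Return copies of rows with a derived 'day_of_week' feature."""
    return [{**r, "day_of_week": DAY_NAMES[r[date_field].weekday()]} for r in rows]


def cauti_rate_by_day(rows):
    """Return the fraction of rows flagged cauti == 1 for each day of the week."""
    totals, cases = Counter(), Counter()
    for r in rows:
        totals[r["day_of_week"]] += 1
        cases[r["day_of_week"]] += r["cauti"]
    return {day: cases[day] / totals[day] for day in totals}


rows = add_day_of_week([
    {"event_date": date(2024, 1, 1), "cauti": 0},   # a Monday
    {"event_date": date(2024, 1, 8), "cauti": 0},   # a Monday
    {"event_date": date(2024, 1, 6), "cauti": 1},   # a Saturday
    {"event_date": date(2024, 1, 13), "cauti": 0},  # a Saturday
])
rates = cauti_rate_by_day(rows)
```

The raw date is nearly useless to a model, since every value is unique. The derived day name is a feature the model can actually learn from.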

Step #3: Train the model. To do this, set aside a quarter of the labeled data (the testing or validation set) for later use. Use the other three-quarters (the training set) to train the model. Other splits, such as 80 percent/20 percent, are also fine.
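A minimal sketch of the 75/25 split follows. The helper name split_rows and the fixed seed are assumptions; the rows are shuffled first so that any ordering in the source table (by date, for example) does not leak into the split.

```python
import random


def split_rows(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and return (training_set, validation_set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(5000))  # stand-ins for 5,000 labeled patient rows
training_set, validation_set = split_rows(rows)
```

Passing test_fraction=0.2 gives the 80/20 variant.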

All algorithms generally work by checking data and looking for patterns. The machine learning algorithm (e.g., logistic regression or random forest) learns patterns in the labeled training set to predict outcomes. When a solution is reached, the algorithm is trained. Now, the question is how well it generalizes to new data. A model can fit the data it was trained on perfectly and still be essentially useless on new data because it is too specific (it doesn’t generalize well). The validation dataset is used to evaluate how well the model generalizes to new data. This defines model accuracy.
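To show what training and validation look like in practice, here is a small, dependency-free sketch of logistic regression fitted by batch gradient descent. Everything about the data is an assumption: the two features are synthetic values between 0 and 1, and the label is set to 1 whenever their sum exceeds 1. A production model would normally use an established library rather than hand-written training code.

```python
import math
import random


def predict_risk(weights, bias, features):
    """Return a risk score between 0 and 1 for one row of features."""
    z = bias + sum(w * v for w, v in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.5, epochs=1000):
    """Fit weights and bias with plain batch gradient descent."""
    n_features = len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            error = predict_risk(weights, bias, features) - label
            for j, value in enumerate(features):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias


def accuracy(weights, bias, X, y):
    """Fraction of rows where the 0.5-thresholded score matches the label."""
    hits = sum((predict_risk(weights, bias, f) >= 0.5) == bool(label) for f, label in zip(X, y))
    return hits / len(X)


# Synthetic, already-scaled data: label is 1 when the two features sum above 1.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(400)]
y = [1 if a + b > 1.0 else 0 for a, b in X]
X_train, y_train = X[:300], y[:300]  # three-quarters for training
X_val, y_val = X[300:], y[300:]      # one-quarter held out for validation

weights, bias = train_logistic_regression(X_train, y_train)
train_accuracy = accuracy(weights, bias, X_train, y_train)
validation_accuracy = accuracy(weights, bias, X_val, y_val)
```

Comparing train_accuracy with validation_accuracy is the generalization check described above: a model that scores far better on the training rows than on the held-out rows has become too specific.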

During this process, it’s common to try 10 different algorithms, tweaking each one with slightly different parameters to see the variation in performance. Typically, most of the models perform similarly. It comes down to how good the dataset is at predicting the outcome and what the underlying structure of the data is like. An algorithm might do well on some datasets, but poorly on others. Therefore, it’s important to try different algorithms. Typically, a poor dataset will generate poor results from any algorithm. The quality of the data is what really matters for the quality of the machine learning model.
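The comparison loop itself is simple. The sketch below scores several candidates on the same validation set and keeps the best one. For brevity it uses only two families, a majority-class baseline and k-nearest neighbors with three settings of k. Neither is named in this article; they are swapped in because they fit in a few lines. The data is synthetic.

```python
import random
from collections import Counter


def majority_baseline(X_train, y_train):
    """Always predict the most common training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return lambda features: most_common


def knn_classifier(X_train, y_train, k):
    """Predict the majority label among the k closest training rows."""
    def predict(features):
        nearest = sorted(
            zip(X_train, y_train),
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], features)),
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def validation_accuracy(predict, X_val, y_val):
    return sum(predict(f) == label for f, label in zip(X_val, y_val)) / len(X_val)


# Synthetic data: label is 1 when the two features sum above 1.
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(300)]
y = [1 if a + b > 1.0 else 0 for a, b in X]
X_train, y_train, X_val, y_val = X[:225], y[:225], X[225:], y[225:]

candidates = {"majority baseline": majority_baseline(X_train, y_train)}
for k in (1, 5, 15):  # same algorithm, slightly different parameter
    candidates[f"kNN (k={k})"] = knn_classifier(X_train, y_train, k)

scores = {name: validation_accuracy(model, X_val, y_val) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
```

Including a trivial baseline is a useful habit: if no candidate clearly beats it, the problem is usually the dataset rather than the algorithm.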

Step #4: Make predictions on new data. This step is more conceptual than technical. Save the approved model so that when new data comes in (a patient is admitted), it can predict a score that’s in line with the training data.

We build our models so that, once trained, they can be put into production. When tables containing all the patient information are refreshed every night, the model runs and gives a prediction for those patients. We’re only limited by how often data refreshes, though we have the potential to generate risk scores in real time, as patients are admitted.
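Conceptually, putting a model into production means saving what was learned and reloading it whenever new rows arrive. The sketch below stores logistic regression coefficients as JSON and scores a small nightly batch. The weights, feature names, and file name are hypothetical placeholders, not values from any real model, and this is not a description of how our production pipeline is built.

```python
import json
import math
import os
import tempfile


def save_model(path, weights, bias, feature_names):
    """Persist the trained coefficients so a later job can reuse them."""
    with open(path, "w") as f:
        json.dump({"weights": weights, "bias": bias, "features": feature_names}, f)


def load_model(path):
    with open(path) as f:
        return json.load(f)


def score_patient(model, patient):
    """Return a risk score between 0 and 1 for one patient row."""
    z = model["bias"] + sum(
        w * patient[name] for w, name in zip(model["weights"], model["features"])
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients standing in for an approved, trained model.
path = os.path.join(tempfile.mkdtemp(), "readmission_model.json")
save_model(path, weights=[1.2, 0.8], bias=-1.0,
           feature_names=["prior_admissions", "chronic_conditions"])

# Later, for example after the nightly table refresh:
model = load_model(path)
nightly_batch = [
    {"patient_id": "A", "prior_admissions": 0, "chronic_conditions": 0},
    {"patient_id": "B", "prior_admissions": 2, "chronic_conditions": 1},
]
risk_scores = {p["patient_id"]: score_patient(model, p) for p in nightly_batch}
```

The scoring step is cheap compared with training, which is why the refresh schedule of the data, not the model, tends to be the limiting factor.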

Step #5: Deliver risk scores for use in clinical decision support. Give risk scores to clinicians in visualizations that are easy to interpret and quick to deliver insight and value.
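A raw probability is harder to act on than a ranked worklist. The sketch below, which uses hypothetical cutoffs of 0.3 and 0.6, turns scores into low, medium, and high tiers and sorts patients so the highest risk appears first. Real cutoffs should be chosen with clinicians and checked against the capacity to intervene.

```python
def risk_tier(score, medium_cutoff=0.3, high_cutoff=0.6):
    """Map a 0-1 risk score to a tier label (cutoffs are illustrative)."""
    if score >= high_cutoff:
        return "high"
    if score >= medium_cutoff:
        return "medium"
    return "low"


def worklist(scores):
    """Return (patient_id, rounded score, tier) tuples, highest risk first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(pid, round(score, 2), risk_tier(score)) for pid, score in ranked]


rows = worklist({"A": 0.12, "B": 0.71, "C": 0.45})
```

The resulting list can feed whatever visualization clinicians already use, such as a color-coded column in a census view.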

These five steps are an iterative process. On reaching the third step, it may become apparent that the model isn’t accurate. This is an opportunity to change the use case so it still addresses valuable questions, while making it easier for the model to answer them. If the desired results are still elusive, it’s possible to add more data (find more patients) or more features.

Who Can Implement Machine Learning?

The package was written by Health Catalyst and is distributed freely to help democratize machine learning. The data infrastructure needs to be in place, along with somebody who knows the R or Python programming language and is willing to learn the methods of machine learning models. However, the package makes the learning curve easier for people who don’t know these languages.

Though this might go without saying, the data must be electronic and in sufficient numbers to develop a prediction. We have a model that predicts patient no-shows. With 2,000 rows in the dataset of people cancelling or changing appointments, we can help even a smaller primary care practice with scheduling. Machine learning can be scaled to any size organization as long as it has an appropriately sized dataset.

Machine Learning in Healthcare Starts Now

There are many real-world applications for machine learning right now that aren’t well publicized because, at least in the data science world, they are somewhat mundane. It’s not going to solve every healthcare problem, but it can improve the outcome for a single patient and this is happening today. We will start to see machine learning appear in more clinical trials once the infrastructure has been developed. In the meantime, it’s up to innovative healthcare systems to convey the need for implementing this technology, then prioritize the resources to establish a robust machine learning program that will lead to outcomes improvement.
