How Machine Learning Can Transform Healthcare

Growing up, Taisa Kushner loved math. There was a functional elegance to the way mathematical techniques could be used to unravel even the most complex problems, and how seemingly disparate systems were linked through simple equations. But she enjoyed the biological sciences too, and wanted to find a career where she could use her skills to help people in a tangible way. 

“I was pretty set on medical school, but then I took Dr. Rebecca Sanft’s course in mathematical biology at St. Olaf College – that was career-changing,” Kushner said. “I had no idea you could use math to help people. It was such a cool niche!”

Kushner went on to earn her bachelor’s degree in math, mathematical biology, and art, followed by a PhD in computer science and interdisciplinary quantitative biology, with a master’s in applied mathematics along the way. Now a Principal Researcher at Galois, her work focuses on harnessing the combined power of machine learning (ML), formal methods, and cyber-physical systems to address some of the most difficult (and fascinating) challenges in healthcare and beyond.

“There are so many hard, open problems, and there aren’t many people working at this intersection of biology, machine learning, and formal methods,” said Kushner. “If you change something in this space, especially in how we think about interoperability and safety of medical devices, it can have a huge impact.”

The Potential of Personalized Healthcare 

Machine learning (ML) excels at parsing through and uncovering patterns in datasets so large or complex that human brains can’t make sense of them on their own. At Galois, our Artificial Intelligence, Data Science, and Machine Learning team (affectionately known as the “DISMAL” Team) has built ML tools to model non-linear systems, expose financial corruption, analyze satellite images to boost port security, and even uncover illicit fishing networks to protect marine ecosystems. Elsewhere, in the world of digital marketing, ML models are infamously used to evaluate browsing patterns, purchasing history, and social media engagement to curate hyper-personalized ads with uncanny relevance to an individual’s interests and desires. Yet that power to analyze disparate data to pinpoint individualized solutions has far more interesting applications than simply garnering clicks and making sales. 

“Data science and machine learning is very good at feeding people personalized ads,” Kushner said. “Now people want to use machine learning to make personalized healthcare devices, but it’s not a simple translation.”

Machine learning has massive potential in this space because human bodies are all unique ecosystems – no two are exactly the same. Genetics, environment, lifestyle, diet, and a myriad of other factors contribute to the unique physiology, microbiota, and biological processes of each person. This variation can affect how individuals respond to treatments, medications, and interventions. While generalized, “one-size-fits-all” solutions work well in most scenarios, more complex challenges often require—or could benefit from—more personalized treatments, tailored to the distinct characteristics and needs of the individual to maximize efficacy. 

Yet accurately diagnosing and understanding the intricacies of an individual human biome is no small feat. It requires high-precision diagnostics, comprehensive data collection, and robust analytical tools to interpret complex biological information. Machine learning, if done correctly, can offer a potential key to these efforts – empowering doctors and researchers to make sense of the vast oceans of data and constantly changing variables at work in a human body, and create application-specific solutions for health conditions that cannot be adequately managed with a one-size-fits-all approach. 

Significant Challenges Require Novel Machine Learning Approaches 

The key challenge at hand is that, to effectively parse out patterns, ML techniques require not just a lot of data, but a lot of the right data.

“In healthcare, explainability and trustworthiness are paramount, but the data distributions are insanely biased,” said Kushner. “This is a space where classic machine learning techniques fail.” 

We have plenty of healthcare data from wearable devices, but most reflect very benign situations – healthy people just living their lives. Conversely, researchers tend to have very few, often even zero, data points from high-risk scenarios that lead to severe harm (think: impending heart attacks, diabetic hypoglycemia, etc).

This isn’t a problem that can be solved by simply gathering more data. From a machine learning perspective, the most useful data would be that showing what happens in a medical emergency without intervention. But since medical professionals will do everything they can to save people from death, we not only don’t have that data, and in most cases, we will never be able to get it (at least not ethically). 

“We’ve shown time and again that this results in ML models that can show great performance on test data, but when you dig deeper you find that performance is skewed,” Kushner explained. “There’s excellent predictability in benign scenarios, but accuracy for high-risk scenarios drops dramatically.”  

That doesn’t mean that ML is rendered useless for healthcare, it just means the research community needs to develop novel techniques that work specifically where classical methods are failing. If we can do this—and Galois and others are beginning to show that we can—we can leverage the power of ML to go from generalized models of human physiology and treatments, towards personalized approaches. 

ML in Action: Better Automated Insulin Dosing

Several years ago, as part of her doctoral research, Kushner helped design a range of algorithms for personalized medical devices that would automatically administer insulin doses to individuals with Type 1 diabetes (T1d). Broadly, these devices are called “artificial pancreas” (AP) systems and they consist of a continuous glucose monitor, an insulin pump, and an algorithm connecting the two. The algorithms that Kushner and her colleagues developed were not the first such AP system,  but they were the first to improve how these devices are personalized for individuals, and the first to show how novel, explainable and trustworthy ML can be utilized to make personalized algorithms built on data the device is already collecting.  

While most people without diabetes can largely eat and exercise whenever they want, individuals living with T1d (and insulin-dependent type-2 diabetes) must carefully balance insulin dosing with their diet and activity levels to control blood sugar levels and ensure their cells are getting the energy they need to live. For these individuals, estimating future blood sugar levels and preemptively adjusting insulin doses presents a constant logistics challenge. It takes over a person’s life. 

“Today, even the most rapid-acting insulins take about 90 minutes to reach full effect, and can keep affecting your blood glucose levels for up to six hours, so you really have to plan,” said Kushner. “We don’t yet have a doseable version of the ‘inverse’ of insulin, called glucagon (though research on this is ongoing), so for the AP algorithm we have what’s called a one-sided control problem. We can dose insulin, but we have no brakes. There’s no way to course correct a dose that was too high, a meal that didn’t happen, or exercise that we didn’t anticipate. As a result, a lot of algorithms veer towards being less aggressive. But we’ve seen this backfire: if people are dissatisfied with how well the system works, even if it’s to keep them safe, they will learn to ‘hack’ the algorithm to give them more insulin. For example, they might tell the system that they ate when they did not. Those sorts of ‘human factors’ render the safety measures that developers built in useless.”

T1d management is particularly hard for kids, who are notoriously terrible at pre-planning meals and activities with hours of foresight. In addition, their frequent hormonal changes, highly variable activity patterns, and quickly developing bodies make it difficult to predict proper insulin dosing. Seemingly everything, including the moon, can change how sensitive a person is to insulin at a given point in time. A perfect insulin dose for an apple today might be way too high or way too low for that same apple tomorrow. Plus, because children typically require smaller amounts of insulin than adults, and more precise calculations are required to avoid overdosing or underdosing. All of this grows particularly burdensome at night, when symptoms are harder to catch: parents of children with T1D are often forced to check on their kids up throughout the night, waking and feeding them snacks to avoid low blood sugar.

“That’s a huge stressor,” said Kushner. “It’s hard enough for individuals to plan all this for themselves or their child, but infinitely more difficult to construct a single algorithm that predicts this across a wide range of potential users.” 

In response to this problem, Kushner and her colleagues developed novel personalized algorithms that power AP systems by integrating current and past glucose and insulin levels to calculate how much insulin people will need over the next six hours — insulin dosed automatically, as needed, in the right quantity. 

By analyzing trends over time, the device could predict future glucose rises, respond to changing bodies, and adjust its preemptive dosing algorithm accordingly. Unlike prior systems which utilized general models of human physiology (which don’t match any specific individual that well) their approaches were personalized. And unlike traditional ML which can misassociate patterns with incorrect causal relationships, they bridged ML and formal methods to enable explainability and trust. The end result: reduced mental load and stress, either on the actual person with diabetes or the parent of a diabetic child.

“We did clinical studies with kids, and you get these eight-year-olds who tell you: That’s the first time I’ve slept through the night in my whole life!”

Bringing in Formal Methods

Bringing formal methods into ML was key for the success of these projects. As noted previously, ML is great at picking up patterns in big data sets, but in healthcare is plagued with biased data. Furthermore, ML algorithms are prone to mixing up correlation and causation—drawing the wrong conclusions from the data, and making high consequence mistakes as a result. 

“Our research showed that lots of these proposed models (across literature) are picking up the wrong correlations,” Kushner explained. “They would find the right patterns, but would make mistakes like attributing a blood glucose rise to a big dose of insulin rather than a meal.”

Because all the model sees is data, such as: “A person takes insulin, then blood sugar rises,” it can incorrectly conclude that insulin raises blood sugar. Working from this logic, an automatic doser could feasibly respond to a drop in blood sugar by administering a huge dose of insulin – a mistake which could kill a person with Type 1 diabetes. 

Most medical device developers rely on standard testing methods to try and eliminate these sorts of issues. This imperfect approach can have consequences, not only to patient safety, but to device manufacturers’ bottom lines. 

In one account, a promising medical device had to halt development in the middle of Phase 3 clinical trials because an error emerged that they hadn’t thought to test for. Countless other examples chronicle costly recalls of faulty medical devices, already on the market, that could have been prevented by adequate testing with formal methods. That kind of miss can cost millions of dollars and years of invaluable production time. 

Formal verification—one of Galois’ hallmark areas of expertise—offers a better way forward.

‘”A lot of my work involves using formal methods to solve those sorts of correlation-causation problems,” said Kushner. “How do you figure out what the model is doing? What are the relevant properties? How do I test it in infinite scenarios rather than just a bunch of points?” 

When doing formal verification for code, computer scientists define and test for specifications—properties that have to hold in the implementation. When designing cyber-physical systems (computers that interact with the real world), such as medical devices, scientists can similarly use formal methods to define and test for specifications that have to hold. For example, they may codify critical details like chains of causation, for example: what variable shifts should or should not trigger an automated insulin dose. Formal methods allow us to identify potentially error-prone chains across all aspects of a device – algorithms (in theory and software implementation), hardware, and human users – without having to define these chains apriori. This process enables developers to mathematically prove the correctness of their code and system design, uncovering errors early in the design phase, rather than after prototypes have already been built, and thereby avoiding costly mistakes. Many of the same verification tools and techniques used for verifying medical devices have been used successfully in different, though equally high stakes, sectors—including designing navigation systems for self-driving cars and safety features for military aircraft.

“It’s a more thorough way of evaluating,” said Kushner. 

The Way Forward

When it comes to applying machine learning to healthcare, we’ve just begun to scratch the surface. The ability of machine learning techniques to make sense of complex data has many more potential applications, from helping develop personalized interventions for multiple sclerosis patients to using ML-systems as collaborative teammates in high-risk medical scenarios, such as providing real-time image analysis for robotic surgery. 

“Integrating predictive models with imaging could give a surgeon information about which cells are cancerous and which cells are not, beyond what a single point-in-time image can relay” Kushner said, describing an application she has already begun exploring. “More importantly, this kind of system could communicate confidence levels, like: ‘I’m 20% confident that this area is cancerous, but in this other area I’m 99% confident.’ If I’m a surgeon getting that kind of real-time feedback, I know to rely on myself rather than the machine for different tasks. That’s human-machine teaming.”  

Galois’s expertise in data science and machine learning, combined with our expertise in formal methods, sets us up to bridge the gap in ML applications for healthcare challenges. We can create ML/AI-powered tools that can empower medical professionals to make better decisions based on real-time analysis of vast data sets, but we can do so with mathematical assurance that the conclusions ML models are reaching are accurate–connecting dots that indicate true causation rather than deceptive correlation. 

Together, machine learning and formal methods have the potential to solve some of our most pressing medical challenges – and Galois sits right at that intersection.