Diagnosing Lung Disease with Help from Computers

Joe Hsu, MD (left) and Husham Sharifi, MD, discuss diagnostic techniques using machine learning.


Parts of medicine can be trial and error: if one drug doesn’t work, try another; if a diagnosis isn’t leading to a cure, maybe the diagnosis is wrong. But eliminating that trial and error, through more informed diagnostic tests, saves time for both clinicians and patients. In the division of pulmonary, allergy, and critical care medicine, machine learning algorithms are now helping to guide more personalized treatment decisions.

“We’re at a critical juncture in pulmonary medicine, where innovative analysis approaches are needed to handle the large number of patient samples and clinical variables we are collecting for research,” says Andrew Sweatt, MD, a clinical assistant professor of pulmonary, allergy, and critical care medicine. “Machine learning is a promising tool that can help us with most of this high-throughput data.”

In machine learning, a computer program sifts through data—whether it’s information on the levels of different molecules in a blood sample or scans of the lungs—and finds otherwise hidden patterns. Often, such programs can do a better job than the human eye at spotting structure in the data, finding correlations between data and patient outcomes, or pinpointing groups of variables that set some patients apart.

“We’re not trying to replace doctors, but with machine learning, there’s a huge potential for augmenting clinical decisions by physicians,” says Husham Sharifi, MD, instructor of pulmonary, allergy, and critical care medicine.

Guiding the Treatment of a Rare Disease

Many patients with pulmonary arterial hypertension (PAH) have other underlying diseases, such as scleroderma, lupus, cirrhosis, congenital heart disease, or HIV. Others have been exposed to drugs or toxins, such as methamphetamine. And in roughly a third to half of patients, the rare lung disease appears without any explanation. In all cases, though, the disease process itself is the same: The small arteries that carry blood through the lungs narrow over time due to structural changes. This progression leads to high blood pressure in the lungs and places strain on the heart.

“It’s a very aggressive disease, and there’s a lot of room to improve patient outcomes,” says Sweatt.

Without treatment, nearly half of all patients die within five years of their diagnosis. Over the past decade, several drugs have been approved to treat PAH. They all share the same mechanism, which is to relax and open blood vessels, yet they don’t work consistently in all patients.

A large body of research has suggested that there’s a component of PAH that’s mediated by the immune system, and new drugs are in development to target this inflammation. Sweatt wanted to know whether some patients would be better helped by these new drugs. Until now, PAH has been grouped into subtypes based on the patient’s underlying predisposition, and all subtypes have been treated the same.

Sweatt and his colleagues collected blood samples from 385 PAH patients and measured levels of 48 immune proteins and signaling molecules. Then they let a machine-learning program parse the data set.

“My goal was to remain agnostic by avoiding common pre-conceived notions about the disease, and instead let the molecular data alone tell the story,” says Sweatt.

It worked—the program revealed four previously unknown subtypes of PAH based on the immune profiles of the patients. One-third of the patients studied had minimal inflammation, suggesting that drugs targeting the immune system may not be helpful for them. The three other groups were each distinguished by their unique inflammatory signatures in the blood.
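As a rough illustration of what it means for a program to find groups in unlabeled measurements, here is a minimal sketch in Python. It is not the method Sweatt’s team used; the article does not name their algorithm. The sketch swaps in k-means clustering, a common textbook technique, and runs it on a handful of made-up numbers that stand in for protein levels. The function names, the starting-point rule, and every value below are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length lists of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Group points into k clusters without being told any labels.

    Assumption: initial centers are taken at evenly spaced positions in the
    list, a simplification that keeps this sketch deterministic.
    """
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so the grouping is stable.
        labels = new_labels
        # Move each center to the average of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Made-up "patients", each with three invented measurements (not real data).
patients = [
    [1.0, 1.2, 0.9], [1.1, 0.8, 1.0], [0.9, 1.0, 1.1], [1.2, 1.1, 0.8],
    [5.0, 4.8, 5.2], [5.1, 5.3, 4.9], [4.9, 5.0, 5.1], [5.2, 4.7, 5.0],
]

labels, centers = kmeans(patients, k=2)
print(labels)
```

The program is never told which patient belongs to which group; it recovers the two groups from the numbers alone. The real study worked with 385 patients and 48 measurements each, so this toy is far simpler than the actual analysis.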

Importantly, the clinical disease severity and risk of death also differed among the four subgroups.

“What really stood out is that these immune phenotypes were completely independent of the cause of PAH,” says Sweatt. In other words, patients who had underlying immune diseases like lupus or scleroderma were just as likely to be in each subcategory of PAH as patients with no underlying disease. “It means we really detected a hidden system for classifying patients that is highly relevant to underlying disease biology and clinical outcomes,” he says.

The data suggest that different types of immune drugs may work against PAH for different patients, but more work is needed to determine whether the new immune subtypes can help guide treatment. Sweatt’s research has been recognized as an innovative first step toward precision medicine in PAH. Building on this work, Sweatt has planned additional machine learning–based studies to better understand the biological underpinnings and treatment implications of each immune subtype.

Narrowing Down a Diagnosis

Another challenge involves graft-versus-host disease of the lungs, also known as bronchiolitis obliterans syndrome (BOS). Here, the difficulty is not differentiating subtypes of patients, but diagnosing them in the first place. Graft-versus-host disease is a complication of a bone marrow or blood stem cell transplant in which the donated bone marrow or stem cells start attacking the body. But BOS can closely resemble other common complications of a transplant, including infections and inflammatory disorders.

“All these types of lung disease are poorly defined,” says Joe Hsu, MD, an assistant professor of pulmonary, allergy, and critical care medicine. “The way we typically diagnose graft-versus-host disease is to look for everything else and, if we don’t find anything else, diagnose that.”

Hsu and Sharifi wanted to do better at diagnosing BOS. They started collecting CT scans from patients with BOS as well as from transplant patients who had similar symptoms but did not have BOS. Then they used a machine learning approach—telling a computer program which cases were which and letting it learn how to differentiate them.

The machine, it turned out, became so good at telling BOS apart from other lung diseases that it was even slightly better than thoracic radiologists, who regularly read CT scans of the chest. The program learned to differentiate normal lung, mild BOS, severe BOS, and alternative diagnoses.

“It was seeing things that the eye couldn’t necessarily pick up on and improving the diagnosis quite a bit,” says Hsu.
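The general idea of telling a program which cases are which, and letting it learn to tell them apart, can be sketched in a few lines of Python. This is only an illustration, not the system Hsu and Sharifi built: the article does not describe their model, and real CT analysis works on full images. The sketch assumes each scan has already been reduced to two invented numbers, and it uses a simple nearest-centroid rule. All names and values are hypothetical.

```python
def train_centroids(examples):
    """Learn one average feature vector per label from labeled examples.

    examples: list of (features, label) pairs, where features is a list
    of numbers and label is a string.
    """
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose average vector is closest to the new case."""
    return min(
        centroids,
        key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(features, centroids[label])
        ),
    )


# Invented training cases: two made-up scores per scan, plus the known answer.
training_cases = [
    ([0.9, 0.8], "BOS"),
    ([0.8, 0.9], "BOS"),
    ([0.2, 0.1], "other"),
    ([0.1, 0.3], "other"),
]

model = train_centroids(training_cases)
print(predict(model, [0.85, 0.7]))
print(predict(model, [0.2, 0.2]))
```

The first half is the "telling" step, where the program sees cases with known answers. The second half is the "differentiating" step, where it labels a case it has not seen before.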

Since each diagnosis is treated differently, fast and easy diagnosis is critical. Hsu and Sharifi say that in the future, similar programs might be able to identify other conditions as well, such as chronic obstructive pulmonary disease (COPD). Pulmonology, Sharifi points out, is full of numerical and imaging data that can be leveraged with machine learning.

“For a lot of other aspects of medicine, it’s a bigger challenge to integrate artificial intelligence because clinical notes can be so messy and unstructured,” he says. “But this is a good example of where algorithmic and computational analysis can be used hand in hand with a doctor’s advanced training and experience.”