Test adds twists to lung disease diagnosis

Dr. Larsen was already familiar with the test, having been involved in its original validation.

He and his Mayo Clinic colleagues do not use Envisia, though he understands the appeal of the test from a clinical as well as a patient perspective. “It’s a very attractive test,” given that it avoids the risks of a lung biopsy in patients with compromised respiratory function. “When we decided whether or not this was any value to the clinical team and whether we should be doing this, originally our clinicians saw the potential advantages, but were not as familiar with the limitations and problems with the test.”

Education filled in the gaps. “I think our clinicians benefited from pathologists who had intimate knowledge about the classifier—what it is, what it answers, what it doesn’t answer, and what its limitations are,” he says.

Dr. Larsen is concerned that merely identifying UIP does not provide an answer about the underlying etiology. And though the classifier isn’t validated to discriminate among the various types of causes that can lead to UIP scarring in the lung, he says, “It’s being used that way” in some practices.

Dr. Larsen also notes that directing tissue to the Envisia test means other valuable information might disappear. “You lose a lot of nuance that’s present in the biopsy.” A pathologist might see a UIP pattern of fibrosis, but also, say, lymphoid hyperplasia suggestive of an autoimmune disorder, or granulomas that suggest hypersensitivity pneumonitis.

The transbronchial biopsy required by the classifier isn’t a shoo-in, either. It’s less risky than a wedge biopsy but doesn’t eliminate risk. “What it does eliminate is a pathology assessment from the process.”

Moreover, he has questions about what the classifier’s impact is in the real world (Chaudhary S, et al. Eur Respir J. 2023;61[4]:2201245).

In his opinion—he makes it clear this is indeed his own personal opinion—the classifier is “an oversimplified and relatively crude approximation of reality, basically designed to detect advanced scarring. Which is nothing beyond what a good, high-resolution CT scan is able to detect.”

Not that anything in the field is simple, Dr. Larsen is quick to acknowledge. “Lung scarring is a difficult nut to crack,” he says.

“It can be frustrating for clinicians when they send a biopsy, and the pathologist comes back with a wishy-washy answer. I imagine that’s frustrating for patients as well.”

Dr. Myers, of Michigan Medicine, calls the field “a long-evolving discussion. People really struggle with diagnosing these conditions.” Over the years multidisciplinary discussion, with histology as just one of multiple ingredients, has essentially become the gold standard for diagnosis in diffuse parenchymal lung diseases, he says, codified in published clinical and diagnostic guidelines.

This comes with pluses and minuses, Dr. Myers observes, “but it has caused confusion about the role of histology” and can even lead to histology being undervalued at times in those discussions. (“Of course, as a pathologist, I’m biased,” he adds with a laugh.)

Dr. Groshong agrees this has long been treacherous terrain, with neither the radiology nor the biopsy being entirely diagnostic or specific. That has led to years of weekly interstitial lung disease conferences in which troublesome cases were diagnosed by consensus as much as anything. “A lot of these cases end up being kind of a preponderance of data,” he says.

From that perspective, he continues, “I think the Envisia classifier can tip the balance one way or the other, but isn’t the sole decider. We talk about it in the ILD conference along with all the other data,” including results from radiology and pathology. “But the reality is, the transbronchs almost never show UIP.” UIP tends to be a very peripheral predominant fibrotic process, near the pleura, he explains, while the transbronchial biopsy is done in the more central part of the lung. “So you’re almost biopsying the wrong region.”

One common scenario: a nonconfirming biopsy (“It doesn’t show the fibrosis, but everyone knows it’s there,” Dr. Groshong says), coupled with an imaging result that is nonclassic for UIP. For clinicians who are otherwise unable to find anything in the patient’s clinical history to sway them in a different direction, an Envisia classifier that reports UIP will be taken as weak evidence that the case is more likely UIP and IPF, says Dr. Groshong. “And if you don’t have a wedge biopsy, it’s better than nothing.”

Dr. Larsen, for his part, understands the appeal of the Envisia in practices that see very few interstitial lung disease biopsies. “I wouldn’t want to deal with it if I were out in community practice and saw something once a year. I wouldn’t feel confident whatsoever.” The appeal of simply putting a sample in a FedEx box and letting an expert figure it out is understandable, he says. “Problem solved, right?”

And based on what he’s seeing at Mayo—or, rather, not seeing, given the aforementioned drop in biopsies—that’s exactly what’s happening.

That’s fine if clinicians are using it in the clinical context for which it is designed. “But it may or may not be providing the information they think it’s providing them.” That, Dr. Larsen says, is an argument for pathologists to understand their enduring role in educating others about testing. “A pathologist doesn’t need to understand fibrotic lung disease to understand the limitations of an assay,” he says, or to understand what a biopsy can reveal that a test like the classifier can’t.

Given the fuzziness in the field, pathologists will likely face continued questions from colleagues who are Envisia-curious.

Though Dr. Groshong purposely didn’t delve into those conversations at National Jewish Health, he’s well acquainted with the ongoing questions colleagues have about the test. One regular question: How does it work?

“They want to know what genes they’re looking at,” he says. The question is nearly impossible to answer. This is a black box algorithm, he says, not a hand-created algorithm looking at individual gene transcription.

For clinicians, Dr. Groshong says, “That can be a hurdle—they’re not always comfortable trusting something that’s not explainable: What genes are they looking at? Why is this working?”

He’s less bothered by the black box approach, citing his experience in using programming, machine learning, and AI to work with images. “I’m comfortable with this notion that things can work even though you can’t necessarily interrogate network weights and figure out how they’re working.”

Others will need to become more comfortable with this shift in strategy, too, he suggests. “We’re going to be seeing more of these tests, and not just in ILD, because machine learning is so good at making predictions off of data sets,” including those that “you can’t even imagine contain” the desired data. On large networks, by the time they’re trained, it’s no longer possible to look through the millions of network weights and figure out what corresponds to what, he says, or how it’s being calculated.

Again, this doesn’t particularly disturb Dr. Groshong. “In the end, all you can do is give it thousands of samples and show that its accuracy meets a certain expectation.” That should have a familiar ring to it. “In reality, that’s what we do in medicine all the time. That’s how we validate lab tests. Even in pathology itself, our opinions are kind of subjective.”

Some pulmonologists also ask whether the classifier can replace the wedge biopsy, Dr. Groshong reports.

“The answer is no. The wedge biopsy is always better if you can do it, but this is a good option if you can’t for some reason.” These questions are more likely to come from practitioners who have less experience with ILD and who think the classifier will “save” their patient from a biopsy.

“That’s the wrong way to think about it,” Dr. Groshong posits, since the biopsy will provide a better answer. The classifier is a second-line diagnostic tool, he suggests, if the patient is too old or frail to tolerate a wedge biopsy. “That’s probably one of the most common misunderstandings in the community,” he says. “A lot of pulmonologists like this idea that it’s a fairly noninvasive way of getting an answer. But it doesn’t give you the same quality answer as does the large biopsy.”

“Some people view the test as a magic bullet and then want to use it on everything,” Dr. Groshong adds. “That would be a mistake. I wouldn’t not do a wedge biopsy on a patient just because I can do the Envisia instead.”

Dr. Larsen too calls for pathologists to educate clinicians about the role of the test. Discovering a UIP pattern is, obviously, helpful. But he considers the lack of context to be the test’s major Achilles’ heel. “It’s fine if clinicians understand that limitation, and they might still find it useful. But the devil is in the details, as always,” says Dr. Larsen. “And clinicians look to us to provide that information.”

When they don’t, trouble ensues. Like Dr. Borczuk, Dr. Larsen compares the situation to liquid biopsies that bypass pathologic assessment. “We know that pathologists aren’t perfect, and there’s always this desire to have the more perfect test that’s not biased by human opinion. We think that a result is going to be more precise if it’s generated from an instrument that eliminates the human from the process.” If only biology were that reductive, he sighs.

Lung scarring and lung fibrosis are complex, he continues. “You can’t reduce it to a binary result.” In other words, it’s not like a pocket Constitution a politician waves about as a simple explanation for how government works. “In the long run, it doesn’t do patients any favors” to avoid the heavy lifting of making a detailed diagnosis, Dr. Larsen says.

In his view, “We’re at a point where we can finally start understanding some of these problems at a deeper level, with the evolution in genomics and molecular testing. We can finally solve some of the mysteries that exist, that answer some of the remaining questions about these diseases in ways we never could have a couple decades ago.”

In that sense, Dr. Larsen continues, the step forward represented by a test like Envisia is terrible timing. “We’re our own worst enemies, because now we are eliminating humans from the process of evaluating these really complex disorders. We can make judgments that the genomic test can’t, and then put it into the larger context.”

Dr. Groshong identifies another bit of ironic timing. Two relatively new drugs, nintedanib and pirfenidone, can be used to treat progressive fibrotic diseases with a UIP pattern—knowing the underlying etiology may not be as critical as once thought, he says.

Indeed, he continues, before those drugs became available there may have been less pressure to make a diagnosis of UIP. “But now there’s this treatment dividing line: If it’s idiopathic UIP, then I’ve got two drugs; if it’s not UIP, I might be stuck.”

Dr. Myers agrees. When transplant was the only viable option, an accurate diagnosis mattered less. Even now, he says, the available drugs aren’t a cure, and they come with terrible side effects. “But at least it’s one of the first times this conversation felt important, because there are differences in what you can do for patients.”

And surgical biopsies have their own problems. Dr. Borczuk adds his own historical context, noting that as biopsies have been performed less often, and as tissue samples have become smaller—part of the ongoing story of trying to do more with less—some problems have become a self-fulfilling prophecy. “Not to say that we should biopsy patients simply to train pathologists, but this is one of the consequences of getting smaller tissue samples.” And when the cases are more challenging, “There are fewer and fewer people who have that experience to properly analyze it.”

Dr. Larsen identifies other problems with surgical lung biopsies. Once the tissue lands in the hands of the pathologist, he says, “One of the challenges we continue to grapple with in lung pathology is inconsistency and lack of criteria that would enable people to use a more confident, informed diagnosis.” Terminology is some 50 years old, he says, and despite some evolution, “We continue to use rather crude diagnoses for highly complex problems.”

Little wonder, he continues, that clinicians are eager for new tests, including those that, in practice, bypass pathology. A molecular classifier that appears to be an adequate surrogate is doubtless appealing. “It is a reflection of the struggles we continue to have in our field—to leverage these biopsies to be more clinically valuable, to learn more from them and provide more sophisticated diagnostic opinions.

“And we’re just not very good at that,” he says. Inter- and intraobserver variability is high even among experts, he says. “We don’t agree with ourselves on where those thresholds should lie, and we don’t have very good data from studies to refine the diagnostic criteria we use.”

There’s no shortage of opinions about Envisia, and it’s likely the tale will continue to be told for some time, in many voices, medicine’s version of a Viking saga.

“This is a glimpse of the future. Absolutely,” says Dr. Myers. “As machine learning and artificial intelligence become more and more powerful, I think this is the first of multiple biomarkers to come that might eventually allow them to make these diagnoses without any sort of biopsy. This is going to become more common.”

But, Dr. Myers continues, the real need is for better strategies on the therapeutic side, not the diagnostic side. As with many diseases, he says, “The issue is not so much precision as it is having an empty toolbox when it comes to knowing what to do about them.”

Dr. Groshong predicts this is only the first attempt to produce a test in this area. “We haven’t really had an ILD-specific lab test ever.” Because Envisia has stepped somewhat successfully into this space, there will likely be others.

Because the two drugs are the first ever approved for IPF, he adds, and because they’re so expensive, interest in the Envisia test was almost a fait accompli. Most payers would like to see a companion diagnostic for six-figure-type drugs, or at least some sort of diagnostic test that will suggest a patient is more likely to have the diagnosis in question.

He’d like to see more expansive classifiers, ones that could also identify, say, nonspecific interstitial pneumonia and hypersensitivity pneumonitis. A non-UIP diagnosis leaves physicians plenty of room for head-scratching, so “having classifiers that could flesh out that level would be helpful,” he says, though he recognizes it would be hard to accrue enough patients for such studies.

“And then the ideal would be to get away from transbronch altogether,” Dr. Groshong muses, eyeing the possibility of blood or sputum samples.

Dr. Lakticova holds out hope for a method that would allow genomic testing on material obtained by brushing the bronchial wall, for example. “Maybe we can progress eventually to a nasal swab. Maybe we are slowly transitioning from histology to genomics.”

Dr. Borczuk would like pulmonology and radiology algorithms to more clearly incorporate when pathology is beneficial and to elucidate the expectations for particular biopsies. When biopsies are used in the wrong setting, “Of course clinicians get frustrated.”

Dr. Larsen suggests the Envisia test represents a shortcut in a field that doesn’t need one, though he completely understands the appeal of a quick fix, a sexy, binary test that everyone wants to use. “We all learn the hard way. And then our enthusiasm is tempered.”

Long term, he’d like to see not only improved diagnostic criteria but also companion diagnostic markers. “The field is desperate for some kind of meaningful biomarkers that can predict response to therapy,” including steroids and antifibrotic therapy. “We make a number of assumptions, but we don’t have data yet.” In his view, tissue-based tools will be key.

How hopeful is he this will happen? Dr. Larsen reports that on the pulmonary pathologist society level, there’s been growing interest in developing consensus criteria, which in turn could be used as a basis for better studies. At the very least, he says, “We have to try to move in the right direction.” Lung fibrosis pathology can learn much from advances in other fields, he adds, including neoplasia classification and workup.

He is open to the idea that the Envisia test—even as it irks him—might represent a step toward thinking differently about these long-standing problems. “Everyone agrees we need better tools, and I think the Envisia classifier is an inevitable result of our molecular revolution and a consequence of a lack of progress in pulmonary pathology for many decades.” He adds: “Maybe it’s useful in the sense that it’s making us think of different ways of arriving at meaningful information.” If there is indeed genomic information that can point to underlying disease processes or likely response to therapy, “That would be hugely transformative,” Dr. Larsen says.

Karen Titus is CAP TODAY contributing editor and co-managing editor.