Put another way, the training data has to match the implementation scenario, a lesson Dr. Jeck and his colleagues learned when they attempted to implement an algorithm that detects glomeruli on kidney donor frozen sections (Li X, et al. J Med Imaging. 2021;8[6]:067501). The algorithm, developed by Laura Barisoni, MD, codirector of the AI division, should have worked well, Dr. Jeck says.
“The training data she used was from one Aperio scanner, and our frozen section lab uses a different Aperio scanner. I thought that in the setting of frozen section, with its incredible number of artifacts and diverse appearances, that would overwhelm the importance of the particularity of the scanner. It turns out that was completely wrong. The scanner really matters, and the performance of the algorithm took a big hit when we implemented it on a different scanner,” he says.
“These sorts of preanalytical changes—the generation of the slide, the generation of the stain, the generation of the image itself—the AI can be surprisingly sensitive to those things,” he adds.
Pinpointing the preanalytical variables that might affect an AI model is a skill that comes with time, experience, and experimentation, Dr. Seheult says. “The best thing you can do is try to break the model any way you can.” In Mayo Clinic’s cell kinetics laboratory, for instance, an AI-enhanced pipeline trained to identify clonal chronic lymphocytic leukemia (CLL) cells performs poorly when presented with other tumors, such as mantle cell lymphoma. It didn’t take an AI scientist or a machine learning engineer to zero in on the problem, he notes. “It was our lab director who identified these issues.”
Neglecting to consider validation until after development can cause problems, too, Dr. Jeck notes.
“You could imagine getting deep in the weeds and getting in quite a lot of trouble, realizing you’ve designed something that has no way of being properly validated,” he says. “There needs to be some degree of explainability or some kind of clear output that has a gold standard you can compare to.”
Until now, institutions have implemented algorithms in small numbers.
“We haven’t seen the full rollout of a large number of these assays,” Dr. Jeck says. “I think there’s an avalanche of new tests being brought online now by individual centers, and a lot of that has yet to be fully reported on. We’re going to learn a lot in the coming years about how that went, what the pitfalls may be, what the major benefits are.”
At Duke, pathology faculty have worked on AI development projects and produced more than 50 academic papers, but clinical implementation is still a few steps away. “We’re at a crossroads as to how we’re going to make this a reality, in terms of clinical implementation,” Dr. Glass says. “It takes time and resources. It’s either done through personal grant funding—and that requires time for faculty members who are already stretched too thin—or it’s going to require industry partnerships.” The interest is there, she says. “We don’t need to convince anyone of the scientific benefit. We just need to talk to the right people.”

Dr. Glass and the cardiac transplant team, in collaboration with the California Institute of Technology, are working now to develop a clinical-grade algorithm that helps detect antibody-mediated and acute cellular rejection in cardiac transplant endomyocardial biopsies (Glass M, et al. Mod Pathol. 2020;33[suppl 2]:301; Glass M, et al. Cardiovasc Pathol. 2024;72:107646). The algorithm would serve as an adjunct diagnostic tool for challenging weekend stat digitized cases, rather than as a means of primary diagnosis. They plan to submit a new paper that will focus on AI tools that enable simple, adoptable usage of the adjunct algorithm, Dr. Glass says, and they’re partnering with researchers at other institutions to develop a user-friendly interface for the algorithm.
The algorithm’s benefit will depend on the end user’s level of training, she says. For the general pathologist, it can be difficult to recognize myocyte damage, the main histologic feature of acute cellular heart transplant rejection. “Sometimes that can be difficult to tell and there might be intrinsic variability,” she says. Cardiac pathologists aren’t likely to need the diagnostic aid, but theirs is a highly specialized field with few expert practitioners: an estimated 200 to 300 have been documented worldwide, though the exact number is unknown and not well tracked. Even for them, a tool that improves efficiency with digitized cases would be helpful.
“It will help the cardiac pathologist if they’re looking at it digitally because it’s going to be more user-friendly,” she says. “It’s going to be an adjunct level for us to look at it and say, ‘That’s where I need to focus my attention.’”
With the AI tool indicating where to focus, Dr. Jeck says, “it’s going to save you from zooming in and dragging through the tissue at high power. Maybe that makes up for the fact that you have to go through what feels like a more clunky or uncomfortable experience. It’s a way to draw people into digital,” he says. “There are people who have been reticent to go digital for whom that might provide the necessary boost to get over the fence.”