These studies were easier to conduct with an already validated test, he says. “All we were doing was modifying that previously validated test to incorporate an AI component. If you were doing this from scratch, you would have to follow all the steps for validating a lab-developed test,” as well as determine whether it meets the definition of software as a medical device. “We have to keep in mind that most flow labs already use research-use-only software from third-party vendors, and some of those companies already have AI-enabled solutions, like clustering algorithms,” he says. “We’ve approached this pipeline in a similar fashion, within the scope of CLIA and a laboratory-developed test.”
Will a human-in-the-loop workflow always be key?
“The straightforward answer is, it’s easier to validate and deploy a human-in-the-loop workflow right now,” Dr. Seheult says. The human-assisted workflow substantially improved the performance of the initial DNN-enhanced workflow for detecting MRD in CLL. “The human is easily able to clean up false [positive] classifications by the AI model,” he says.
However, the required level of human oversight may change over time if appropriate guardrails are implemented to mitigate risks. “I think in the future we will have workflows that are AI-only, or hybrid, where certain cases are reviewed only by AI and certain cases get the human-in-the-loop workflow,” he says. Although it helps build trust initially, a human-assisted workflow isn’t foolproof. Different people will trust the AI model to different extents, for one thing, and that trust can change over time. “If the model is usually right, humans will be more likely to trust the model in the future,” he says. “And that can be a bad thing because you can start missing drift or shift in model performance over time.”
Variance from person to person should be tracked, he says. “How long are the technologists in our lab taking to analyze cases on average, and how are we seeing that variance change over time?” It’s also important to monitor individual use over time. “At the beginning, when we first deployed the model, were they spending more time verifying the AI inferences, and as time progresses . . . are they now taking less time to verify the AI inferences?”
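The kind of monitoring Dr. Seheult describes could be sketched in a few lines. This is an illustrative example only, not the lab's actual tooling: it assumes hypothetical per-case verification times grouped by review period, and flags a steady decline in average verification time as a possible sign of over-trust in the model.

```python
from statistics import mean

def verification_time_trend(times_by_period):
    """Compare average AI-verification time across review periods.

    times_by_period: list of lists; each inner list holds the minutes
    a technologist spent verifying AI inferences per case in one
    period (e.g., one month). Returns the per-period means so that a
    sustained decline can be flagged for review.
    """
    return [round(mean(period), 2) for period in times_by_period]

# Hypothetical data: one technologist's verification minutes per case
periods = [
    [12.0, 11.5, 13.0],  # shortly after deployment
    [9.0, 8.5, 10.0],    # three months in
    [5.0, 4.5, 6.0],     # six months in: far less scrutiny
]
means = verification_time_trend(periods)
# A monotonic drop in the means may signal growing over-reliance
declining = all(a > b for a, b in zip(means, means[1:]))
```

In practice a lab would also track the spread across technologists, since, as he notes, person-to-person variance matters as much as any one individual's trend.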
None of this should discount the importance of the human-assisted approach, especially for a laboratory new to AI. “It builds trust. It builds confidence by the laboratory and end users,” Dr. Seheult says.
Dr. Jeck, for his part, can see a future in which AI acts as an added safety layer for when the human eye is insufficient. He’s in the process of implementing an algorithm at Duke for intestinal metaplasia detection in gastric and esophageal biopsies, to be used for quality assurance purposes. “In the context of QA, we are able to find cases where AI finds things a human missed, and maybe even multiple humans missed. And yet, when we go back and look at it,” the finding is objective. “Ten people can agree what it is, once they’re looking at the right area. But we just didn’t see it.”
In a field with many lingering questions, the matter of reimbursement is a standout.
“Reimbursement is not obvious right now,” Dr. Jeck says. He’s aware of the existing CPT codes for AI. “But they are nonbillable codes. They’re just informational.”
Without an obvious path forward for reimbursement, “as we’re planning out these algorithms, we need them to provide value to patients but remain sustainable for the department and the hospital, independent of the ability to bill for them right now. That’s the mental model we have to hew to,” he says.
Implementing an algorithm in clinical practice requires labor and infrastructure, start-up costs that must be taken on in faith by the institution or an external partner. And eventually the system must pay for itself, he says.
One possibility is that institutions market and sell their homegrown algorithms, either to an AI vendor or through their own formal business ventures.
“That process feeds into an ability to have more resources, to bring on more algorithms,” he says. It doesn’t answer the question of how to cover front-end costs, however.
“That pump has to be primed at some point.”
Charna Albert is CAP TODAY senior editor.