As AI use expands, ethics at the leading edge

Anne Paxton

February 2024—Artificial intelligence is sizzling, so much so that New Yorker magazine, evoking the dazzling and the potentially devouring nature of AI technology, tagged 2023 as “The Year A.I. Ate the Internet.” One respondent to a CAP survey of its members on AI called these “exciting but uncertain times.”

Commercially available digital pathology platforms already use AI software, and large language models such as ChatGPT, Bard, and Copilot play a growing role in extending AI beyond direct patient care to training and research.

“But there is a degree of angst and concern among pathologists overall about the utilization of AI, particularly as we move forward into education. And there’s even less knowledge about it in research,” says Suzanne Zein-Eldin Powell, MD, professor of pathology and genomic medicine and director of the anatomic and clinical pathology residency and neuropathology fellowship at Houston Methodist Hospital. She is chair of the CAP’s Ethics and Professionalism Committee.

“It’s unclear to what extent we want to encourage our trainees to use AI now,” says Neil Anderson, MD, D(ABMM), director of the anatomic and clinical pathology residency program at Washington University School of Medicine in St. Louis and a member of the CAP Ethics and Professionalism Committee.

“Some people are very much against it and don’t feel comfortable at all,” he continues, “whereas others see AI tools for their potential and want us to be using them in a controlled manner.”

Fears that AI could become difficult or impossible to control are justified only if humans do not impose controls when AI systems are built, says Brian R. Jackson, MD, medical director for business development at ARUP Laboratories and a member of the CAP Artificial Intelligence Committee.

“As soon as we start talking about AI as being too big to handle or too risky, then what we are saying is that we need to put more effort into the control side of it and that maybe we’ve gone too far on the autonomy side,” he says. “But if pathologists are in the driver’s seat and AI is developed in ways that empower and let pathologists do a better job with their existing work, then I think we’ll be on a good trajectory.”

Drs. Jackson, Anderson, and Powell led a CAP23 course on AI and ethics and spoke with CAP TODAY recently.

The CAP in August 2023 sent an online survey to a random sample of CAP fellows with five to 30 years in practice, all House of Delegates members, and all members of the CAP Engaged Leadership Network (graduates of the CAP’s Engaged Leadership Academy course).

The aim was to assess familiarity with and use of AI diagnostic tools, AI policies and guidelines in place within laboratories and hospitals/organizations, and AI in training, and to capture respondents’ concerns about and views of AI.

The CAP had 152 responses to its survey (2,043 distributed, 7.4 percent response rate), which revealed the following: Twenty percent of academic and nonacademic hospital-based respondents said AI diagnostic tools are being used in their practices or laboratories. Fifteen percent reported they have validated at least some AI diagnostic tools, with academic hospital-based respondents more likely to have validated AI tools (19 percent) than nonacademic hospital-based respondents (13 percent).

Forty-six percent reported their hospital/organization has a policy or process to identify when informed consent is necessary, but 43 percent were unsure whether guidelines or policies are in place to govern or guide the use of AI. Fifty-nine percent were unsure whether their hospital/organization has a policy for data sharing with commercial AI developers.

Many reported they were unsure whether AI was used in their training programs, though 16 percent reported their trainees are using AI tools for presentations and clinical reports, among other things. Sixteen percent indicated their training program offers education on the appropriate use of AI tools. Very few reported that their hospital/organization provides guidance on how to cite the use of AI in manuscript/data preparation. Few reported there is a policy to govern/guide the use of AI in their training program.

The newness of AI in medicine explains some of the survey respondents’ uncertainty about policies, says Dr. Anderson, who is associate professor in the Washington University Department of Pathology and Immunology. “One of the important things that came out is that a lot of people answered ‘I don’t know’ to a lot of questions. That says that many people are learning about things like ChatGPT and AI from the media and their friends rather than talking with their colleagues about how it can be used. Everyone should know whether or not they have a policy regarding AI, and if they don’t they might consider drafting one.” Dr. Anderson says guidance, recommendations, and protocols are needed for validating and verifying AI tools, “from informed experts who understand both the laboratory medicine side and the technology.”

AI has great promise as a check on plagiarism, he says, whether by trainees or researchers. “Maybe it’s okay for AI to generate the outline and you fill in the pieces. One could argue that you’re still ultimately responsible for what comes out of the AI models. So it’s kind of like using Google or Wikipedia to inform your presentation.” On the other hand, he adds, “It’s not as clear to the average layperson what is going into these models. And if you’re using them to generate patient notes or presentations you will use to teach others, is that material even going to be correct? That comes back to someone needing to vet it. If we bypass that part of our training of our residents, we could have a real issue.”

That points to the need for more standardized validation protocols for AI tools. In non-AI–based routine testing, “we have a very specific playbook we have to follow when validating a new test, and I don’t necessarily know how well developed that is for AI tools at this point because they’re all new and cutting edge,” he says.
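As a hedged illustration of what such a playbook might include, the sketch below compares a hypothetical AI tool’s calls against pathologist reference diagnoses and reports sensitivity and specificity. The case labels, categories, and numbers are invented for illustration and are not drawn from any actual validation protocol.

```python
# Illustrative only: a minimal concordance check of the kind a laboratory's
# validation playbook might require before an AI tool goes live.
# The case data below are hypothetical placeholders, not real results.

from collections import Counter

# (pathologist reference diagnosis, AI tool output) for a hypothetical validation set
cases = [
    ("malignant", "malignant"),
    ("malignant", "benign"),
    ("benign", "benign"),
    ("benign", "benign"),
    ("malignant", "malignant"),
    ("benign", "malignant"),
]

counts = Counter()
for truth, predicted in cases:
    if truth == "malignant" and predicted == "malignant":
        counts["tp"] += 1      # true positive
    elif truth == "malignant":
        counts["fn"] += 1      # missed malignancy
    elif predicted == "malignant":
        counts["fp"] += 1      # false alarm
    else:
        counts["tn"] += 1      # true negative

sensitivity = counts["tp"] / (counts["tp"] + counts["fn"])
specificity = counts["tn"] / (counts["tn"] + counts["fp"])
print(f"Sensitivity: {sensitivity:.2f}, Specificity: {specificity:.2f}")
```

Whatever acceptance thresholds a laboratory sets, the point is the same one Dr. Anderson makes about conventional tests: the comparison against an expert reference has to happen before the tool is trusted.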

If the data going into the model are not completely understood, we may not know whether the model is working appropriately, Dr. Anderson says. “And if we’re using a test to make these higher-order decisions, you want to have a handle on that. For instance, if a model is based off of many different laboratory values that feed into it, what happens when you change one of those tests and don’t consider its impact on the model? With QA and QC in the laboratory, you’re making sure a test still works. You need to have those same checks and balances with AI.”
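One way to picture that QA/QC analogy is a simple drift check on a single laboratory input that feeds a model. The sketch below uses hypothetical analyte values and an arbitrary two-SD threshold to flag the model for re-verification when an upstream test’s results shift away from the range the model was validated on.

```python
# Illustrative only: a simple drift check on one laboratory input that feeds a model,
# analogous to routine QC. Names, values, and the threshold are hypothetical.

from statistics import mean, stdev

# Values of a hypothetical input analyte observed during model validation
baseline = [4.1, 4.3, 3.9, 4.2, 4.0, 4.4, 4.1, 4.2]

# Recent values after, say, an upstream assay or instrument change
recent = [5.0, 5.2, 4.9, 5.1, 5.3, 5.0, 4.8, 5.2]

baseline_mean, baseline_sd = mean(baseline), stdev(baseline)
shift_in_sd = abs(mean(recent) - baseline_mean) / baseline_sd

# Flag the model for re-verification if an input has shifted well outside
# the range it was validated on (the two-SD cutoff is chosen arbitrarily here).
if shift_in_sd > 2:
    print(f"Input shifted {shift_in_sd:.1f} SD from validation baseline: "
          "re-verify the model before trusting its output.")
```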

He cautions too about the need to prevent bias, saying that certain tools, depending on how they’re built, might be susceptible. “If the tool is constantly normalizing your data and always giving you answers based on what is most often correct, you’re creating a bias. And when you get something that doesn’t necessarily fit your model, then it might be inaccurate. If I tried to design some sort of AI model based on one patient subtype, it may or may not work in patients from a different demographic or a different geography.”
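The subgroup concern he raises can be checked directly during validation. The sketch below, using invented labels and groups, computes a model’s accuracy separately for two hypothetical patient subgroups so that a performance gap of the kind Dr. Anderson describes surfaces before the tool is deployed.

```python
# Illustrative only: checking whether a model performs comparably across patient
# subgroups. All data below are hypothetical.

from collections import defaultdict

# (patient subgroup, reference label, model prediction) for a hypothetical test set
results = [
    ("group_a", "positive", "positive"),
    ("group_a", "negative", "negative"),
    ("group_a", "positive", "positive"),
    ("group_a", "negative", "negative"),
    ("group_b", "positive", "negative"),
    ("group_b", "negative", "negative"),
    ("group_b", "positive", "negative"),
    ("group_b", "negative", "positive"),
]

correct = defaultdict(int)
total = defaultdict(int)
for group, truth, predicted in results:
    total[group] += 1
    correct[group] += (truth == predicted)

for group in sorted(total):
    accuracy = correct[group] / total[group]
    print(f"{group}: accuracy {accuracy:.2f} over {total[group]} cases")

# A large gap between subgroups is a signal the model may not generalize
# beyond the population it was developed on.
```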

Residents and fellows need to have a basic understanding of AI tools that are built on laboratory output data, Dr. Anderson says.

Regulatory oversight is also needed, by people who understand not only the information science but also the laboratory medicine and clinical science behind these tools. “Where we run into trouble is when we have people who may have only one type of expertise and not the other types vetting these tools and seeing whether they’re ready for prime time or not,” Dr. Anderson says. He suspects that the technology at this time is outpacing the regulation. “That’s not altogether surprising, and the regulation will catch up, but that’s where we are right now.”

The Food and Drug Administration has a role to play in regulating AI for use in medical care, Dr. Jackson notes—at least in premarket evaluation.

FDA mechanisms for postmarket surveillance and enforcement are weak, he says, and he hopes the agency is developing ways to do a better job overall of evaluating AI that’s embedded in medical devices.

Wu, et al., analyzed 130 of the medical AI devices approved by the FDA between January 2015 and December 2020, using summary documents of each approved device (Wu E, et al. Nat Med. 2021;27[4]:582–584). Almost all of the AI devices (126) underwent only retrospective studies at their submission, based on the FDA summaries. None of the 54 high-risk devices were evaluated by prospective studies. Of the 130 devices, 93 did not have publicly reported multisite assessment included as part of the evaluation study. Only 17 device studies reported that demographic subgroup performance was considered in their evaluations.

Dr. Jackson calls this “not reassuring” in terms of the FDA validating AI for pathology.
