
Newsbytes


Editors: Raymond D. Aller, MD, & Dennis Winsten

How labs can make the most of ChatGPT and other LLMs

February 2024—The key to using ChatGPT and other large language models effectively in pathology is understanding not only what they are designed to do but, just as importantly, what they are not designed to do, says Eric Glassy, MD, medical director at Affiliated Pathologists Medical Group, Rancho Dominguez, Calif., and past chair of the CAP Information Technology Leadership Committee.

The models should not be thought of as databases that retrieve facts but as predictors whose generated answers are often right and sometimes wrong, says Dr. Glassy, who presented on ChatGPT and other large language models at CAP23. LLMs are designed to identify answers with the highest probability of satisfying users, he explains. “That’s an answer you would like to have but not necessarily the true answer.”

Incorrect predictions, or hallucinations, generated by LLMs can pose numerous risks to the practice of pathology, Dr. Glassy says. These risks can be linked to models failing to recognize the difference between public and private information, demonstrating racial or ethnic bias, and providing medical information that could potentially be harmful.

While it’s important to verify answers generated by LLMs, pathologist users of the technology can take steps to minimize hallucinations and steer models toward providing more accurate and appropriate answers using offerings such as the following.

Dr. Glassy


GPT-4 versus GPT-3.5. GPT-4, the successor to OpenAI’s GPT-3.5, provides more accurate and coherent answers than its predecessor, but it is also more expensive. Yet GPT-4 may be worth the subscription cost because incorrect answers could lead to improper medical treatment. For example, when GPT-3.5 was asked how to treat a pregnant woman who had contracted Lyme disease, it suggested tetracycline, which is effective at treating the disease but can cause a range of developmental abnormalities in a fetus, Dr. Glassy says. GPT-4, on the other hand, correctly identified amoxicillin as the treatment that would effectively and safely treat the disease in a pregnant woman.

GPT-4 Turbo, the latest version of the software, is available for $20 per month. It allows users to expand prompts to approximately 300 pages of text, generate images from a text prompt using DALL-E technology, accept images as inputs, and perform text-to-speech conversion, among other tasks. Those who do not want to pay the subscription fee may want to check out Microsoft’s Bing Chat in creative mode, adds Dr. Glassy. The latter uses GPT-4 and is available at no cost. The Google Bard and Microsoft Copilot artificial intelligence chatbots are also available at no charge.

Prompts. How a query is written can make a big difference in how LLMs respond. The results are noticeable enough that Boston Children’s Hospital hired an artificial intelligence prompt engineer to help physicians and other hospital employees query LLMs more effectively, Dr. Glassy says.

Some prompts improve the accuracy of LLM responses by changing how the model approaches a query. Telling ChatGPT (the application powered by GPT AI models) to provide a step-by-step answer, for example, or asking it to pose three clarifying questions before answering can guide it toward a more sequence-based approach to processing a query, which tends to reduce hallucinations, Dr. Glassy says. Even asking an LLM to “slow down and take a deep breath” before responding has been shown to result in more deliberate and accurate answers, he adds.

Asking ChatGPT to provide a confidence score for an answer, with zero being not confident and 100 being very confident, can have a similar effect, Dr. Glassy says. However, users should verify all responses, even those with high confidence scores, through other sources.

Other prompt techniques involve narrowing the focus of the question to elicit more specific information. Asking for heart disease symptoms, for example, would generate a wealth of information, but requesting the top five symptoms of heart disease according to the latest medical guidelines would generate a more targeted response, he says.

Instructing ChatGPT to provide an answer from a particular perspective can also guide it toward providing specific, tailored responses. “You can say that you are a pathologist ‘who is an expert in soft-tissue tumors and molecular pathology so walk me through the differential diagnosis between liposarcoma and synovial sarcoma,’” Dr. Glassy says.
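The techniques above can be combined in a single prompt. The sketch below, which is illustrative only and not from the article, assembles a role, a step-by-step instruction, a request for clarifying questions, and a confidence-score request into one query string; the function and variable names are assumptions for illustration, not part of any real API.

```python
# Minimal sketch: combine several of the hallucination-reducing prompt
# techniques described above into one prompt string. Names are illustrative.

def build_prompt(role: str, question: str) -> str:
    """Assemble a prompt that assigns a role, asks for step-by-step
    reasoning and clarifying questions, and requests a confidence score."""
    return (
        f"You are {role}.\n"
        f"Question: {question}\n"
        "Answer step by step. Before answering, ask me up to three "
        "clarifying questions if any case details are ambiguous.\n"
        "End with a confidence score from 0 (not confident) to "
        "100 (very confident)."
    )

prompt = build_prompt(
    role="a pathologist who is an expert in soft-tissue tumors "
         "and molecular pathology",
    question="Walk me through the differential diagnosis between "
             "liposarcoma and synovial sarcoma.",
)
print(prompt)
```

The assembled string would then be pasted into ChatGPT (or sent through an API); as noted above, even high-confidence answers should still be verified against other sources.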

Plug-ins. Plug-ins, specialized programs accessed over the Internet that work in conjunction with ChatGPT, add functionality that increases the model’s effectiveness, Dr. Glassy says. The Wolfram plug-in, from Wolfram Research, for example, performs complex mathematical computations and can leverage Wolfram’s subject-specific databases related to science, technology, and other fields to help ChatGPT arrive at more in-depth answers. Other plug-ins, such as Show Me Diagrams, can make it easier to create a wide variety of diagrams for visualizing complex information through ChatGPT.

A subscription to ChatGPT Plus provides access to hundreds of plug-ins via OpenAI’s plug-in store.

GPTs and other customized chatbots. GPTs, a feature of the ChatGPT Plus service, are sets of customized instructions created by ChatGPT users that function as small applications for performing specific tasks. GPTs were developed because users were maintaining long sets of carefully crafted prompts that they would manually input into ChatGPT every time they used it, according to OpenAI.

Dr. Glassy tested the process of creating a GPT by uploading a CAP synoptic report to ChatGPT and instructing the model to use the report as a template. The process took about 15 minutes and allowed him to put information into standard synoptic report format quickly and easily.

Users can publish their GPTs on OpenAI’s site or they can choose to keep them private. OpenAI launched its GPT Store, “which is similar to the app store for Apple and Google devices,” last month, says Dr. Glassy. “There are now hundreds of free GPTs [available to ChatGPT Plus subscribers], some of which are applicable to medicine and pathology.”

Pathologists can also access multiple proprietary and open-source technologies via the Web to build chatbots that exclusively access information from their institutions rather than a plethora of information from the Internet. Pathologists with a curated collection of 100 hematology papers, for example, could create a chatbot to answer questions based only on information from that collection of papers, Dr. Glassy says.
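The curated-collection idea can be sketched as a simple retrieval step: find the local documents most relevant to a question, then build a prompt instructing the model to answer only from those excerpts. Production systems typically use embeddings and a vector store; the toy word-overlap scorer below merely illustrates the flow, and all names in it are assumptions for illustration.

```python
# Toy sketch of a chatbot restricted to a curated local collection:
# rank documents by crude word overlap, then ground the prompt in the
# top matches. Real systems would use embeddings instead of this scorer.

def score(query: str, doc: str) -> int:
    """Count distinct query words that appear in the document."""
    doc_words = set(doc.lower().split())
    return sum(w in doc_words for w in set(query.lower().split()))

def retrieve(query: str, corpus: dict, k: int = 2) -> list:
    """Return titles of the k best-matching documents."""
    ranked = sorted(corpus, key=lambda t: score(query, corpus[t]), reverse=True)
    return ranked[:k]

def grounded_prompt(query: str, corpus: dict) -> str:
    """Build a prompt that restricts the model to retrieved excerpts."""
    context = "\n\n".join(f"[{t}]\n{corpus[t]}" for t in retrieve(query, corpus))
    return (
        "Answer using ONLY the excerpts below; if they do not contain "
        f"the answer, say so.\n\n{context}\n\nQuestion: {query}"
    )

papers = {
    "Flow cytometry in AML": "Immunophenotyping by flow cytometry aids AML workup.",
    "Iron studies review": "Serum ferritin and transferrin saturation findings.",
}
print(grounded_prompt("How does flow cytometry help in AML workup?", papers))
```

The "answer only from the excerpts" instruction is what keeps the chatbot tied to the institution's own material rather than the open Internet.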

CAP TODAY