Newsbytes

CAP working group targets machine-learning issues

July 2022—If a machine-learning algorithm is trained to help detect cancer in whole slide images at one health care location, shouldn’t the same algorithm work on digital slides from a similar patient population at another site?

“We expect the answer to be yes, but there is a lot of evidence that the answer is no,” says Michelle Stram, MD, clinical assistant professor, Department of Forensic Medicine, New York University.

Generalizability in machine learning, the ability of a model to make accurate predictions on data sources not included in its training set, can have important implications in pathology. This is especially true in whole slide imaging because multiple factors can negatively affect the performance of a machine-learning algorithm, says Dr. Stram. Understanding generalizability and the variables in pathology that can affect it is the focus of a project undertaken by the CAP Machine Learning Working Group, of which Dr. Stram is a member.
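To make the idea concrete, the following is a minimal sketch, in Python, of how a generalization gap might be measured: a classifier is trained on data from one site and evaluated on data from another. The features and labels are simulated stand-ins, with site B’s features carrying extra distortion to mimic scanner and stain variation; none of this reflects the working group’s actual data or methods.

```python
# A minimal sketch of measuring a generalization gap across sites.
# All data here are simulated; real work would use per-slide features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
w = rng.normal(size=16)  # hypothetical "true" diagnostic signal

# Site A: slides from the training institution.
X_a = rng.normal(0.0, 1.0, (500, 16))
y_a = (X_a @ w + rng.normal(0.0, 1.0, 500) > 0).astype(int)

# Site B: same underlying biology, but the observed features carry extra
# site-specific distortion (a stand-in for scanner and stain variation).
X_b_clean = rng.normal(0.0, 1.0, (500, 16))
y_b = (X_b_clean @ w + rng.normal(0.0, 1.0, 500) > 0).astype(int)
X_b = X_b_clean + rng.normal(0.0, 1.5, (500, 16))

model = LogisticRegression(max_iter=1000).fit(X_a, y_a)
auc_a = roc_auc_score(y_a, model.predict_proba(X_a)[:, 1])
auc_b = roc_auc_score(y_b, model.predict_proba(X_b)[:, 1])
# A large drop from internal to external AUC signals poor generalizability.
print(f"internal AUC: {auc_a:.2f}, external AUC: {auc_b:.2f}")
```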

“Pathologists are able to recognize cancer in one slide versus another slide, and if the preparation looks a little different, we overlook that because it’s not important to the diagnosis,” she says. But with whole slide imaging, the variables that could impact how an algorithm performs include not only differences in how a slide was prepared but also variability in the color profile, contrast, and brightness that could result from using different scanners.


“One of the problems is that even when you are a large hospital network, you still have a limited number of sites and labs, and you don’t really get an idea of how much variability there is,” Dr. Stram says.

That’s where the CAP Machine Learning Working Group comes in. The group is using data the College collected from medical sites worldwide, through its various programs, to develop a broad perspective on the factors that can affect a machine-learning algorithm’s ability to generalize information. The working group has also reached out to the HistoQIP Whole Slide Image Quality Improvement Program (HQWSI), a joint undertaking of the CAP and the National Society for Histotechnology, to obtain data for assessing machine-learning generalizability. The data are generated from whole slide images from different scanners and slides prepared at a variety of histology labs, Dr. Stram says.

The HQWSI program provides feedback to laboratories that use whole slide imaging for clinical applications. More specifically, an expert panel of pathologists, histotechnicians, and histotechnologists identifies issues in digital whole slide images and corresponding glass slides that are submitted. These issues range from defects in histology preparation, such as knife marks or folds in the tissue, to image imperfections introduced in the scanning process, like blurry patches or incorrect color tones, Dr. Stram says. The program also collects background demographic data from laboratories that submit slides, including the type of hospital from which the specimen was obtained, types of pathology services offered, and even details about the stains used, such as whether H&E staining was automated or performed manually.

The relevance of this is not lost on Matthew Hanna, MD, director of digital pathology informatics at Memorial Sloan Kettering Cancer Center, who leads the CAP Machine Learning Working Group. Dr. Hanna and several of his colleagues at Memorial Sloan Kettering published a paper in Nature Medicine that addressed, as one of its subgoals, the generalizability of machine-learning algorithms in pathology (doi.org/10.1038/s41591-019-0508-1). The study found that a model trained on an uncurated data set of slides exhibiting interlaboratory variation outperformed a model trained on a curated data set; the latter did not generalize well to slides prepared at other institutions or showing such variation.

“That suggested that the scanner you are using and the histology preparation—even though they may look the same to us—may not look the same to the algorithm,” Dr. Stram says.

To investigate this further, the working group has procured and begun examining 568 slide images from the HQWSI program. The group has also received the complete list of questions that the HQWSI asks laboratories about their institutions and how their slides are prepared. (It is awaiting answers to these questions.)

During the pilot stage of its investigation, the working group plans to study digital prostate cancer slides, as it wants to focus on a smaller subset of images and they “would be the most interesting to the most people,” Dr. Stram says. She estimates that of the 568 slides the working group had obtained as of CAP TODAY press time, between 20 and 40 were prostate cancer biopsies.

The first step of the project, Dr. Stram explains, will involve quantifying and delineating the factors that could affect a prostate cancer machine-learning algorithm’s performance, including variability in histology preparation and differences in color profile or contrast that could result from the scanning process.

The group will be examining the prostate slides via a variety of methods, including the open-source software HistoQC, a quality control tool for digital pathology slides that can quantify areas of blurriness on slide images and report brightness, contrast, and other image characteristics. The HistoQC analyses should help identify issues and differences among slides that could affect machine-learning algorithm performance, Dr. Stram says.
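HistoQC itself supplies such measurements; purely as an illustration of the kinds of per-slide metrics involved, here is a hedged sketch that computes brightness, contrast, and a simple blur score from a slide thumbnail using the OpenSlide library. The file name is hypothetical, and this is not HistoQC’s own API.

```python
# Illustrative per-slide QC metrics (not HistoQC's API); assumes the
# openslide-python package and a hypothetical whole slide image file.
import numpy as np
from scipy.ndimage import laplace
import openslide

slide = openslide.OpenSlide("prostate_biopsy_001.svs")  # hypothetical file
thumb = np.asarray(slide.get_thumbnail((1024, 1024)).convert("L"), dtype=float)

brightness = thumb.mean()    # overall luminance of the thumbnail
contrast = thumb.std()       # spread of gray levels
blur = laplace(thumb).var()  # variance of the Laplacian; lower => blurrier

print(f"brightness={brightness:.1f} contrast={contrast:.1f} blur={blur:.1f}")
```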

The group plans to rescan the prostate biopsy slides using several different scanners to identify potential variances among whole slide images that are specific to the scanning process or scanner used.

“If you have a slide that is prepared at the hospital and you scan it on scanners A, B, and C, what is the variability for that exact same slide?” she says. “Areas of blurriness, contrast, brightness, the color profile, these are all things that can vary. In this case, the differences would be due to the scanner because we would have scanned the exact same slide.”
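A brief sketch, under the same illustrative assumptions, of how such same-slide, different-scanner variability might be quantified: compare per-channel color statistics of rescans of one physical slide. The file names and scanner labels here are hypothetical.

```python
# Hedged sketch: quantify scanner-to-scanner variability for one slide
# by comparing per-channel color statistics of its rescans.
import numpy as np
import openslide

# Hypothetical rescans of the same slide on scanners A, B, and C.
paths = {"A": "slide1_scannerA.svs",
         "B": "slide1_scannerB.svs",
         "C": "slide1_scannerC.svs"}

stats = {}
for name, path in paths.items():
    rgb = np.asarray(
        openslide.OpenSlide(path).get_thumbnail((512, 512)).convert("RGB"),
        dtype=float)
    # Per-channel mean and std capture shifts in color profile,
    # brightness, and contrast introduced by the scanner itself.
    stats[name] = np.concatenate([rgb.mean(axis=(0, 1)), rgb.std(axis=(0, 1))])

# Pairwise absolute differences isolate scanner effects, since the
# underlying histology preparation is identical.
for a in "ABC":
    for b in "ABC":
        if a < b:
            print(a, b, np.abs(stats[a] - stats[b]).round(1))
```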

While the process of rescanning the prostate slides has not yet begun, the working group, which meets virtually on a monthly basis, intends to eventually share its findings with other CAP committees and the FDA. The latter has long expressed concern about how generalizability in machine learning can impact pathology.

The working group’s long-term plans also involve investigating how to establish the reliability of ground truth labeling, or assigning the correct diagnosis to a slide, based on the number of pathologists reviewing the slide and their training and years of experience. Having the correct ground truth label is essential for training algorithms to assist with diagnosis, Dr. Stram says.

“If you are going to train an algorithm, you need a high degree of assurance that the diagnosis is correct,” she adds. “One of the questions we’ve been asked was, ‘If you want to establish a diagnosis for a slide, how many reviewers do you need to look at the slide, and what level of expertise do they need to have?’”

To answer such questions, the working group hopes to tap into the CAP’s performance improvement program for whole slide imaging to gain insight into how specific training and years of practice experience influence pathologists’ ability to make accurate diagnoses from whole slide images. The program provides pathologists with sample digital slide images on which they can test whether they reach the correct diagnoses. It also has background data on participants, such as specific fellowship training and number of years in practice.

“If you were to analyze [such] data,” Dr. Stram says, “you could potentially develop a model that says, if you want a certain level of statistical assuredness in diagnoses, this is the kind and number of reviewers that you would want.”
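As a rough illustration of the kind of model she describes, the sketch below assumes, strongly and only for illustration, that reviewers err independently with a fixed per-reviewer accuracy, and asks how many reviewers a majority vote would need to reach a target level of assurance. The accuracies and target are hypothetical, not the working group’s figures.

```python
# Illustrative reviewer-panel model: under the (strong) assumption of
# independent reviewers with fixed per-reviewer accuracy p, how many
# reviewers does a majority vote need to hit a target assurance level?
from scipy.stats import binom

def majority_correct(n, p):
    """P(a majority of n independent reviewers is correct), for odd n."""
    k = n // 2 + 1                # votes needed for a strict majority
    return binom.sf(k - 1, n, p)  # P(correct votes >= k)

target = 0.99
for p in (0.85, 0.90, 0.95):      # hypothetical per-reviewer accuracies
    n = next(n for n in range(1, 51, 2) if majority_correct(n, p) >= target)
    print(f"per-reviewer accuracy {p:.2f}: {n} reviewer(s) for "
          f"{target:.0%} assurance")
```

Real reviewer errors are unlikely to be independent, so a model of this kind would understate the panel size needed; the data the working group seeks would allow such assumptions to be tested.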

—Renee Caruthers

Accumen announces purchase of Halfpenny Technologies

Accumen has acquired Halfpenny Technologies, a provider of clinical data-exchange and business-intelligence solutions, after a seven-year partnership.

Accumen provides hospitals and health care systems with solutions and services to enhance the value of their clinical lab, patient blood management, and outreach and imaging services.

Accumen, 855-222-8636

Indica Labs and Lunaphore comarket combined solution

Computational pathology software and services provider Indica Labs has announced a partnership with Lunaphore, a Swiss firm developing spatial biology technology for laboratories.

The companies will comarket a solution that combines Lunaphore’s flagship Comet spatial biology platform and Indica’s Halo and Halo AI digital pathology image-analysis software.

Lunaphore’s Comet platform allows researchers to detect up to 40 different spatial markers per tissue slide without human intervention. Indica Labs’ Halo and Halo AI software perform artificial intelligence-based quantitative analysis of whole slide images in a research setting.

“Combining Lunaphore’s superior multiplexing technologies upstream with our powerful AI-based analysis downstream, together we provide a streamlined workflow for high-dimensional imaging and image analysis,” said Steven Hashagen, CEO of Indica Labs, in a press statement.

Indica Labs, 505-492-0979

Oracle discloses plans to develop unified national EHR database

Shortly after Oracle closed its $28.3 billion acquisition of Cerner last month, Oracle cofounder and chair Larry Ellison said in a public virtual presentation that “Cerner and Oracle have all the technology required to build a revolutionary health information management system in the cloud.”

The company plans to build a unified national electronic health records database of anonymized patient data collected from health care entities nationwide, Ellison said during the on-demand event “The Future of Healthcare.” The system would continuously upload electronic health records from hospital databases to give providers real-time information about patients’ medical conditions and other clinical data. Patient information would remain anonymized until patients gave their providers access through a “key,” to prevent compromising data privacy and security, according to Ellison.

Furthermore, the database would provide public health officials with anonymized patient health data to generate such statistics as the number of hospital beds available in a geographic area or the number of COVID patients hospitalized within the past 24 hours.

Oracle also plans to update Cerner’s Millennium EHR system, Ellison said. Among the planned enhancements to the system is the addition of a voice user interface to simplify the process of accessing patient data and lab orders. In addition, the company plans to integrate into Millennium a telemedicine module and disease-specific artificial intelligence modules.

“We’re putting all of the diagnostic devices on a single Internet-of-Things network,” Ellison added. “And we’re keeping all of the diagnostic device results—all of the images and all the other data—in a database that we use for machine learning.”

Oracle, 800-633-0738

Gestalt and Hamamatsu install digital solution at Intermountain Healthcare

Gestalt and Hamamatsu Photonics have implemented a unified digital pathology solution at Intermountain Healthcare. The interoperable platform combines Gestalt’s PathFlow digital pathology system with Hamamatsu’s NanoZoomer whole slide scanner.

“Combining the flexibility, reliability, and durability of Hamamatsu’s latest generation scanning technology with Gestalt’s highly adaptable and interoperable image-management and reporting solution has been an ideal pairing,” said pathologist Dylan Miller, MD, coleader of the digital pathology implementation strategy at Intermountain Healthcare, in a press statement. “We are able to meet diverse and dynamic needs across multiple lab sites and pathology groups in our system, as we are rolling out digital pathology, because of this tremendous partnership.”

Gestalt, 509-492-4912

Dr. Aller practices clinical informatics in Southern California. He can be reached at raller@usc.edu. Dennis Winsten is founder of Dennis Winsten & Associates, Healthcare Systems Consultants. He can be reached at dwinsten.az@gmail.com.