Bringing down defects in surgical pathology reports

CAP Today

May 2012
Feature Story

Anne Paxton

Defects in surgical pathology reports are often relatively harmless mistakes, but at their worst they can be drastic errors that do patients harm. How well are surgical pathology laboratories preventing such defects and discovering them when they occur? A new CAP Q-Probes study—the largest one so far on this quality measure—provides a glimpse of the performance of institutions of many sizes and kinds, and offers a few suggestions for how they can keep defects to a minimum.

“Surgical pathology is a very, very labor-intensive process that involves many steps, and we know from day-to-day practice that errors occur,” says study co-author Raouf Nakhleh, MD, professor of pathology at Mayo Clinic in Jacksonville, Fla., and chair of the CAP Quality Practices Committee.

“The surgical pathology report reflects what happens in the laboratory, so we knew defects in the reports would give us a really good overview of where the problems were in the process.”

The study, “Surgical Pathology Report Defects,” compiles the results of 73 participating institutions’ reviews of their own surgical pathology reports; it found an overall defect rate of 4.7 per 1,000 reports and a median rate of 5.7 per 1,000 reports. “This is relatively high compared to some prior studies, which have shown rates between 1.4 and 4.8 per thousand,” says study co-author Keith E. Volmar, MD, a pathologist at Rex Hospital in Raleigh, N.C., and a member of the Quality Practices Committee. “Despite the growing emphasis on QC in surgical pathology, it looks like there’s still quite a bit of room for improvement.”

Focusing on surgical pathology reports that were corrected after they were released, the institutions participating in the Q-Probes study conducted a systematic prospective review. Each institution reviewed its surgical pathology reports for three months or until a maximum of 50 defects was identified. In all, 1,688 report defects were found in 360,218 accessioned surgical pathology cases.
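
Those two counts are all the arithmetic behind the aggregate rate. As a quick check, here is a minimal sketch (the variable names are illustrative, not from the study):

```python
# Aggregate defect rate from the study's reported counts
total_defects = 1_688          # report defects found across participants
accessioned_cases = 360_218    # surgical pathology cases reviewed

rate_per_1000 = total_defects / accessioned_cases * 1_000
print(f"{rate_per_1000:.1f} defects per 1,000 reports")  # prints 4.7
```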

Report defects were separated into four categories for the purposes of this study: misinterpretation, misidentification, specimen defect, and other defects. Here is what the participating labs found:

  • 14.6 percent of the defects were misinterpretations (inaccurate diagnoses and misclassifications as well as revisions in secondary diagnostic elements like tumor grade or margin status).
  • 13.3 percent were misidentifications (errors in patient identification, tissue identification, specimen laterality, and specimen anatomic localization).
  • 13.7 percent were specimen defects (lost specimens, inadequate specimen volume or size, specimens with inadequate or discrepant measurements, inadequately representative specimens, and specimens with inadequate or absent ancillary studies when such studies were warranted).
  • 58.4 percent were other defects (missing or erroneous nondiagnostic information, dictation or transcription errors that do not affect the diagnosis, and failures in electronic formatting or transmission of reports).

“The median defect rate of 5.7 per 1,000 is higher than previous studies have, in general, reported,” says study co-author and Quality Practices Committee member Frederick A. Meier, MD, senior staff pathologist in the Division of Anatomic Pathology at Henry Ford Hospital, Detroit, and director of regional pathology services, Henry Ford Health System. “I suspect this is because of the larger number of participants in the Q-Probes study, producing a wider range of performance.”

The study found three significant associations between institutional variables and the performance indicator of surgical pathology report defects per 1,000 reports. First, higher rates of defects tended to occur in laboratories with a pathology resident/fellow training program. Second, lower rates of misidentification defects tended to occur in laboratories in which all malignancies are reviewed by another pathologist before sign-out. And third, lower rates of specimen defects tended to occur in laboratories that have intradepartmental review of cases after sign-out.

Given the additional workload, the inexperience of trainees, and the high complexity of cases seen at academic medical centers, the study authors believe that the association between residency programs and higher defect rates is a logical one. The other two associations indicate that double review is a valuable QC practice, they say. “Double review of cases is generally thought of as a mechanism for avoiding misdiagnosis, but it is also an effective second check of protocols and slides that can discover labeling issues,” the authors write in their analysis of the results.

Says Dr. Meier: “The higher rate of defects in departments with residents is often a matter of bureaucratic miscommunication—whether the right person gets to look at the report before sign-out, or one pathologist gets a phone call and another pathologist signs out the case, or there’s no communication about what was learned at a tumor board, and so on. We need to be aware that communicating well before a report is signed out is important when more people are involved, which is what you have when there are residents.”

The lower rate of misidentifications in departments that undertake second review is a consistent finding, he says. “Pathologists undertake second review before sign-out to save themselves from misinterpretation, but they also discover misidentifications where cases are switched. Misidentifications, in the Q-Probes study, turned out to be the sharpest end of the wedge.”

“The median defect rate of 5.7 per 1,000 [reports] is higher than previous studies have, in general, reported,” says Dr. Meier, above with quality improvement specialist Ruan Varney, CT, CQE, SSBB.

In his experience at Henry Ford Health System, double review of selected cases does indeed stem misinterpretations. “Specifically, it achieves this objective in diagnoses with ‘low-end gray zones’—commonly breast carcinoma, prostate carcinoma, esophageal dysplasia, or cervical neoplasia. In these settings, it’s a matter of ‘getting the band together’ to agree whether gray is white or black. However, a big, less appreciated benefit of the fabled ‘second pair of eyes’ turns out to be a second brain behind the pair of eyes figuring out whether or not it’s the right patient.”

Dr. Meier thinks the lower rate of specimen defects in places that do post-sign-out review is a new finding with this study. “It’s an intriguing finding for which I don’t have a quick, plausible explanation. Post-sign-out review may be stimulating process improvement, and that’s something that I don’t think has come out before,” he says.

As another potential source of process improvement, the study authors cite six-sigma benchmarking, used in industry, as an avenue for determining whether error rates are higher than they should be. The aggregate defect rate in this Q-Probes study, the authors note, equates to 4,686 defects per million, corresponding to a sigma metric of 4.1. “Most industrial processes fall into the 3–4 sigma range, and many consider the goal of healthcare processes to be a sigma metric of 5. With this in mind, the report defect rate from this study indicates room for improvement,” they write in their analysis. Dr. Meier is an enthusiastic advocate of the six-sigma benchmark. “This is really the way we should start thinking about quality control. We’re producing information, so we have to look at this the way information scientists look at it.”
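
The conversion from a defect rate to a sigma metric runs through the normal distribution. The sketch below assumes the conventional 1.5-sigma long-term shift used in Six Sigma practice; the article does not state that convention, but it is consistent with the reported 4.1:

```python
# Map a defect rate to a Six Sigma "sigma metric" via the normal
# distribution. The 1.5-sigma long-term shift is an assumption:
# the study reports a metric of 4.1, which matches this convention.
from scipy.stats import norm

dpmo = 1_688 / 360_218 * 1_000_000            # defects per million opportunities
sigma = norm.ppf(1 - dpmo / 1_000_000) + 1.5  # shifted sigma metric
print(f"DPMO = {dpmo:.0f}, sigma metric = {sigma:.1f}")  # DPMO = 4686, sigma = 4.1

# The 5-sigma goal cited for health care works out to roughly 233
# defects per million, about twenty times fewer than observed here:
print(f"5 sigma = {norm.sf(5 - 1.5) * 1_000_000:.0f} DPMO")
```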

Having developed the taxonomy of defect categories that the study authors used, Dr. Meier is pleased that the fractions of defects in this study that are misinterpretations and specimen defects were lower than in some previous studies on which he has worked. The institutions from those earlier studies initiated improvements in their QC that have resulted in a larger fraction of their defects being nondiagnostic defects. “This is an important finding. It suggests that among the Q-Probes participants some institutions are doing much better than others in decreasing these two sorts of defects.” In part, he believes, this reflects the fact that people who study something on a regular basis tend to be more vigilant about it than those who don’t.

Those who have not been inside a histology laboratory may not understand just how complex and vulnerable to error the processes performed there are, Dr. Nakhleh says. “Specimens may be mislabeled before we receive them, they have to be accessioned, then blocks are generated, tissue is embedded in paraffin and transferred onto slides. A specimen could generate 10 or 20 blocks and double that number of slides. So a lab with 50,000 specimens could generate 100,000 blocks and maybe 200,000 slides. These are enormous numbers, and mixups could happen at any point in the process, resulting in errors.”

The study found that 24.9 percent of errors were detected through review of the report text after the case was released, Dr. Nakhleh notes. Under a typical scenario, “A specimen and a protocol are given to a pathologist. The protocol says ‘skin,’ but the slide is stomach. A specimen may be a series of colon biopsies from a patient, but one doesn’t fit in that series—it’s liver or another organ. Sometimes you’re just looking at numbers and names on the slides and protocols and they don’t match.”

While the system has safeguards, Dr. Nakhleh says, “we’re always dependent on people double-checking that things are correct.” Bar coding is a case in point: a previous study found that many institutions use bar codes on specimen labels, but only a small percentage use them on slides and blocks. “Laboratories need to have a system where at every point you could use a bar code to generate the next step. Most places don’t have systems to generate bar codes for blocks automatically. They might have automatic block labelers or separate systems that generate the slides, but they produce them in a batch instead of one case at a time. And research has shown that if you do something in a batch, you’re more likely to make errors than if you do it one case at a time.”

Dr. Nakhleh: “Research has shown that if you do something in a batch, you’re more likely to make errors than if you do it one case at a time.”

Ideally, the process improvement Dr. Nakhleh describes should apply to the detection of errors as well. But as the Q-Probes study notes, when errors in surgical pathology reports are discovered, there is often an element of randomness involved. The most common means for detecting defects was through review of the report text after sign-out (24.9 percent), followed by “don’t know” (16.4 percent), and “clinician requested review of the case” (11.4 percent). Detection through case review for tumor conference (3.6 percent) was roughly equal to detection through “chance” (3.3 percent), the authors note.

That was a finding Dr. Volmar did not expect. “I was surprised that the rate of detection of report errors at tumor boards and conferences was the same as the rate of detection by simple chance, because we tend to think that review at multidisciplinary conferences would pick up more mistakes.”

“Sometimes defects are discovered by the clinician, and in many cases they’re found by the Department of Pathology,” says study co-author and Quality Practices Committee member Michael O. Idowu, MD, MPH, associate professor of pathology at Virginia Commonwealth University and director of breast pathology and quality management, Division of Anatomic Pathology, VCU Health System.

At Rex Hospital, Dr. Volmar says, “we try to catch a lot of potential errors by simply proofreading working report drafts before the case is reviewed by the pathologist. A transcriptionist goes through and checks the demographic information and proofreads the clinical data and gross description in the case report, and catches a lot of typos. I think this approach cuts down on many nondiagnostic errors in our practice.”

Regarding the finding that departments with training programs tend to have more defects in surgical pathology reports, he says, “Some academics would say this adds fuel to the argument that they need better reimbursement rates for their work, because additional labor is needed in a training program to ferret out all of the defects.” Since labor shortages are nearly universal, Dr. Volmar feels fortunate that his own hospital doesn’t consider reducing report defects just his problem; rather, it’s a joint task in which the hospital cooperates. “Sometimes hospitals say, ‘Report QC is the pathologist’s responsibility, not our problem, and we don’t want you wasting our transcriptionists’ valuable time to proofread.’ I am fortunate that my hospital supports having a QC program, and technically the proofreaders are not my employees.”

Beyond proofreading, however, Dr. Volmar does not see many simple answers to the problem of errors. “The biggest obstacle is that everyone is under time pressure and many departments have staffing issues, and whether you are going to review reports before sign-out or after sign-out—all of that takes time. Pre-sign-out review generally requires physical distribution of cases to other pathologists. That’s a logistical problem if your group is spread out over multiple locations. It’s also a turnaround time issue for many of us if we’re pressed to get our biopsies out in 24 hours.”

Similarly, conducting a pre-sign-out review of all first malignant diagnoses or all difficult cases is a great idea, Dr. Volmar says, but its practical application depends on the practice setting. “If you’re a solo practitioner, who’s going to be second reviewer? You’re either going to have to have an agreement with a neighboring hospital or send cases to a central lab of some sort, and that takes time and costs money.”

Adding second identifiers to tissue blocks is a safety measure that is less commonly taken, though the authors suggest it might help. The study found that 75.7 percent of participating laboratories report a second identifier on glass slides, while only 40 percent include one on surgical pathology blocks. “But, again,” Dr. Volmar says, “systems that apply multiple identifiers on blocks and slides are expensive. It can be difficult to convince hospital administration to commit funds to a problem that is largely invisible to personnel outside of the lab.” In time, he predicts, second identifiers will become a CAP checklist issue.

As with all statistics, good results may not be what they appear, and poor results may not be as bad as they look. This Q-Probes study cautions that some laboratories with apparently low error rates might be overlooking errors. “It’s often said in pathology: The more you look, the more you see,” Dr. Idowu says. “Most systems, big or small, have errors; you’re not likely to find there is a perfect system. So if you are reporting your lab is great and you have the least errors, it just might be that you’re not looking hard enough.”

Different practice settings and workloads may be a factor in error rates, in Dr. Idowu’s view. If pathologists have a heavy workload—if they are looking at 60 to 80 cases a day, for example—there may be more room for error. “But occasionally it might be that you expect someone else to make sure the reports have no defects. CAP actually requires that some of the quality checks be done, but there may not be an understanding of who’s doing it. So everybody needs to cultivate a culture that they all have to be part of QC, that it’s everybody’s responsibility, not just the responsibility of a particular group of people.”

Part of that culture must be an awareness that every defect can potentially harm a patient, he says. Misinterpretation is a category that can include low-impact errors. “For example, if there is a benign breast lesion labeled a fibroadenoma and it turns out to be adenosis, that’s not going to kill the patient, but that is a misinterpretation because you gave it a wrong name even though it had the right category of meaning: benign.”

However, a fairly significant number of diagnosis changes made because of a report defect did have a high likelihood of clinical significance, the Q-Probes data show. “Those defects make up 47 percent of our misinterpretations,” Dr. Volmar says, “and we can infer they would have been clinically significant even if we don’t have direct data on it.”

Nevertheless, the study found that false-negative cases were just 1.7 percent of the total defects (or eight false-negatives per 100,000 accessioned cases) and false-positives were 0.8 percent (or four per 100,000 accessioned cases), figures that may help keep the big picture in perspective, Dr. Idowu points out. “Those are small numbers in the grand scheme of things, although they may have a significant impact on patient care. What we want to get across is not that people aren’t trying, but just that we can try more to minimize these errors even further.”
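
Those per-100,000 figures follow directly from the study’s counts; a quick sketch of the arithmetic (rounded as in the article):

```python
# Express the false-negative and false-positive fractions of defects
# as rates per 100,000 accessioned cases
total_defects = 1_688
accessioned_cases = 360_218

false_negatives = 0.017 * total_defects   # 1.7 percent of all defects
false_positives = 0.008 * total_defects   # 0.8 percent of all defects

for label, count in [("false-negatives", false_negatives),
                     ("false-positives", false_positives)]:
    rate = count / accessioned_cases * 100_000
    print(f"{rate:.0f} {label} per 100,000 cases")  # prints 8, then 4
```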


Anne Paxton is a writer in Seattle.