Cytopathology and More | Cytologic-histologic correlation: And the answer is…

where a true positive is a positive correlation pair and a false-positive is a positive Pap test with a negative biopsy. Notice that the PPV is based on the original interpretation for both the Pap test and the biopsy, and not the review interpretation of these specimens. The calculation assumes the biopsy is the gold standard of “truth.” The PPV emphasizes the screening role of a Pap test. It is intended to identify women who require triage to colposcopy to confirm a potential abnormality through visual inspection or biopsy or both. To ensure meaningful data, a minimum of 20 total correlation pairs is necessary to calculate PPV.
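The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated laboratory tool: the function name is hypothetical, and it assumes that the "total correlation pairs" counted toward the minimum of 20 are the true positives plus the false positives.

```python
from typing import Optional

MIN_PAIRS = 20  # minimum number of correlation pairs stated in the text


def positive_predictive_value(true_positives: int, false_positives: int) -> Optional[float]:
    """Return TP / (TP + FP), or None when there are too few pairs to report.

    true_positives:  positive Pap tests with a positive biopsy
    false_positives: positive Pap tests with a negative biopsy
    Both counts use the original interpretations, not review interpretations.
    """
    if true_positives < 0 or false_positives < 0:
        raise ValueError("counts must be non-negative")
    # Assumption: the 20-pair minimum applies to TP + FP.
    total_pairs = true_positives + false_positives
    if total_pairs < MIN_PAIRS:
        return None  # too few pairs for a meaningful PPV
    return true_positives / total_pairs
```

With hypothetical counts of 85 true positives and 15 false positives, the function returns 0.85. With 10 and 5 it returns None, because only 15 pairs are available.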

One reason for the superiority of PPV over metrics such as sensitivity and specificity is that it uses easily retrievable data. Sensitivity and specificity rely on knowing the false-negative results, since sensitivity = true positives/(true positives + false-negatives), or the true-negative results, since specificity = true negatives/(true negatives + false-positives). These data are difficult to accurately measure because most women with negative Pap tests are not biopsied. False-positive Pap tests are probably overrepresented because patients are referred for biopsies. The PPV is a measurement that is close to the percent of positive Pap tests that correlate with biopsies. This was the most frequently measured CHC statistic in the laboratory survey. According to CAP Q-Probes data from 2005 to 2010, the median PPV is 83 percent to 88 percent, with a range of 71 percent to 94 percent.2 It is important to emphasize that the PPV is a laboratory, not an individual, metric. It would be difficult to obtain an accurate PPV for individuals except in laboratories with a very high volume. Additionally, the PPV does not indicate truth. Review of CHC slides often reveals interpretive or processing errors in both specimens that should not be held against individuals.

If the laboratory’s PPV is low relative to benchmarks, it should investigate Pap interpretive accuracy and intradepartmental variability as part of its QA program. If a laboratory’s PPV is high, it may indicate that the laboratory is identifying only the most obvious lesions and under-recognizing subtle changes. It may also indicate that health care professionals are not sampling subtle colposcopic lesions or are not sampling the transformation zone.

It is desirable to provide timely notification to a caregiver for confirmation of a negative biopsy and HSIL or cancer (HSIL+) Pap test, or of a negative biopsy and an HSIL or cancer Pap test re-interpreted as NILM (negative for intraepithelial lesion or malignancy).

There are significant followup implications for patients with a cytology interpretation of HSIL—most will have an ablative procedure or excisional biopsy. An unintended consequence of cervical cone excisional and LEEP procedures is cervical incompetence. When biopsies are negative, informing the health care provider that a Pap test was correctly interpreted as HSIL or cancer (HSIL+) after a second review enables him or her to proceed with appropriate ablative therapy with confidence. If the Pap test review yields a mistaken interpretation of HSIL+, unnecessary surgery is prevented. In some cases, consensus regarding the initial Pap test interpretation of a high-grade lesion is not achievable and a diagnostic excisional biopsy will be indicated. There was no consensus opinion on the definition of “timely” notification, but notification should occur as soon as is feasible after the microscopic review of both specimens. Discussions with the health care professional should be documented in the biopsy or cytology report or in a separate QA document.

Laboratories should attempt to obtain correlation biopsy information for all patients with an HSIL or cancer Pap test.

It is a challenge for some laboratories to obtain Pap test or biopsy results if they process and interpret only one or the other specimen type, but for correlation purposes, they should attempt to gain biopsy followup information for all patients with an HSIL+ Pap test. This serves two purposes: It ensures that patients with an HSIL+ Pap test obtain appropriate colposcopy, and it allows the laboratory to confirm the accuracy of its HSIL+ interpretations. Requests for followup information may be made by a note in the Pap test report, telephone, e-mail, or other means. Laboratories that process both specimen types from the same patient should request followup information from the health care professional if no biopsy or report of colposcopy is documented six months after the incident Pap test. Finally, laboratories should document attempts to obtain followup information, and the method used to request followup should be made part of the written QA program.
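A laboratory information system could flag overdue cases along the following lines. This is only a sketch under stated assumptions: the record layout, the category labels, and the approximation of six months as 183 days are all hypothetical, and a real system would use the laboratory's own definitions.

```python
from datetime import date
from typing import List, NamedTuple

HSIL_PLUS = {"HSIL", "CANCER"}  # hypothetical labels for HSIL+ interpretations
FOLLOWUP_WINDOW_DAYS = 183      # assumption: six months approximated as 183 days


class PapCase(NamedTuple):
    case_id: str
    pap_date: date
    interpretation: str
    followup_documented: bool  # True if a biopsy or colposcopy report is on file


def needs_followup_request(case: PapCase, today: date) -> bool:
    """True for an HSIL+ Pap test with no documented followup after the window."""
    overdue = (today - case.pap_date).days >= FOLLOWUP_WINDOW_DAYS
    return (
        case.interpretation in HSIL_PLUS
        and not case.followup_documented
        and overdue
    )


def followup_worklist(cases: List[PapCase], today: date) -> List[str]:
    """Return the IDs of cases for which followup should be requested."""
    return [c.case_id for c in cases if needs_followup_request(c, today)]
```

Each request generated from such a worklist would still need to be documented, as the written QA program requires.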

Microscopic review of all slides from discordant Pap test/cervical biopsy pairs (as laboratory-defined) is desirable for CHC.

Even though calculation of the PPV does not require microscopic review of Pap test and biopsy pair mismatches, this exercise is the most rewarding and revealing of the entire process, and laboratories should record review findings in a QA document or specimen report. Review of negative Pap slides when a biopsy is interpreted as HSIL may reveal reasons for interpretive error, such as Papanicolaou stains that are too dark for optimal examination of chromatin, processing problems that obstruct diagnostic criteria, or sampling problems that result in incomplete collection of cells or obscuring factors that hinder correct interpretation. It is primarily through this process, and not calculation of PPV, that laboratories will find quality improvement projects that will enhance their performance. If review of all discordant Pap test/cervical biopsy pairs is not possible, the review should focus on HSIL-normal mismatches for both Pap tests and biopsies. It may be futile to review mismatches in LSIL-normal cases because LSIL lesions regress and appear at uncertain intervals and one would expect mismatches that are not the result of interpretive, sampling, or processing errors. However, HSIL is usually a persistent lesion and the ramifications of a mismatched pair are more severe.
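The triage logic in this paragraph can be expressed as a small lookup. The sketch below assumes three simplified, hypothetical category labels. What counts as discordant beyond the HSIL-normal and LSIL-normal cases is left to the laboratory, as the text specifies.

```python
NEGATIVE, LSIL, HSIL_PLUS = "NEGATIVE", "LSIL", "HSIL+"  # hypothetical labels


def review_priority(pap: str, biopsy: str) -> str:
    """Classify a Pap test/biopsy pair for microscopic review."""
    if pap == biopsy:
        return "concordant"
    pair = {pap, biopsy}
    if pair == {HSIL_PLUS, NEGATIVE}:
        # HSIL is usually persistent, so a mismatch in either direction
        # is the first target when not every pair can be reviewed.
        return "review first"
    if pair == {LSIL, NEGATIVE}:
        # LSIL regresses and reappears, so mismatches are expected.
        return "low yield"
    return "laboratory-defined"
```

Because the pair is compared as a set, a negative Pap test with an HSIL biopsy and an HSIL Pap test with a negative biopsy receive the same priority.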

If some of the slides in a mismatched pair are not available, those that are available should be reviewed, and the original interpretation of the unavailable specimens should be assumed to be correct. Laboratories may define their own non-correlation metrics for QA purposes. For example, a laboratory may want to monitor and review all Pap tests interpreted as atypical squamous cells, cannot exclude high-grade squamous intraepithelial lesion (ASC-H), along with the corresponding biopsies, to determine the percent of cases with a significant biopsy finding. It could then review those Pap tests where the biopsy was interpreted as HSIL+ to determine whether there are features present that would prompt cytologists to interpret those cases as HSIL in the future.
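The ASC-H example might be computed as follows. This is a hedged sketch: the record layout and labels are hypothetical, and the text does not define a "significant" biopsy finding, so the default set used here (any squamous intraepithelial lesion) is an assumption that each laboratory would replace with its own definition.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Pair(NamedTuple):
    case_id: str
    pap: str     # Pap test interpretation
    biopsy: str  # biopsy interpretation


# Assumption: "significant" means any SIL on biopsy; laboratories may differ.
DEFAULT_SIGNIFICANT: FrozenSet[str] = frozenset({"LSIL", "HSIL+"})


def asc_h_monitor(
    pairs: List[Pair], significant: FrozenSet[str] = DEFAULT_SIGNIFICANT
) -> Tuple[float, List[str]]:
    """Return (percent of ASC-H cases with a significant biopsy,
    IDs of ASC-H cases with an HSIL+ biopsy that merit slide review)."""
    asc_h = [p for p in pairs if p.pap == "ASC-H"]
    if not asc_h:
        raise ValueError("no ASC-H pairs to monitor")
    hits = sum(1 for p in asc_h if p.biopsy in significant)
    percent = 100.0 * hits / len(asc_h)
    to_review = [p.case_id for p in asc_h if p.biopsy == "HSIL+"]
    return percent, to_review
```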

CHC is optimal with a multilayered approach.

Developing a CHC program that meets the laboratory’s needs and addresses perceived laboratory problems is an ideal toward which we all strive. A multilayered approach to CHC allows for customization as well as standardization. Laboratories can drill down on particular areas of concern by developing continuous and interval monitors.

One example of a continuous monitor would be the PPV. An interval monitor may target specific pairs for a predetermined time, for example quarterly, to acquire a snapshot of laboratory performance for that indicator. Continuous monitors may be desirable when laboratories experience high personnel turnover, disruptive environments, or other variables such as new instrumentation that can cause a quality drift.

Corrective action for variances can also be creative. The most popular and favored method of investigating and improving interpretive variances among consensus participants was to review slides in a group. Not only does this method encourage discussion and expose all observers to difficult cases, but it can occur in a non-threatening environment where the participants are unaware of the identity of the original interpreters. A group discussion of mismatches and slides encourages uniformity of interpretation, leverages group experience, and allows observers to share diagnostic clues and practices.

Another layer of CHC is to optimize biopsies during review. Studies have shown that biopsy specimens are often the reason for a “false-positive” Pap test result and additional processing may unveil a cervical lesion.3 Reorienting tissue in the block, obtaining additional levels, performing ancillary studies such as p16, and recording the presence or absence of the transformation zone are all methods of optimizing biopsy performance. Providing sampling data to health care professionals who perform colposcopy and biopsy may help improve biopsy sampling. Laboratories can develop trend-based policies to improve internal practice, such as standardizing the number of levels and serial sections on cervical biopsies and endocervical curettage, pinning LEEP and cone specimens flat to optimize embedded sections, and teaching histotechnologists to recognize ectocervix to embed cervical biopsies properly. A laboratory may choose to monitor characteristics of biopsies over time to troubleshoot mismatches in CHC when the Pap tests appear accurate by recording the presence or absence of a transformation zone, biopsy sizes less than 2 mm, colposcopies with only one to two biopsies, poor biopsy orientation, and requests for additional levels.
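One way to track the biopsy characteristics listed above is a simple tally over mismatched cases. The field names below are hypothetical and the sketch assumes each characteristic has already been recorded at review. It only counts occurrences and draws no conclusions about cause.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BiopsyRecord:
    transformation_zone_present: bool
    largest_dimension_mm: float
    biopsies_in_colposcopy: int
    poorly_oriented: bool
    additional_levels_requested: bool


def tally_biopsy_factors(records: Iterable[BiopsyRecord]) -> Counter:
    """Count how often each biopsy characteristic appears among mismatches."""
    counts: Counter = Counter()
    for r in records:
        if not r.transformation_zone_present:
            counts["no transformation zone"] += 1
        if r.largest_dimension_mm < 2:
            counts["size under 2 mm"] += 1
        if r.biopsies_in_colposcopy <= 2:
            counts["one to two biopsies"] += 1
        if r.poorly_oriented:
            counts["poor orientation"] += 1
        if r.additional_levels_requested:
            counts["additional levels requested"] += 1
    return counts
```

Comparing these counts from one monitoring period to the next is what would reveal a trend worth sharing with the health care professionals who perform colposcopy.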

Pap test interpretation is most often the focus of CHC slide review but other factors are just as guilty of causing error. For example, a laboratory may choose to record the quality of Pap slides in CHC mismatches, including staining and processing irregularities. There may be patient factors that contribute to interpretive error, such as atrophy, obscuring blood or inflammation, infection, or inadequate shedding of abnormal cells. Some patterns of HSIL are notorious for causing interpretive errors—hyperchromatic crowded groups and small individual HSIL cells with bland nuclei.

Curiosity may prompt further CHC investigations. For example, how often does your laboratory have an LSIL Pap test but an HSIL biopsy? Was the Pap test interpreted as LSIL because of few HSIL cells on the slide, or are HSIL cells usually absent? How many ASC-H Pap tests have an HSIL biopsy, and what does review of those Pap tests reveal? Other pairs that might be interesting to monitor to improve laboratory performance are AIS/LSIL, atypical squamous cells of undetermined significance (ASC-US) with a positive test for human papillomavirus (HPV+) and a SIL biopsy, ASC-US with a negative test for HPV and a SIL biopsy, atypical glandular cells (AGC) and subsequent endocervical or endometrial biopsies, and HSIL Pap tests in pregnant or postpartum women. Any of these monitors can be periodic or continuous, depending on other laboratory metrics or conditions.

When reviewing slides for CHC, minimize observer bias. Such bias occurs when the observer tends to believe the result of one test more than the other, or is influenced by the result on one test when reviewing the other. There are several ways to prevent bias. If there is disagreement between the reviewer and the primary cytologist, one can obtain an additional opinion. If retrospective review is performed, all slides can be randomly combined and the reviewer blinded to the original results, with unveiling of the original results only after review. Cytotechnologists can review all of the Pap tests and pathologists review all of the biopsies. For real-time correlation, all mismatches can be triaged to hierarchical peer review, or specific interpretations such as HSIL Pap tests and biopsies may be referred. Finally, all discrepancies can be reviewed together in a consensus conference for a group decision.
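The blinded retrospective review described above could be organized as in this sketch. It is illustrative only: the function names and the "R001"-style review codes are hypothetical, and it assumes slides can be relabeled so that the reviewer sees only the code.

```python
import random
from typing import Dict, List, Optional, Tuple


def blind_slides(
    slide_ids: List[str], seed: Optional[int] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Shuffle slides and assign neutral review codes.

    Returns the codes in review order and the key (code -> slide ID),
    which is withheld from the reviewer until review is complete.
    """
    rng = random.Random(seed)
    shuffled = list(slide_ids)
    rng.shuffle(shuffled)
    key = {f"R{n:03d}": slide_id for n, slide_id in enumerate(shuffled, start=1)}
    return list(key), key


def unveil(
    review: Dict[str, str], key: Dict[str, str], original: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """After review, list (slide ID, original result, review result)
    for every slide where the two interpretations differ."""
    disagreements = []
    for code, review_result in review.items():
        slide_id = key[code]
        if original[slide_id] != review_result:
            disagreements.append((slide_id, original[slide_id], review_result))
    return disagreements
```

Keeping the key separate from the review worksheet is the design point here: the original results are unveiled only once every blinded interpretation has been recorded.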

Summary
What is the point of CHC if the data are never used or if the primary stakeholders don’t have access to the data? Laboratories have a wealth of information at their disposal if they manage it effectively. The CHC Working Group and consensus participants agreed that CHC should not be unnecessarily prescriptive, because laboratories face different problems and need to tailor their approach to CHC to target potential problem areas.

In an ideal, high-quality performance environment, laboratories would receive both cytologic and histologic specimens from the same patient and be able to correlate these results to improve patient outcomes. That is not possible for most laboratories because care is fragmented and they do not usually have control over what specimens they receive. The guidelines suggested in this article are minimum guidelines for CHC that most laboratories can perform and that allow them to compare their performance against national benchmarks compiled from all laboratories.

References

  1. College of American Pathologists Gynecologic Cytopathology Quality Consensus Conference Working Groups 1-5. Special Section—College of American Pathologists Consensus Conference on Gynecologic Quality. Arch Pathol Lab Med. 2013; 137(2):158–219.
  2. Jones BA, Novis DA. Cervical biopsy-cytology correlation. A College of American Pathologists Q-Probes study of 22439 correlations in 348 laboratories. Arch Pathol Lab Med. 1996;120(6):523–531.
  3. Bewtra C, Pathan N, Hashish H. Abnormal Pap smears with negative follow-up biopsies: improving cytohistologic correlations. Diagn Cytopathol. 2003;29(4):200–202.

Dr. Crothers, director of cytopathology, Department of Pathology, Walter Reed National Military Medical Center, Bethesda, Md., is chair of the CAP Cytopathology Committee. She was chair of the GCQC2 Cytologic-Histologic Working Group.

CAP TODAY