On cytology PT, CAP enlightens CMS

CAP Today

May 2009
PAP/NGC Programs Review

The CAP responded March 16 to the Centers for Medicare and Medicaid Services’ call for public comment on its proposed revisions to the cytology proficiency testing requirements. The position of the CAP is that the current regulation and the proposed revisions (as outlined in January in the published notice of proposed rulemaking CMS-2252-P) fail to achieve the stated objective. CAP president Jared Schwartz, MD, PhD, said in his March 16 letter that CMS’ efforts “have resulted in a program that fails to measure competency, is not supported by science, and does not support improved health outcomes.” He urged the CMS to withdraw the proposed regulation “so meaningful alternatives can be considered.” The CAP and other lab and pathology organizations favor the alternative model embodied in legislation that won passage in the House of Representatives and 43 cosponsors in the Senate before the 110th congressional session ended last year.

The CAP provided detailed responses to CMS’ questions about the proposed regulatory changes. The CAP said repeatedly in its comments that if mandatory individual PT is to continue with punitive consequences, important facts must be understood and changes must be made. Here, in edited, abbreviated form, is what the CAP told the CMS in March.

Cytology challenges and new technology

The words “cytology challenge” are sufficiently nonrestrictive and should be used to include future technological advances for screening and interpreting gynecologic cytology. Including specific detailed criteria in the regulations is not advised for pilot testing, diagnostic criteria, or proctor qualifications. It is impossible to imagine, describe, and design criteria for reasonable and timely pilot studies for unknown technologies. Providers should design and execute pilot studies with rigorous statistical analysis based on the methods of future testing employed. Pilot testing undoubtedly would increase the cost of cytology PT, and the additional cost would be passed on to participants, and thus become an additional burden on the U.S. health care system.

Testing individuals

According to HHS, the current statute does not allow educationally based cytology PT. This view is based on a 1988 interpretation of the statute that, to our knowledge, has not been revisited. The revisions currently proposed by CMS do nothing but perpetuate a program that is unable to test and measure proficiency, improve locator and interpretive skills, or protect women’s health. The CAP feels CMS has the ability to implement a program that would alleviate the present problem that has arisen from embedding professional standards into federal regulations.

If CMS continues to reject educationally based alternatives, it should not create a new mandate requiring education in addition to punitive testing, which would increase the burden associated with proficiency testing. If CMS abandons the current and proposed cytology PT program and adopts an educationally based cytology PT test, the agency should not stipulate in the regulations any requirement for participation in an educational program. If CMS were to adopt a rule for mandatory education instead of gynecologic proficiency testing, CMS should not determine the criteria for educational materials. There are more appropriate means of ensuring quality educational requirements (by organizations such as ACCME and the American Academy of Continuing Medical Education, or AACME).

Laboratory directors and deemed lab accrediting organizations should monitor the process because they are in the optimal position to measure individual performance and competency.

Frequency of testing

Individual performance does not indicate proficiency; proficiency is a multifactorial process, and the current proficiency testing attempts to evaluate only two tasks: screening and interpretation. Other components of proficiency include judgment regarding when to obtain additional history and consultation with other care providers and colleagues, knowledge of the appropriate confirmatory tests, and judgment about when to consult reference material (items precluded in the current proficiency testing protocol). No single test can assess individual performance in an environment of evolving technology and future molecular testing. The law says there should be “periodic confirmation and evaluation of the proficiency of individuals involved in screening or interpreting cytologic preparations, including announced and unannounced on-site testing of individuals, with testing to take place, to the extent practicable, under normal working conditions.” Proficiency testing, as currently configured, does not provide a “normal working conditions” environment.

Statistical theory emphasizes the shortcomings of the current proficiency testing program. The more challenges per test event, the better the assessment of performance. Competency is properly measured statistically with at least a 100-slide examination to achieve a 90 percent confidence interval. Practical limitations preclude testing on large challenge sets. When choosing the optimal number of challenges, it is accepted practice to estimate the misclassification rates using binomial expansion theory, applying the test criteria to example pass rates for both “competent” and “incompetent” individuals. For example, the false pass rate for incompetent individuals (true pass rate at 80 percent) against a 90 percent pass rule drops from 38 percent with a 10-slide (challenge) set to 20 percent with a 20-slide set. By contrast, the misclassification or false failure rate for competent individuals (true pass rate at 95 percent) drops only from nine percent with a 10-slide set to eight percent with a 20-slide set. Changing the size of the test set from 10 slides to 20 dramatically reduces the aggregate misclassification of very poor screeners. Therefore, based on the change in false pass rates for incompetent performers, there may be a rationale to move from a 10-slide testing event to a 20-slide testing event; however, there is not as much benefit in improvement of the false failure rates for competent performers.
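The misclassification figures quoted above can be approximated with a short binomial calculation. The sketch below is illustrative only and is not the CAP's own analysis: it assumes each slide is an independent right-or-wrong trial with equal weight, and that "passing" means getting at least 90 percent of the slides correct. The function name `pass_probability` and its parameters are hypothetical, and the weighted point scoring used in actual PT events is not modeled.

```python
from math import comb


def pass_probability(n_slides: int, true_rate: float, pass_percent: int = 90) -> float:
    """Chance of meeting the pass rule under a simple binomial model.

    Assumes independent, equally weighted slides (an illustrative
    simplification, not the actual PT scoring grid).
    """
    # Smallest whole number of correct slides that reaches pass_percent.
    min_correct = (pass_percent * n_slides + 99) // 100
    return sum(
        comb(n_slides, k) * true_rate**k * (1 - true_rate) ** (n_slides - k)
        for k in range(min_correct, n_slides + 1)
    )


for n in (10, 20):
    # "Incompetent" examinee: true per-slide success rate of 80 percent.
    false_pass = pass_probability(n, 0.80)
    # "Competent" examinee: true per-slide success rate of 95 percent.
    false_failure = 1 - pass_probability(n, 0.95)
    print(f"{n} slides: false pass {false_pass:.1%}, false failure {false_failure:.1%}")
```

Under these assumptions the false pass rate falls by close to half when the set grows from 10 to 20 slides, while the false failure rate for competent examinees moves by only about one percentage point, which is the asymmetry the paragraph above describes.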

Proficiency testing as now practiced and proposed is not a statistically valid way to assess individual performance. In addition to the lack of statistical power, there are no published data supporting the assumption that annual testing is needed; there is no evidence that performance declines significantly or rapidly over time. This concept is supported by the American Board of Pathology policy of recertification every 10 years.

Data in support of testing at any interval other than annually are impossible to gather under the current program. Because testing occurs every year, there is a “test effect” when evaluating individual performance data at different intervals; individuals actually take the test (and therefore study/prep for the test) in the intervening years. Since no population of cytologists exists outside of the current annual testing system, no population of practitioners is available to study other testing intervals. The CMS should initiate a formal process using the Federal Advisory Committee Act to determine how proficiency should be evaluated within the current statute, and how often testing should be conducted.

Number of cytology challenges

Logistical concerns associated with a 20- to 100-slide exam include, but are not limited to: quality control; testing in terms of the number of consecutive days the lab is allowed to test and the number of examinees per slide set/challenge set; and the extended time needed to conduct the test. This will affect all laboratories; especially hard hit will be small laboratories with limited staff.

If there is a 20-slide challenge biennially, the impact on laboratory operation would be excessive. While testing, patient care services are disrupted and delayed. Testing removes physicians and cytotechnologists from their usual patient care activities. Routine Pap test screening, QC Pap test screening, nongynecologic cytopathology, surgical pathology, and clinical lab responsibilities are all interrupted by this process. A four-hour time frame underestimates the disruption of clinical services in laboratories where central screening is performed and a pathologist interprets and reports the Pap test at a remote site. Because this practice constitutes “normal working conditions” and is acceptable under the law, if the CMS implements a four-hour testing limit, it should also reconsider its position on cytology proficiency testing referral in this setting. It represents an onerous disruption of clinical care and affects patient safety.

Response categories

Criteria for an unsatisfactory specimen should not be defined in regulations. Criteria change as understanding of the disease process grows and our ability to recognize clinically significant changes improves. Criteria should be established by expert consensus, not by regulation. If the regulations continue to add more specificity, then they will become outdated again as technology advances.

A definition of adequacy is widely available and used, and under “normal working conditions,” these references can be accessed at any time. However, one of the current proposed definitions for “unsatisfactory” is incorrect, specifically: “absence of endocervical/transformation zone component.” A specimen that has the minimum number of squamous cells and has at least 25 percent of the specimen unobscured should not be identified as unsatisfactory if it does not have an endocervical or transformation zone (T-zone) component. In normal working conditions, specimens are not signed out as “unsatisfactory due to lack of endocervical/transformation zone component.” This information is provided only as an explanatory note in the specimen adequacy section of the patient report, not in the interpretation or general category. In the 2006 American Society for Colposcopy and Cervical Pathology guidelines, clinicians are advised not to call the patient back for a repeat Pap test when the Pap test is satisfactory but does not contain evidence of T-zone sampling. The patient is to continue with routine screening in one year if no clinical symptoms are present. (Davey D, et al. J Low Genit Tract Dis. 2008;12:71–81.)

If mandatory individual PT is to continue, no fifth response category should be required. The response categories for PT should reflect the biologic understanding and the clinical management of women at risk for cervical cancer. There should be three response categories reflecting the Bethesda “General Categorization” (unsatisfactory, negative for intraepithelial lesion or malignancy, epithelial cell abnormality). In the context of patient management, the CMS should reconsider the position of rejecting the Cytotechnology Education and Technology Consortium and American Society for Cytotechnology suggestion of combining LSIL (category C) and HSIL (category D) into one category. LSIL and HSIL constitute a morphologic spectrum, especially when the underlying lesion is CIN 2. The CAP Practical Guide to Gynecologic Cytopathology: Morphology, Management, and Molecular Methods reiterates this with the following statement: “In real practice situations, morphology represents a spectrum of change without sharp cut-offs between entities.”

Current patient management guidelines dictate that both categories (C and D) require the patients to be referred to colposcopy (depending on patient age). We recommend combining C and D into one category: “epithelial cell abnormality: further patient management required.” Since the Pap test is a screening process that requires patient triage and appropriate followup, a three-tiered response category best reflects current clinical practice. Using a three-tiered system beginning in 1996, the CAP Interlaboratory Comparison Program allowed the CAP Laboratory Accreditation Program to monitor the performance of laboratories; those laboratories that failed to obtain a 90 percent score during a testing cycle were identified, held accountable, and required to take corrective action.

There is an apparent incorrect assumption that the Pap test is a diagnostic test. It is a screening procedure that can identify patients who require further clinical management. It is unrealistic to assume that these response categories carry the same significance as a confirmatory diagnostic test.

Cytology challenges referencing

The initial review by anatomic pathologists serves as the entry point for a challenge into PT. The robustness of the challenge comes from the field-validation process. A blinded (unmarked) initial review is not required to identify those challenges that may perform well; the field validation establishes whether the challenge is referenced into the correct response category. The challenges should be initially evaluated and validated in the manner that they will be evaluated in the testing situation. Therefore, slides/challenges should be first screened by cytotechnologists, marked, and then evaluated by pathologists.

We recommend that cytotechnologists be included in the review process. Cytotechnologist screening, with review by three pathologists, constitutes the best method of entering challenges into the field-validation process.

Biopsy confirmation

No biopsy confirmation should be required on LSIL challenges. Studies have shown that LSIL is the most reproducible category in cytology, far more reproducible than CIN 1 biopsies. (Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA. 2001;285:1500–1505.) There are many reasons why the biopsy does not represent the gold standard for LSIL verification. These reasons include: resolution of HPV infections prior to the biopsy being performed, sampling issues during colposcopy, and variable reproducibility of CIN 1 histologic interpretation. The reference category for each individual challenge is determined by the field-validation process and ultimately not by biopsy confirmation. Therefore, biopsy confirmation should not be required for any PT challenge.

Validated cytology challenges

Only field-validated challenges should be used. “[Field-] validated slides showed higher concordance: laboratories 98.3 percent, pathologists 96.6 percent, and cytotechnologists 97.9 percent” (1997 Pap Year End Summary). The CAP has published many papers in relation to the slide characteristics of cases that have performed well and poorly in the CAP Interlaboratory Comparison Program from 1999 to 2005.

Stipulating validation criteria in the regulations would not allow flexibility when new technologies are introduced and our understanding of disease process changes. Criteria for field validation should be established by rigorous statistical analyses appropriate for each challenge. Validation criteria used for challenges must be transparent and must be made available to participants and any other interested parties.

Cytology challenge performance should be monitored continually. It is the CAP’s experience that challenges used in cytology PT suffer decline in performance due to technical factors. Monitoring criteria should be independently established by the PT provider in a manner appropriate for the technology of the challenge. Regulatory stipulation of criteria would not allow flexibility when new technologies are introduced. Any new validation requirement not now included in the regulations will add costs to the proficiency testing program.

Scoring scheme

Automatic failure should not occur if the participant responds with a category B (normal or benign change) when the field-validated challenge is category D (or epithelial cell abnormality). Statistically, the misclassification or false failure rate for competent individuals (true pass rate at 95 percent) is nine percent with a 10-slide set or eight percent with a 20-slide set. A single slide error is therefore more likely to reflect a false failure than an incompetent practitioner. Since the Pap test is a screening modality that requires patient triage and appropriate followup, a three-tiered response category best reflects clinical practice.

The unified three-tier scoring grid (see chart) includes cytotechnologist and technical supervisor (pathologist) responses for 20 challenges.

The pathologist and cytotechnologist should be scored using the same grid, regardless of whether the CMS or the CAP’s proposed scoring scheme is considered. By using a three-tiered system, the locator (cytotechnologist) and interpretive (technical supervisor) skills are being tested at the same time. In the CAP-proposed three-tiered system, both locator and interpretive skills are used to: 1) locate epithelial cell abnormalities, and 2) confirm epithelial cell abnormalities as is the practice under “normal working conditions.” This approach recognizes the morphologic spectrum of intraepithelial lesions and cancer.

Re-testing and remediation

We recommend that the PT provider include at least the following information for the participant: number of the case missed, participant’s response by category, and reference response by category. For those who do not score 90 percent or better, the information related to the incorrect response categories should be given to the laboratory director who is responsible for providing documented remedial training and education in the area of deficiency on subsequent re-tests.

In a mandatory educational testing model, test participants should receive timely feedback, including potential areas of improvement, confirmation of correct responses, and additional clinical or morphologic followup, as available.

Appeals process

Criteria for appeals should not be established by the CMS and/or written in the regulations. Each provider should establish its own appeals process, and that process, whatever it is, must be transparent and available to the participant.

The CAP has always made the appeals process available to participants. Only scores less than 90 percent are considered for an appeal. The appeal includes at least three referees (similar to clinical lab PT requirements). If at least one referee disagrees with the reference category, the appeal is granted, and the participant receives a written response that includes the explanation that the challenge is removed from future PT events (it may be included in educational activities only) (Crothers BA, et al. Arch Pathol Lab Med. 2009;133:44).

Proctors

Regulations should not specify criteria for the proctor. While the CMS must approve the criteria, training, and oversight of proctors in PT provider applications, placing specific criteria in the regulations does not allow for improving the proctor process.