Q & A

April 2012

Editor:
Fredrick L. Kiechle, MD, PhD

Q. Regarding the subject of minimal residual disease (MRD), how does one reconcile the discrepant percentages of residual blasts among morphology, flow cytometric analysis (in-house), FISH, PCR, and, lastly, the flow cytometric analysis result reported by the Children’s Oncology Group (COG)? We have several cases in which the in-house flow, morphology, and FISH were negative, while the COG report was positive for MRD. What is the acceptable percentage range of discrepancy among laboratories for flow cytometric analysis?

A. Minimal residual disease can be detected by a variety of techniques and the same approach can be applied to virtually any disease process, provided there is a distinguishing biomarker to differentiate normal from neoplastic cells. All methods, including morphology (~1 percent threshold of detection), FISH (~0.1 percent threshold of detection), nucleic acid amplification (~0.001 percent threshold of detection), and flow cytometry (~0.01–0.001 percent threshold of detection), have varying sensitivities and sources of interference. Additionally, different methods are better suited to particular hematolymphoid malignancies. Thus, it is increasingly recognized that no single technology can be universally recommended; often a combination of testing modalities is required to accurately determine the level of MRD.
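These detection thresholds translate directly into how many cells, or cell equivalents, must be examined to call a sample positive. The brief sketch below works through that arithmetic; the minimum cluster of 20 abnormal events is an illustrative assumption, not a requirement of any particular protocol.

```python
def events_required(sensitivity_percent, min_abnormal_events=20):
    """Total events that must be acquired so that a population present at
    `sensitivity_percent` of all events would contribute at least
    `min_abnormal_events` abnormal events (simple proportional arithmetic,
    ignoring Poisson counting error)."""
    fraction = sensitivity_percent / 100.0
    return int(min_abnormal_events / fraction)

# Thresholds quoted in the text, used here purely for illustration
for label, pct in [("morphology", 1.0), ("FISH", 0.1),
                   ("flow cytometry", 0.01), ("PCR", 0.001)]:
    print(f"{label}: ~{pct}% sensitivity needs >= {events_required(pct):,} events")
```

Run as written, the sketch shows why a 0.01 percent flow threshold implies acquiring on the order of 200,000 events, while a 0.001 percent PCR-level threshold implies interrogating millions of cell equivalents.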

The factors contributing to differences in MRD detection among laboratories include not only differences in the method(s) chosen, but also methodologic differences in applying a particular technique, preanalytical differences in obtaining the sample (for example, testing different aliquots of bone marrow from different aspirate samples), and variability in the experience and competence of the laboratories performing the testing. The intrinsic biologic diversity of neoplasms, as well as changes in phenotype and genotype that can occur after therapy, further complicates all of these variables. The literature comparing methods can be confusing, and in some cases the confusion is traceable to competency-related differences in using the various techniques or to subconscious bias in the study design favoring one technique. Moreover, even in studies showing excellent correlation between methods or laboratories, there are always occasional samples that give inexplicably divergent results.

The COG acute lymphoblastic leukemia (ALL) flow cytometry MRD studies are performed in two laboratories highly skilled in flow cytometry, one directed by Brent Wood, MD, PhD, at the University of Washington and the other by Michael J. Borowitz, MD, PhD, at Johns Hopkins. These two labs have standardized procedures that have been validated not only analytically but also clinically to ensure similar diagnostic sensitivity and accuracy between the two testing sites. Dr. Wood’s summary of the issues follows, in his own words:

Despite all these potential causes of discordance, in my experience discrepancies in enumeration are within ±1 log in most cases once denominator effects are taken into account. More specifically:

Morphology versus flow. Morphology has limited sensitivity and specificity and is clearly inferior to flow in most cases. Specific to ALL, there is a subset of cases in which the blasts acquire a more mature morphologic appearance after therapy that makes them difficult to distinguish from mature lymphocytes, and thus morphology can grossly underestimate them (we saw an in-house example of this recently). I’m not sure whether this is well documented in the literature, however. Different denominators are commonly used (white cell events for flow and all nucleated cells for morphology; COG ALL studies use mononuclear cells, but I’m not certain about COG acute myeloid leukemia studies [Loken]), and each is subject to its own enumeration bias due to hemodilution, distributional, and processing artifacts, particularly in bone marrow.

FISH versus flow. Concordance is generally good, but enumeration can be problematic, particularly if cultured cells are used as the starting point for FISH. Some abnormalities seen at diagnosis by FISH may not be present in residual abnormal blasts after therapy, probably due to therapeutic selection, although this is probably relatively infrequent. We conducted a study directly comparing flow and FISH MRD in a large number of samples from patients post-transplant for AML (Fang M, et al. Cancer. 2011 Sep 16 [Epub ahead of print]), and while the concordance is generally good, there are clearly discrepancies in both directions (more FISH– flow+), but the clinical outcomes appear essentially the same when the discordant categories are compared with the concordant ones; that is, MRD detected by either technique is equally bad.

PCR versus flow. PCR is clearly one log or so more sensitive than flow when either translocation analysis or IgH/TCR patient-specific primers are used. Enumeration with PCR is problematic and it can be difficult to directly compare qPCR with flow numbers. The molecular abnormalities targeted by PCR assays may not always be retained following therapy, and thus false-negative results occur in a subset of patients. Flow offers a more integrated assessment of abnormality and so may be less dependent on specific molecular lesions, but this has not been rigorously shown. I think laboratory competence in performing each of these assays is a major factor in discordance. Both PCR and FISH are somewhat more objective to interpret, and the technical factors are reasonably well understood and somewhat standardized, so that results for either are less likely to be subject to technical or interpretive artifact. On the other hand, flow cytometry is largely unstandardized, with most labs not having appropriate experience with MRD to be truly competent. There is no widely available proficiency testing or other reference method for comparison (aside from those discussed here, each with its own problems), so labs generate results that are largely unsubstantiated. As a result, I think the quality of MRD data from flow labs is highly variable and subject to significant technical and interpretive artifact, a situation that may not be possible to correct before molecular testing gains the upper hand.
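To illustrate the denominator and log-scale effects described above, the short calculation below uses invented counts to show how the same number of residual blasts yields different MRD percentages under different denominators, and how wide a ±1 log window around a single result actually is. All numbers are hypothetical and for illustration only.

```python
# Hypothetical bone marrow sample: same blasts, different denominators
residual_blasts   = 50
white_cell_events = 100_000    # flow-style denominator (white cell events)
all_nucleated     = 150_000    # morphology-style denominator (adds nucleated RBCs, etc.)

mrd_flow  = 100 * residual_blasts / white_cell_events   # 0.050%
mrd_morph = 100 * residual_blasts / all_nucleated        # 0.033%
print(f"white-cell denominator:    {mrd_flow:.3f}%")
print(f"all-nucleated denominator: {mrd_morph:.3f}%")

# A ±1 log window around the 0.05% result spans a 100-fold range
print(f"±1 log range: {mrd_flow / 10:.3f}% to {mrd_flow * 10:.3f}%")
```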

Dr. Borowitz of Johns Hopkins agreed with Dr. Wood’s comments and added the following: “Cost should not be an issue,” he said, “at least not in the sense of duplicating cost, as all our studies are grant funded. One could make the case that it is not appropriate for local institutions to charge patients in addition when the reference laboratories are generating the information needed for clinical decision-making per protocol.…”

Dr. Borowitz continued: “As to local institutions not detecting MRD when we do at the reference labs, I don’t find it to be that common among what I would consider good labs that understand what MRD testing is about. There are quantitative differences, to be sure, and these likely relate to the things Dr. Wood talked about, as well as the additional factor of shipping. However, I have also seen anecdotal negative cases from labs that perhaps don’t have much experience where the dot plots suggest to me that they would not have had the capability to detect abnormal populations much below one percent or so.”

He added a valuable warning: “Basically, MRD is a different test from leukemia phenotyping analysis and should probably be treated as such. At some point, it should probably be given its own test code, and someone should develop a separate proficiency testing procedure.”

These experts raise valid issues that warrant further discussion, perhaps in the context of the coming debate over who should appropriately regulate laboratory-developed tests, since even the leukemia phenotyping performed in nearly every diagnostic flow cytometry laboratory is clearly an LDT. MRD detection is, by anyone’s definition, a high-complexity LDT, and it challenges the CAP and the Food and Drug Administration to determine how best to regulate the practice to ensure quality testing.

In summary, there is no defined consensus on acceptable discrepancies among the methods of MRD detection, but such discrepancies are worth noting in a comment because an indication of measurement uncertainty might influence a clinical or therapeutic decision. The expectation of some level of discordance, given the technical variables noted above, and the evolving implications for patient management speak to the advantages of a consolidated or integrated hematopathology interpretive report, as opposed to separate reports from the surgical pathology, molecular pathology, flow cytometry, and cytogenetics laboratories that leave the treating physician to integrate the information into the correct disease status or subclassification. Laboratories that receive reports from clinical trial reference laboratories, such as those of the COG, might also treat the comparison as a form of proficiency testing for their own work. However, if clinical decisions are based on the outside result, the appropriateness of performing and billing for the internal study might be questionable.

Suggested Reading

1. Wertheim GB, Bagg A. Minimal residual disease testing to predict relapse following transplant for AML and high-grade myelodysplastic syndromes. Expert Rev Mol Diagn. 2011;11:361–366.
2. Walter RB, Gooley TA, Wood BL, et al. Impact of pretransplantation minimal residual disease, as detected by multiparametric flow cytometry, on outcome of myeloablative hematopoietic cell transplantation for acute myeloid leukemia. J Clin Oncol. 2011;29:1190–1197.
3. Thörn I, Forestier E, Botling J, et al. Minimal residual disease assessment in childhood acute lymphoblastic leukaemia: a Swedish multi-centre study comparing real-time polymerase chain reaction and multicolour flow cytometry. Br J Haematol. 2011;152:743–753.
4. Tembhare P, Yuan CM, Xi L, et al. Flow cytometric immunophenotypic assessment of T-cell clonality by Vβ repertoire analysis: detection of T-cell clonality at diagnosis and monitoring of minimal residual disease following therapy. Am J Clin Pathol. 2011;135:890–900.
5. Coustan-Smith E, Song G, Clark C, et al. New markers for minimal residual disease detection in acute lymphoblastic leukemia. Blood. 2011;117:6267–6276.
6. Uhrmacher S, Erdfelder F, Kreuzer KA. Flow cytometry and polymerase chain reaction-based analyses of minimal residual disease in chronic lymphocytic leukemia. Adv Hematol. 2010;Sep 20.
7. Stow P, Key L, Chen X, et al. Clinical significance of low levels of minimal residual disease at the end of remission induction therapy in childhood acute lymphoblastic leukemia. Blood. 2010;115:4657–4663.
8. Stahl T, Badbaran A, Kröger N, et al. Minimal residual disease diagnostics in patients with acute myeloid leukemia in the post-transplant period: comparison of peripheral blood and bone marrow analysis. Leuk Lymphoma. 2010;51:1837–1843.
9. Fang M, Storer B, Wood B, et al. Prognostic impact of discordant results from cytogenetics and flow cytometry in patients with acute myeloid leukemia undergoing hematopoietic cell transplantation. Cancer. 2011 Sep 16 [Epub ahead of print].

Bruce H. Davis, MD
Treasurer, International Council for
Standardization in Haematology
President, Trillium Diagnostics
Bangor, Me.

Member, CAP Hematology/Clinical
Microscopy Resource Committee

Q. Can you guide me on the steps needed to set up image analysis for quantification of estrogen receptor (ER), progesterone receptor (PR), and HER2/neu immunohistochemistry?

A. Quantification of immunostaining with antibodies directed against ER, PR, and HER2 by image analysis has been used increasingly in laboratories to improve the accuracy and consistency of measurements. Approximately 25 percent of laboratories that perform ER, PR, and HER2 testing and participate in CAP proficiency testing use image analysis to report these markers. For billing purposes, Medicare allows use of CPT code 88361 for automated image analysis, in contrast to the manual quantitation code 88360.

With advances in digital pathology, quantitative image analysis has become more easily accessible to pathologists and can be incorporated into their daily workload. A few companies offer commercially available digital pathology systems with immunohistochemistry quantification (“Parade of hopefuls in digital pathology,” CAP TODAY, February 2011).

The two most important characteristics to evaluate in a new system are the speed of digitization and the image quality. Once a system has been selected and installed, clinical validation must be performed. While there are no defined criteria for the validation, the following are suggestions based on our experience. At least 25 positive cases of differing intensity and percentage for each immunostain, previously evaluated manually by the pathologist, should be run in parallel on the image-analysis system. The results of the two methods should roughly concur, with the difference in percentage not exceeding 10 to 15 percent. It is essential to understand that quantitation of immunostains by image analysis is done through evaluation of eight to 15 representative areas, and therefore some discrepancy between manual and automated quantitation can and often does occur because of staining variability and operator subjectivity in choosing the representative areas. In addition, at least 10 negative cases should be run in parallel, with complete concordance expected.
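For laboratories that wish to formalize the parallel comparison just described, a minimal sketch of the tally is shown below; the 15-percentage-point limit, the case data, and the variable names are assumptions for illustration rather than a published standard.

```python
# Paired (manual %, image-analysis %) results for hypothetical validation cases
positive_cases = [(90, 85), (40, 52), (5, 12), (70, 68)]   # e.g., ER % positive nuclei
negative_cases = [(0, 0), (0, 0), (0, 2)]                   # complete concordance expected

MAX_DIFF = 15  # maximum acceptable difference in percentage points (assumed limit)

for i, (manual, image) in enumerate(positive_cases, 1):
    diff = abs(manual - image)
    status = "OK" if diff <= MAX_DIFF else "REVIEW"
    print(f"positive case {i}: manual {manual}%, image {image}%, diff {diff} -> {status}")

discordant_negatives = [pair for pair in negative_cases if pair[0] != pair[1]]
print(f"negative cases discordant: {len(discordant_negatives)}")
```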

Once the system has been validated, the steps to ensure quality include the following:

  • Daily calibration of the system through a calibrator (control) slide provided by the company.
  • Daily evaluation of known positive and negative controls to ensure reproducibility.

Image analysis has its limitations. The instrument is not able to differentiate between benign and malignant cells. Therefore, it is important that the pathologist be directly involved in selecting the area to be analyzed. The ideal area for study should contain mostly tumor cells, with minimal benign tissue and inflammatory cells and without necrosis or hemorrhage. In addition, the instrument might identify nonspecific brown staining and report it as positive. With nuclear stains such as ER and PR, cytoplasmic staining can sometimes be misinterpreted by the instrument as positive. Therefore, it is also crucial that the pathologist review the instrument’s final readings and correlate them manually with the immunostains to avoid false-positive results.

As the technology of digital pathology and image analysis evolves, some of these limitations may no longer be of importance. However, with current technology the pathologist must continue to play a pivotal role in automated quantitation of immunostains.

Randa Alsabeh, MD
Director, Immunopathology Laboratory
Director, Hematopathology Fellowship
Associate Professor, Cedars-Sinai Medical Center
Associate Clinical Professor
University of California, Los Angeles

Member, CAP Immunohistochemistry Committee


Dr. Kiechle is medical director of clinical pathology, Memorial Healthcare, Hollywood, Fla. Use the reader service card to submit your inquiries, or address them to Sherrie Rice, CAP TODAY, 325 Waukegan Road, Northfield, IL 60093; srice@cap.org.