Making a valid point about HPV tests

CAP TODAY

September 2005
Cover Story

Karen Titus

Testing for human papillomavirus was bound to become big, and it has. Supported by data from the National Cancer Institute’s ALTS trial, HPV testing now has a substantial role in managing women with ASCUS and as a screening tool for cervical cancer.

The first boost came from consensus guidelines from the American Society for Colposcopy and Cervical Pathology (Wright TC Jr, et al. JAMA. 2002;287:2120-2129), followed by interim guidance from the NCI, ASCCP, and the American Cancer Society (Wright TC Jr, et al. Obstet Gynecol. 2004;103:304-309). The latter recommended coupling cervical cytology with HPV DNA testing to screen women age 30 and older; those with negative results on both tests should be rescreened every three years—a significant departure from previously accepted practice. Little wonder that the cover of the August 2003 issue of Archives of Pathology & Laboratory Medicine floated the question, "Are you ready for a new era in cervical cancer screening?"

That era has begun—but it’s unfolding in a way that many find disturbing.

This story begins with a phone call. In mid-March, Mark Schiffman, MD, MPH, called CAP TODAY’s editor to voice a troubling concern: that laboratories are failing to clinically validate their HPV tests.

Dr. Schiffman heads the HPV Group in the Division of Cancer Epidemiology and Genetics at NCI and is a tenured senior investigator. His byline should be familiar to anyone who’s so much as glanced at the HPV literature. Dr. Schiffman’s abiding interest is HPV, and HPV alone. "I am quite narrow in my focus," he says. "I went into a specialty in order to try to speak with confidence on just a few things."

The expansion of HPV testing parameters has brought about explosive growth in tests, as might be expected. The FDA has approved one test, the Hybrid Capture 2 assay (Digene Corp.), used in conjunction with ThinPrep (Cytyc Corp.) specimens, but it’s hardly the only test out there. Ideally, this would expand options for laboratories and, one would hope, reduce costs, as healthy marketplace competition is wont to do.

But in two subsequent interviews with CAP TODAY, Dr. Schiffman says labs are stumbling badly.

His case is straightforward. Laboratories that use HPV tests need to make sure those tests are clinically validated. For labs that use the HC2, clinical validation is a non-issue; it was part and parcel of its FDA approval. Labs that use another method, however, need to do their own studies. But, he says, that doesn’t always happen.

Federal oversight of this is somewhat mushy. The FDA does not hold sway over certain homebrews, including those for HPV, although the agency has issued an opinion that HPV tests cannot be offered as analyte-specific reagents (ASRs). The director of the CMS Division of Laboratory Services, Judy Yost, MA, MT, says that lab directors who select tests not approved or cleared by the FDA must determine clinical usefulness by other means, such as literature searches, their own studies, or consultation with colleagues. CLIA regulations do not require that determination, however; they do require that the laboratory establish the test’s analytical performance specifications.

The lack of official oversight is not a free pass. "Because there’s no regulatory oversight, it is very, very important that laboratories do appropriate validation studies," says David Wilbur, MD, director of cytopathology at Massachusetts General Hospital and associate professor of pathology, Harvard Medical School.

Labs that can’t prove the clinical value of their HPV test have "no business" offering it, Dr. Schiffman says, adding, "It’s amazing to me that someone would sell a product that’s influencing a patient’s life in terms of treatment for cervical neoplasia without being sure, based on data, that they can do it again and again and again with reliability."

Tellingly, even those who might be inclined to disagree with Dr. Schiffman don’t—at least not on his major point.

David Bolick, MD, has plenty of thoughts, most of them unconventional and bluntly expressed, about HPV testing. Dr. Bolick, director of the GYN Institute, a laboratory within AmeriPath focusing on gynecologic pathology, argues strenuously that there is room for labs to use non-FDA-approved methods. But his bottom line tracks Dr. Schiffman’s: For labs that use a non-FDA-approved method, he says, validation studies are absolutely necessary.

Ronald McGlennen, MD, president, medical director, and founder of Access Genetics, also says there’s room for non-FDA-approved tests, a point driven home by the fact that his company offers a PCR-based HPV method to its client labs. But he too stresses the need for appropriate clinical validation. For labs that are sending out their HPV tests, he says, "it’s absolutely reasonable to ask" the reference lab for concrete clinical validation data; indeed, he says, such data should be published. "I think you should even ask the larger question: ’What are your statistics on ASCUS diagnoses?’"

Data on how many labs use non-FDA-approved methods are hard to come by. R. Marshall Austin, MD, PhD, director of cytopathology at Magee-Womens Hospital of the University of Pittsburgh Medical Center, simply says, "Oh, it’s happening."

Mark Stoler, MD, says the problem "is a major concern, not ’some’ concern. It’s beyond anecdotal."

He, too, lacks hard numbers to define the extent of the problem, but he characterizes it this way: "There are large commercial labs that have homebrew HPV tests, particularly PCR-based tests, and they are not validated. There’s no way they could ever validate them independently, although people disagree on this," says Dr. Stoler, professor of pathology and clinical gynecology and associate director of surgical pathology and cytopathology, University of Virginia Health System, Charlottesville.

When asked about the extent of the problem, Dr. Schiffman pauses a long time before replying, "I don’t know."

"I certainly see, in the chat areas of the different organizations, at conferences, on the Internet, advertisements and statements that are troubling, because they’re indicating an excessive faith in poorly validated assays," he says.

The calls for clinical validation stem in part from a basic, if not widely understood, issue—the difference between the test’s positive and negative predictive values.

The essential question for ASCUS triage, says Dr. Stoler, is, What is the sensitivity of the HPV test, and therefore its negative predictive value, in patients who have equivocal cytology for high-grade lesions? Many physicians, however, focus instead on the positive predictive value of the test, that is, the likelihood of finding high-grade lesions with colposcopy. The problem, he says, is colposcopy is a terrible gold standard, missing anywhere from one-third to one-half of high-grade disease.

Physicians who emphasize the test’s positive predictive value overlook its real value—if the test is negative, what is the chance that the patient has high-grade disease? Given the recommendations for lengthier screening intervals, that’s no small question.

The ALTS trial and other primary screening and triage trials using HC2 established a negative predictive value of 99-plus percent for the Digene test, Dr. Stoler says. In simple terms, that means that when a woman is told she’s negative, there’s less than a one percent chance that she has high-grade disease. "That’s coming out of patients who have low-grade cytology or ASCUS cytology, which is a very different thing than the Pap smear that already has high-grade abnormal cells—you don’t even need the HPV tests for those," says Dr. Stoler.
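The arithmetic behind Dr. Stoler’s point can be sketched in a few lines. The example below is only an illustration: the prevalence, sensitivity, and specificity figures are hypothetical inputs chosen to show the pattern, not results from ALTS or from any particular assay, and the function name is ours.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test used in a population with the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs chosen for illustration only; they are not ALTS results.
PREVALENCE = 0.10    # assumed share of equivocal-cytology patients with high-grade disease
SPECIFICITY = 0.50   # assumed clinical specificity, held equal for both tests

ppv_a, npv_a = predictive_values(0.96, SPECIFICITY, PREVALENCE)  # Test A: high clinical sensitivity
ppv_b, npv_b = predictive_values(0.80, SPECIFICITY, PREVALENCE)  # Test B: lower clinical sensitivity

for name, ppv, npv in [("Test A", ppv_a, npv_a), ("Test B", ppv_b, npv_b)]:
    print(f"{name}: PPV {ppv:.1%}, NPV {npv:.1%}, risk after a negative result {1 - npv:.1%}")
```

Under these assumed inputs the two tests have similar, unimpressive positive predictive values, yet a negative result on Test A leaves a bit under 1 percent risk of high-grade disease while a negative on Test B leaves more than 4 percent. That gap, not the positive predictive value, is what matters when a negative result is used to lengthen the screening interval.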

There’s a reason for Dr. Stoler’s elementary explanations—many physicians, and even some laboratorians, are confused by these basic matters, he says.

"Lots and lots of labs say, ’We think the PCR test is more sensitive because we can pick up fewer DNA copies.’ That has nothing to do with what we’re talking about," says Dr. Stoler.

"If you’re going to bring forward a test, you’ve got to do a clinical validation trial that establishes its performance relative to these other benchmarks," he continues. "And the standard is not analytical molecules of DNA. It’s not the analytic validation that matters, it’s the clinical validation—how does the test perform in the real world? How sensitive are you with finding high-grade disease in a population of minimally abnormal cytology patients?"

Labs that claim PCR-like supersensitivity for their tests are barking up the wrong tree. "It’s hard to make people realize this," Dr. Schiffman says. "You don’t want to detect HPV unless it’s relevant to the true disease state that we’re screening for": risk of cervical cancer, based on finding certain levels of particular and persistent oncogenic types. It’s a bit counterintuitive, he says, noting that detecting HIV or hepatitis B and C viremia, in and of itself, is important. But that’s not true for HPV. "It’s a much more shifty kind of diagnostic target than just saying, ’I need the most sensitive detection at all costs.’"

Much of the confusion simply boils down to "the fact that the statistical distinction between analytical and clinical accuracy is not well enough understood or described by people who write or talk about it," says Attila Lörincz, PhD, chief scientific officer and senior VP of research development at Digene, who contends the problem surfaces in the HPV literature with distressing regularity.

If observers agree that clinical validation is critical, not everyone agrees on how to do it, or even if it can be done.

Dr. Bolick says validation studies—clinical and analytical—are "left up to the discretion of the medical director, and in my opinion, most medical directors have no competency in designing them." More specifically, he says, most physicians lack the depth of understanding to know what makes an HPV assay tick.

HPV consists of more than a hundred different viral types, explains Dr. Lörincz, some of which are oncogenic and most of which are not. "There are also a whole series of interrelated members, and you need to be able to distinguish between the ones that have been strongly linked to high-grade disease and cancer, and those that have not. And you have to be even more sophisticated with that because you have to detect them at a critical clinical viral threshold."

By virtue of his position at Digene, Dr. Lörincz might be expected to dismiss any efforts to clinically validate a test that would compete with HC2. But he doesn’t.

The ALTS trial, with its 5,000 subjects, and many other large studies with up to 150,000 women did the grunt work of validating the concept and clinical value of HPV testing. Labs wanting to use another test "don’t have to do the heavy lifting in the sense of re-proving ALTS again," says Dr. Lörincz. But, he says, they do have to prove in a broad setting, with different patients and different physicians, that their test will perform at a sufficiently accurate level and provide predictable results. That can’t be done on "just a couple hundred patients here and there, or by taking 50 samples and doing a test, or doing analytical sensitivity and saying, ’Hey, look, we can detect X copies.’"

By and large, what he’s seen other labs do falls short of the mark. "We spent tens of millions of dollars validating this test," he says. "For someone to come along and run 70 or 80 patients verges on the insult to everybody." Dr. Lörincz suggests it might cost a lab anywhere from a few hundred thousand dollars to perhaps a few million dollars to validate a test sufficiently.

Those who blanch at this need to realize the HPV test is being applied widely, he says, amplifying any test deficiencies across a huge number of women. "You can’t just say that it’s a small, little esoteric homebrew and there’s nothing else available, and because it’s an orphan diagnostic, we can’t afford to spend a lot of money."

Labs claiming they don’t have the resources or ability to clinically validate their assays draw a pointed question from Dr. Schiffman. "Why are they selling the assay?" he asks.

It’s not as if these labs are trying and failing to reach the standards of ALTS, he says; rather, they’re not doing any clinical validation at all. "Sometimes you’ll see no data, not a single paper, not a single article, no internal data available on request."

Dr. Stoler puts a high burden on labs, arguing that homebrew HPV tests can’t be clinically validated without essentially doing one arm of the ALTS trial. "That’s my opinion. Not everyone agrees with me." A lab that met his standard would have to be "incredibly dedicated," he says, along with having plenty of money and a patient population with sufficient numbers of women with high-grade disease who are followed up with colposcopy and biopsy.

Dr. Wilbur reports seeing several labs’ validation studies that merely compared performance with the HC2 and another, non-FDA-approved test on the same patient at the same time. Doing a hundred of these prospectively and claiming there’s no difference statistically between performance raises a red flag, says Dr. Wilbur, because most of that prospective population is negative. "You’re really not testing the sensitivity and the specificity of the test adequately," he says. "You need to have a true positive test, not only in atypicals, but you also need a significant number of high-grade lesions and low-grade lesions in your study set so you can actually compare the performance in the lesions that you’re trying to detect."
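Dr. Wilbur’s objection to small, mostly negative study sets can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, which is our choice for illustration and not a method attributed to Dr. Wilbur; the case counts are hypothetical, and the function name is an assumption.

```python
import math


def wilson_interval(detected, cases, z=1.96):
    """Wilson score interval (about 95% when z=1.96) for the proportion detected/cases."""
    if cases <= 0:
        raise ValueError("need at least one confirmed case")
    p = detected / cases
    denom = 1 + z**2 / cases
    center = (p + z**2 / (2 * cases)) / denom
    half = z * math.sqrt(p * (1 - p) / cases + z**2 / (4 * cases**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical study sizes, for illustration only.
# A prospective run of 100 samples might contain only a handful of high-grade cases.
small_low, small_high = wilson_interval(detected=5, cases=5)
# A study set deliberately enriched with high-grade lesions.
large_low, large_high = wilson_interval(detected=95, cases=100)

print(f"5 of 5 detected:    sensitivity between {small_low:.0%} and {small_high:.0%}")
print(f"95 of 100 detected: sensitivity between {large_low:.0%} and {large_high:.0%}")
```

Even a perfect score on five hypothetical high-grade cases is statistically compatible with a true sensitivity of roughly 57 percent, whereas 95 of 100 narrows the plausible range to the high 80s and above. That is the practical reason to enrich a validation set with the lesions the test is meant to find.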

When he and colleagues looked at validating their HPV test—they use HC2 with SurePath specimens (TriPath Imaging Inc.)—he was cognizant of other confounding factors, which he urges labs to consider.

In a patient with high-grade disease, he notes, the viral load may be lower than in a patient with a productive low-grade HPV infection, which is of little clinical concern. Labs may find it fairly easy to detect the low-grade disease with very good specificity and a particular sensitivity, but could struggle to detect the high-grade disease. Labs must verify that the sensitivity of the test is high enough to detect such lesions, as opposed to what he calls the "no-brainers" involving high viral load.

"That to me is the most important thing," he says. "So when we did our validation study, we made sure we had enough HSILs in the population to make us comfortable that we could detect those HSILs in comparison to the concomitantly collected [FDA]-approved sampling device. We specifically went to our colposcopy clinic, where we see a lot of high-grades, and told our clinicians to send us samples for this validation in which they thought the patient was at high risk for having high-grade disease."

Of course, there’s another solution—simply use the FDA-approved method. Dr. Austin calls this the conservative choice.

Dr. Austin points to an editorial by Jack Bierig in Archives of Pathology & Laboratory Medicine (Liability and Payment Issues in the Selection of Pathology Assays 2002;126:652-657) that explores the liability and payment issues of assay selection. Choosing a non-FDA-approved method is neither illegal nor improper, both men point out—as long as there is sufficient evidence to support the choice. "It does put an extra burden on the director to establish that this is a prudent choice," says Dr. Austin.

"One of the things that the Hybrid Capture 2 test established in the ALTS trial was that it really is a very respectable, reproducible test from site to site. While it may not be the ideal test, or the perfect test, it has a lot of powerful performance characteristics that can’t be overlooked," says Dr. Austin.

Labs that don’t punch the conservative ticket have plenty of candidates and combinations to choose from. "Many, many companies are coming out with HPV diagnostics, because there’s clearly now an established market," says Dr. Stoler, whose lab uses HC2. "The market should grow. I mean, we haven’t scratched the surface for primary screening application in the U.S."

Dr. Wilbur says he and his colleagues chose to use the SurePath collection device because they prefer its morphology over that of ThinPrep. It was also less expensive. Finally, at the time the lab was making its decision, SurePath was the only available choice for automated screening.

PCR is a common choice. Christopher Crum, MD, director of women’s and perinatal pathology at Brigham and Women’s Hospital, Boston, says his laboratory uses a PCR test through Access Genetics in addition to HC2, "mainly because there’s no recourse in certain samples"—fixed tissue and small or expired samples—"other than to use a PCR-based assay."

PCR also offers typing of the virus, which appears to be significant in a number of followup algorithms and may make it easier for laboratories to tell the difference between persistence and reinfection. "Certainly the field is moving in this direction, and the competitors to Digene are trying to sell type-specific assays that give you more information," Dr. Crum says.

At this point, however, there’s plenty of confusion as to what each type means, how the types interact (or not), and what, exactly, it all spells in terms of followup. (Next year the ASCCP will revisit its guidelines, says Dr. Schiffman, and much of this should get hashed out.) Dr. Crum notes another potential downside to typing: risk might be elevated regardless of type. HPV-61 is considered relatively benign, for example, but a woman with this type may nonetheless be at risk simply because she was sexually active in acquiring it.

Providing information to doctors on dozens of different HPV types when only 13 have been shown to be important is potentially a dangerous precedent, suggests Dr. Lörincz. "But yet labs sort of flex their muscles by saying, ’I can detect more HPV types than you can.’" What he’d like laboratories to do instead is talk to physicians about the utility of HPV testing and explain the weaknesses and limitations of each test method, including HC2. Otherwise, he says, he fears a free-for-all with the potential for doing more harm than good. HPV testing is well accepted for ASCUS triage, but he sees apprehension and even pushback, verging on hostility, in terms of using it as a screening test, from clinicians unwilling to accept the longer screening interval.

Most discussions about HPV testing turn into long and winding trails, and invariably, at some point, proponents of each method start kicking everyone else’s methods in the shins. That can only add to the confusion for laboratories trying to make an intelligent choice.

For every criticism launched against HC2 (and there are plenty), there is someone to counter it. For every advantage that PCR appears to carve out for itself, there are those who question said advantage.

Dr. Schiffman warns against overreacting to the problems associated with HC2, noting that no test is perfect; those who emphasize HC2’s flaws tilt the discussion in a harmful direction. "I’m not really in the mood to trash the standard," he says. "I do not want to see decades of careful research lessened in their impact by sloppy application or sloppy thinking. If a well-meaning laboratory applies an HPV test that doesn’t work right, then a beneficial technology has just been made malignant."

Even Dr. Bolick, who barely pauses for breath in between his criticisms of HC2, says that highlighting its shortcomings could lead to a worse problem: Physicians will lose confidence in HPV testing at exactly the same time more tests should be done. "There are limitations about the current FDA-approved test that people should know about. But when people venture into using tests that are not FDA approved, they really need to know what they’re doing. It’s not a casual venture," Dr. Bolick says.

One topic that gets everyone nodding in agreement again is shared concern about in situ hybridization (ISH). "I’m not aware of any published literature yet that establishes that its sensitivity is equivalent to Hybrid Capture 2," says Dr. Austin. "In terms of peer-reviewed published data, it’s not there. And that would make me nervous."

Dr. Schiffman dismantles current ISH tests with a series of questions. "The key question with in situ hybridization is clinical sensitivity," he says. "Is it virtually always positive when disease is present? And if that’s true, where are the supportive data to prove that? Where’s the proof? What are its positive and negative predictive values related to the endpoint of screening, which is CIN 3 or cancer—not just detection of molecules of HPV in the mixture. Where’s the proof?"

If the demands and details of clinical validation weren’t overwhelming enough, there’s another matter to consider: money.

"Part of this whole problem," says Dr. Stoler—and here he sighs heavily—"is economic consideration. Homebrew tests are cheaper than FDA-approved kits."

"There’s so much money involved," Dr. Schiffman agrees.

Ask lab directors why they use ISH, for example, and they typically would cite the ability to look at HPV in the cells, ease of use, and exceptional quality control, says Dr. Bolick. "Reimbursement rates for the two different methods are disproportionate, providing a strong financial incentive to choose the ISH assay. But choosing an assay on the basis of a financial motive, especially in the absence of safety data, should be strongly discouraged."

Dr. Bolick says he meets with passive resistance when he raises the issue. "The people who are doing this tend to be refractory to that kind of information," he says. "I’ve talked to dozens and dozens of folks, and the primary concern is, ’It seems to work just fine for me, I like the way it looks. And it makes good money. What’s wrong with that?’"

Obviously, there’s plenty wrong with that if labs aren’t asking the hard questions about HPV tests and clinical validation.

Dr. Schiffman sets the tone: "What I get very tired of is the cult of expertise. I think everyone, including me of course, should be challengeable and every statement should be backed by something, not ’in my experience’ or ’I believe that.’ What someone says isn’t what’s important; I want to know why they say it."

It’s reasonable for labs to ask for clinical validation information, but it’s probably not the norm, says Dr. Austin. Clinical validation issues get buried by assumptions instead—that studies are done, that they’re done well, and that they actually validate the test. "It would probably be only the more compulsive directors who would ask for that type of information," says Dr. Austin. "Then they would have to be able to interpret the validation studies and make a judgment about whether they were adequate, which is not a small order."

On the flip side, he says, reference labs that are asked for their clinical validation data need to provide it. But that can also be dodgy. "The manufacturers will often come in and say, ’This is the validation protocol that you should do,’ which may be a very minimal protocol. And the laboratory, which is simply looking to efficiently introduce the test, is likely to accept the manufacturer’s suggested protocol.

"Now, the manufacturer obviously has an incentive to get them to do the test," he continues. "And they have all sorts of devices for distancing themselves from the process. They say, ’Well, we can’t recommend these protocols, but this is one that other labs have used successfully, and you might call them if they’re interested in this.’ And of course the laboratory will frequently then just assume that this is OK, that this basically is a de facto manufacturer recommendation."

If that sounds slipshod, it probably is. "I think this is an area that could blow up at some point," Dr. Austin predicts. "It would take only one case to make this whole issue have the lid blown off it." Dr. Stoler asks how labs will justify their use of an improperly validated test in a worst-case scenario. "How are we possibly going to defend our choice of laboratory method? Because that’s the kind of decision we’re making here: you’re basically reassuring the patient and saying, ’There’s so little chance that you have high-grade disease, that you can basically not be screened for a very long interval.’ And if something bad happens in that interval, the patient has lost a followup, and you really don’t have a leg to stand on with the homebrew method."

It’s a risky business. Bierig’s Archives article suggests that pathologists make sure their malpractice insurance covers them if they use a non-FDA-approved device, advice that seems more than reasonable to Dr. Austin. "Although I don’t think the lawyers ought to be driving policy in these situations," he says, "I do think it’s worthwhile at least to consider what they have to say." Dr. McGlennen acknowledges that some of Access Genetics’ potential clients declined to use the PCR assay for that very reason. Their concerns may be more hypothetical than real, he says. "But it has been a reason why some laboratories have chosen not to even discuss going into any other set of methods or approaches or strategies that would involve non-FDA-approved products."

Midway through his second interview with CAP TODAY, Dr. Schiffman, who’s been patiently and scrupulously responding to every question raised by others about clinical validation, interrupts himself. "What surprises me is that this could in any way be controversial," he says. The issue is not so much controversial, of course, as it is loaded—with money and competitive claims, scientific complexity, and grave medical concerns.

And Dr. Schiffman is in a position to see and hear it all.

When companies come to him with a proposed technology, he skips right over the assay, its claims, and any proprietary information. Instead, he asks for data and their study design. If they don’t have any, "That’s it," he says in a clipped voice. "Over." "We have no financial interest here," he explains. "We work with anyone who comes to us. We work with directly competing companies, we’re working with Roche and Digene all the time, all the other companies. Ventana’s been here a lot of times. I have no preferences or favorites except the truth."

Just so long as it’s based on data.

Karen Titus is CAP TODAY contributing editor and co-managing editor.