With new ACOG guidelines, step wisely

June 2007
Cover Story

Karen Titus

The new ACOG guidelines for prenatal screening are, in the words of one observer, “a Chinese menu of all the possible tests.” Now, it’s time for laboratorians and their clinical colleagues, as well as patients, to start choosing.

It won’t be easy. Though many agree the new guidelines (Practice Bulletin No. 77, released by the American College of Obstetricians and Gynecologists in January) are a major advance, the menu is broad, with something for everyone. Like most guidelines, they’re written by committee, which means multiple viewpoints expressed in neutral, even-handed language—which invariably puts out the welcome mat to confusion.

The practice guidelines tell physicians what can be offered, and why. Left untouched is the “how.” That should come later, when the American College of Medical Genetics releases its technical standards and guidelines for first trimester and integrated screening. The documents may not launch a new era in prenatal screening and diagnostic testing for fetal chromosomal abnormalities, but they’ll certainly move matters several steps—though in what direction, no one yet knows.

The ACOG guidelines are notable for what they send packing as well as what they newly endorse, says Andrew MacRae, PhD, Cadham Provincial Laboratory, Manitoba Institute of Child Health, and associate professor, Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg.

In a nutshell, three standbys now stand down:

Age doesn’t matter. Age alone (that is, using age of 35 or older at time of delivery) to determine fetal risk of Down syndrome is out. Now all women, regardless of age, need to be offered some form of prenatal Down screening. “There’s no magic left in the age of 35,” says Glenn Palomaki, BS, associate director, Division of Medical Screening, Department of Pathology and Laboratory Medicine, Women & Infants Hospital, Brown Medical School, Providence, RI. Though it may not change practice (the vast majority of obstetricians have likely moved beyond this historic holdover, he estimates), from a litigation standpoint, it removes that last bit of worry.

The reason for this change is simple: Newer screening methods are so good that all women, regardless of age, will have a much better opportunity to refine their risks and make a more informed decision about whether to forge ahead with invasive diagnostic procedures, says Jacob A. Canick, PhD, professor, Department of Pathology, Brown Medical School, and clinical director, Division of Prenatal and Special Testing, Department of Pathology and Laboratory Medicine, Women & Infants Hospital, Providence. “Now we’re saying, ‘Go right to prenatal screening whether you’ve decided on amniocentesis or not.’”

Here’s how Devereux N. Saller Jr., MD, MS, professor, obstetrics and gynecology and pediatrics, University of Virginia School of Medicine, Charlottesville, carefully words it: Not all patients should have screening; not all patients should be offered amniocentesis. “But it is appropriate to offer women of any age in pregnancy a test to reassure them that the pregnancy is not affected by fetal Down syndrome.”

Dr. Canick and his laboratory colleagues at Women & Infants were already seeing plenty of screening tests submitted by patients 35 and older, though it wasn’t clear how women acted (or didn’t) on the results. “I don’t know of any data that substantiates that the women and their physicians were using the data in a logical way,” he says. The ACOG guidelines should point them in the right direction.

Ultrasound nuchal translucency, or NT, should no longer be used alone either, even though it’s probably the strongest marker of Down syndrome. Despite its relatively high detection rates as a solo marker, recent trials in the United States and the United Kingdom show it’s even better when paired with biochemical markers.

The so-called triple marker test is not enough. If three is good, four is better, and labs should now add inhibin A to their second-trimester screening panel.

While no document is perfect, “This one is really good,” says Palomaki. It may also seem a little overwhelming, which gives laboratorians and their colleagues plenty to hash out.

Like any consensus document, this one represents a variety of views. Palomaki cautions against relying on statements taken out of context. “Everyone has his own reasons for favoring one thing over another. That’s why you read one paragraph and it seems to say one thing, and then you read a paragraph later on that seems to say almost the opposite. It’s because they couldn’t agree on something—so what they agreed to do was put them both in,” Palomaki says with a laugh.

Even the discussions are open to discussion. Palomaki says he’s heard two sets of obstetricians talk about the ACOG guidelines. In one case, a prominent group of OBs, one of whom helped write the guidelines, proffered their interpretation that the guidelines stipulate that amniocentesis, diagnostic testing, or prenatal screening should be offered to every pregnant woman. More recently, he heard another author suggest that’s not quite true; rather, a careful reading of the guidelines reveals that prenatal diagnosis should be available, but only prenatal screening should be offered. “Boy, is that a fine point,” says Palomaki. “What does it mean to make amnio available, while offering screening?”

Ditto for the section on interpreting test results. The guidelines say it’s preferable to provide patients with a numerical risk rather than a positive versus negative screening result. Sounding like a Talmudic scholar, Palomaki says, “It doesn’t say ‘in place of.’ It just says it’s ‘preferable’ to give a [numerical] risk. Well, we all agree with that. But it’s also preferable to give a positive or negative interpretation. And that’s not contradicted, either.” Take the line out of context, and it’s not out of bounds to assume ACOG has put the kibosh on providing positive/negative results. Palomaki suggests otherwise, adding that he’s heard one guideline author say the intent was to stop labs from reporting only positive and negative results, that such results need to be accompanied by a numerical risk.

“It almost sounds like some people would rather not see a positive/negative result at all,” Dr. Canick agrees, “just give the result and let the physician and patient somehow decide together what that risk means and whether it should mean further testing.” He says that burden is unfair, and that labs must help both parties put risk into perspective as well as define it—a notion reinforced by the CAP checklist for prenatal screening, he adds. “We’re the experts in prenatal screening.”

Such variances don’t necessarily point to a weakness in the guidelines, but they are worth noting, simply because they reinforce the idea that these guidelines are a starting point, not an endpoint. “This moves the whole field forward, which is good,” Palomaki says.

Moreover, says Dr. MacRae, “It’s the best summary in years gathering the evidence that’s come forward lately.”

Some of that evidence is from the FASTER Trial (Malone FD, et al. N Engl J Med. 2005;353:2001–2011, with Dr. Canick as the second author), which looked at first-trimester combined screening (NT, pregnancy-associated plasma protein A [PAPP-A], and the free beta subunit of human chorionic gonadotropin at 10 weeks three days through 13 weeks six days of gestation) and second-trimester quadruple screening (alpha-fetoprotein, total hCG, unconjugated estriol, and inhibin A at 15–18 weeks gestation). The study found that while first-trimester screening at 11 weeks is better than second-trimester quad screening, by 13 weeks the two approaches were similar. Combining markers from both trimesters yields higher detection rates and lower false-positive rates than either trimester alone.

With the ACOG guidelines now reinforcing these findings (another frequently cited study is the SURUSS trial: Wald NJ, et al. J Med Screen. 2003;10:56–104), labs and others are being introduced to the concepts of fully integrated screening and stepwise sequential screening.

The ACOG guidelines expand discussion of first- and second-trimester screens, making the clear statement that first-trimester screening is equivalent in its performance to second-trimester quad screening, says Dr. MacRae. This is based on four large studies of first-trimester screening, one of which showed a 90 percent detection rate, while the other three had detection rates ranging from 79 to 83 percent, each with a false-positive rate of five percent.

The ideal, of course, is to have accurate information sooner rather than later. But since the first-trimester screen at 12–13 weeks is not significantly better than second-trimester quad, and since combining first- and second-trimester markers has been shown to be better than either trimester alone, physicians and patients will probably migrate toward the integrated use of markers from both trimesters.

This, as it turns out, is easier done than said. Here laboratorians bump up against a problem familiar to economists, who can spend their professional lives wondering why consumers act the way they do.

If a patient wishes to receive her screening result during her first trimester, she’s effectively made a contract for a first-trimester screen. But if she’s willing to wait until the second trimester to receive her screening result, she has two options: She can have just the second-trimester biochemistry markers (quad test), or she can have some first- as well as the second-trimester tests and have her second-trimester screen based on everything known to date.

“The physician has to be very careful here,” Dr. MacRae says. “If any testing has been done in the first trimester, you cannot give an independent risk based only on the four second-trimester markers—you must combine them with the information already known from the first trimester.

“But if the patient is told her risk based on first-trimester tests, how do you counsel her based on the additional second-trimester markers? How do you undo the fact that you’ve already told the patient her risk is one in 200, and now say, ‘Well, this new risk is better—it’s now one in 2,000,’” Dr. MacRae continues. “I think almost every patient, given two risks, will act on the higher of the two,” he says.
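The arithmetic behind Dr. MacRae's example can be sketched in a few lines of Python. This is an illustration only, not any laboratory's actual algorithm: it assumes the second-trimester markers can be summarized as a single likelihood ratio that multiplies the odds already in hand, whereas real screening software models the markers jointly. The function names and the likelihood ratio of 0.1 are hypothetical.

```python
def risk_to_odds(one_in_n: float) -> float:
    """Convert a risk stated as '1 in N' to odds (affected : unaffected)."""
    if one_in_n <= 1:
        raise ValueError("risk must be expressed as 1 in N with N > 1")
    return 1.0 / (one_in_n - 1.0)


def odds_to_one_in_n(odds: float) -> float:
    """Convert odds back to the N in a '1 in N' risk."""
    return 1.0 + 1.0 / odds


def update_risk(one_in_n: float, likelihood_ratio: float) -> float:
    """Multiply the prior odds by a likelihood ratio and return the new '1 in N'."""
    return odds_to_one_in_n(risk_to_odds(one_in_n) * likelihood_ratio)


# Hypothetical likelihood ratio for a reassuring second-trimester result.
SECOND_TRIMESTER_LR = 0.1

# Starting from the first-trimester risk of 1 in 200 gives roughly 1 in 2,000.
combined = update_risk(200, SECOND_TRIMESTER_LR)

# Starting from a different prior (here a made-up 1 in 500) gives a different
# answer, which is why the second-trimester result cannot stand alone.
independent = update_risk(500, SECOND_TRIMESTER_LR)

print(f"combined: 1 in {combined:.0f}; wrong prior: 1 in {independent:.0f}")
```

The point of the sketch is the dependence on the starting risk: the same second-trimester result produces a different final number depending on what was already known.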

Laboratories are getting requests to present information both ways, Palomaki says, and, as a result, are considering whether software should allow for sequential testing either with the risk revealed or not after the first trimester, depending on the ordering physician. “That would be a good trick,” Palomaki says. “It’s tough to do but it could be done.”

The ACOG guidelines cautiously introduce two types of screening, both called sequential screens. Neither has been validated clinically, but both are worth mentioning, says Dr. MacRae. They’re essentially integrated second-trimester screens with an early exit for patients who are deemed to be at very high risk (or perhaps very low risk) based on partial first-trimester markers.

The first, the step-wise sequential screen, has been endorsed by ACOG and is a variation on the integrated screen, which usually comprises an NT and PAPP-A in the first trimester and the quad test in the second. Results are given once the quad test results are in and are based on all six markers—usually.

“With the step-wise sequential screen, if your nuchal translucency and PAPP-A result are in the top one half of one percent (greater than 3.5 millimeters in measurement), that half of one percent would be tapped on the shoulder and told, ‘You might want to know that because of this unusual level of nuchal translucency (not to alarm you, if that’s at all possible) you may wish to access an earlier diagnostic test that has a slightly higher risk than amniocentesis: chorionic villus sampling,’” says Dr. MacRae. This approach will find a large proportion of Down cases in that top cluster of one half of one percent. (The exact number depends on the cutoffs used.) All others (that is, about 99.5 percent of screened patients) will not be contacted and will go on to have the second-trimester component of the integrated test. This step-wise sequential screen allows patients to be informed when a frank abnormality is seen during an ultrasound exam, which is sometimes raised as an issue with the integrated screen.

The second approach, the contingent sequential screen, tackles the opposite end of the spectrum—patients found to have exceedingly low risk, Dr. MacRae says. “Why not tell them that too? Why have them wait for the second trimester?

“So now we would have three groups after the first-trimester component,” he explains: “1) those who are very high risk, who are tapped on the shoulder about having a CVS, 2) those who need to go on and have a second trimester test and have their risk combined on all the tests, and 3) a third group—and this would be the majority—who are at such low risk that second-trimester testing is not warranted.” ACOG has not endorsed this strategy, probably because of the high proportion of screened women who would be alerted to their intermediate status and the overall complexity of the test, say Palomaki and Dr. Canick.

“You would have to adjust your cutoffs for this, and they may be somewhat unpopular,” says Dr. MacRae. Telling a woman her risk may be one in 30, but that she’ll have to wait until the second trimester for more information, won’t make her happy. The benefit, of course, is that perhaps 80 percent of the population won’t require second-trimester followup; the 20 percent who must wait for more testing are another matter. “As my local geneticist said, ‘That’s all I need—a new screening test with a 20 percent false-positive rate,’” says Dr. MacRae, because these patients will know they weren’t at low enough risk to skip second-trimester testing.
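The three-way split Dr. MacRae describes amounts to comparing the first-trimester risk against two cutoffs. A minimal Python sketch follows. The cutoff values (1 in 25 and 1 in 1,500) are hypothetical placeholders chosen only so that the article's 1-in-30 patient lands in the middle group. They are not taken from the ACOG guidelines or from any screening program.

```python
from enum import Enum


class NextStep(Enum):
    OFFER_CVS = "very high risk: offer early diagnostic testing (CVS)"
    SECOND_TRIMESTER = "intermediate: continue to second-trimester markers"
    SCREEN_COMPLETE = "very low risk: no second-trimester testing"


def triage(first_trimester_one_in_n: float,
           high_cutoff_one_in_n: float = 25,      # hypothetical cutoff
           low_cutoff_one_in_n: float = 1500) -> NextStep:  # hypothetical cutoff
    """Sort a first-trimester risk (1 in N) into one of three groups.

    A smaller N means a higher risk, so the high-risk group is N at or
    below the high cutoff and the low-risk group is N at or above the
    low cutoff.
    """
    if high_cutoff_one_in_n >= low_cutoff_one_in_n:
        raise ValueError("high-risk cutoff must be a smaller N than low-risk cutoff")
    if first_trimester_one_in_n <= high_cutoff_one_in_n:
        return NextStep.OFFER_CVS
    if first_trimester_one_in_n >= low_cutoff_one_in_n:
        return NextStep.SCREEN_COMPLETE
    return NextStep.SECOND_TRIMESTER


for n in (10, 30, 5000):
    print(f"1 in {n}: {triage(n).value}")
```

Moving either cutoff changes how many women fall into the waiting group, which is the trade-off Dr. MacRae's geneticist was complaining about.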

As these concepts gain traction, the terminology is bound to fray. Dr. Saller says one point needs to be clear: Sequential or contingent screening is not simply a series of tests, one after another. “The first and second trimester tests have to be linked some way. For example, it’s important to make sure that the appropriate a priori risk is used for both screens.”

Dr. Canick says the guidelines clarify another formerly fuzzy point, though he says not everyone seems to grasp it: “If you’re going to offer first-trimester intervention, either alone or as the first part of a sequential test, then CVS has to be made available.” It’s self-defeating to tell a woman who screens positive to return for an amnio in several weeks. “You’re just placing someone at a heightened state of alert and anxiety.”

On the other hand, he says, many physicians are finding that some women prefer to wait for an amnio because they’ve heard it’s safer than CVS. That leads him to think that first-trimester screening is in a transitional period. Since the benefits of early screening are lost when a woman waits for an invasive procedure in the second trimester, “Then why not wait, finish the best test you can get, and then consider diagnostic testing?” he asks.

Ultimately, Palomaki predicts, sequential/integrated screening will surpass first-trimester screening. “It’s just such a better test. It’s probably not going to be immediately taken up, but clearly there are a lot of labs scrambling right now to put up a PAPP-A assay and learn how to interpret nuchal translucency, because they can see the writing’s on the wall.”

NT sounds like the bailiwick of sonographers, but it isn’t. Laboratorians need to convert the values given in ultrasound reports, along with the serum screening markers, and combine them into a patient-specific risk. NT measurements are provided in tenths of a millimeter; a dating measurement, the crown-rump length, is also included, which allows labs to normalize the NT measurement. Labs convert the NT measurements into a normalized unit called the multiple of the median, or MoM, with one as the central normal value. All of which means it’s up to the laboratory to make sure a QA measure is part of the equation, says Dr. Canick. The key is to establish medians for the NT measurement and to monitor their consistency over time and among sonographers.
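The MoM conversion itself is simple division: the measured NT over the median NT expected at that crown-rump length. A rough Python sketch is below. The median curve in it is a made-up straight line included only so the example runs. It is not a clinical reference, and a laboratory would substitute medians established from its own data.

```python
from typing import Callable


def nt_mom(nt_mm: float, crl_mm: float,
           median_for_crl: Callable[[float], float]) -> float:
    """Express an NT measurement as a multiple of the median (MoM).

    median_for_crl returns the expected median NT (in mm) for a given
    crown-rump length (in mm). A MoM of 1.0 is the central normal value.
    """
    expected = median_for_crl(crl_mm)
    if expected <= 0:
        raise ValueError("expected median NT must be positive")
    return nt_mm / expected


def placeholder_median(crl_mm: float) -> float:
    """PLACEHOLDER median curve, invented for illustration only."""
    return 0.5 + 0.02 * crl_mm


# An NT of 1.7 mm at a CRL of 60 mm sits on the placeholder median;
# an NT of 3.4 mm at the same CRL is twice the median.
print(round(nt_mom(1.7, 60, placeholder_median), 2))
print(round(nt_mom(3.4, 60, placeholder_median), 2))
```

Passing the median curve in as a function mirrors the debate that follows: the same conversion works whether the medians are set per sonographer, per center, or universally.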

For labs already familiar with the ins and outs of NT measurements, this isn’t a new issue. But some are worried about how this will play out if screening programs become ubiquitous, says Dr. Saller. “Right now, the laboratory I work with [Dr. Canick’s laboratory] and my team are very aware of the need for quality and coordination. But is that going to continue to be the case when other people are providing that service?”

Each operator holds the ultrasound transducer differently, each fetus moves uniquely, and so on. Initially this is the purview of ultrasound experts—radiologists or obstetrical sonographers. Labs should not be in the business of telling sonographers about caliper placement or magnification or transducer planes.

Once the training is done, says Dr. Canick, another challenge emerges. “How do the people who are now presumed proficient in doing this measurement get reproducible results that don’t vary, that stay within a tightly controlled range?” Laboratories must at least share this duty with sonographers, if not actually do the QA for them, he says.

NT is like a biochemistry test in all ways, with one sizable difference. A laboratory that runs, say, 10,000 pregnancy screens during the year is running 10,000 AFPs on its own machines. That lets supervisors run a tight ship as far as controls. By contrast, Palomaki points out, the 10,000 accompanying NTs may be sent in by 300 sonographers. “It’s like having 300 in-house assays for nuchal translucency,” he says.

Women & Infants Hospital did the QA for NT in the FASTER Trial as well as the laboratory work. That experience, along with gleanings from the SURUSS trial, provided clues but no hard-and-fast answers. One approach is to establish normative median data for each sonographer, almost as if each instrument in the laboratory had its own reference data. The FASTER Trial took a broader approach, establishing a set of medians for each of the 15 sonography centers that submitted data.

That’s quite different from the approach currently advocated by a major sonography group, the Fetal Medicine Foundation, which uses the universal median—a single set of medians for everybody, Dr. Canick explains. “Whereas we say sonographers are different, equipment is different, and as long as they’re measuring things correctly, there will be differences from center to center or sonographer to sonographer that we can deal with, just the way we do with QA within the lab.”

The upcoming ACMG guidelines, which Palomaki has helped to write, are likely to address this topic. While declining to provide guideline details, pending review by others, he said the ACMG authors intend to recommend that monitoring NT be part of laboratory quality assurance.

This means sonographers will need to submit sufficient samples, Palomaki says. “A lot of sonographers simply can’t be assessed—they send in only six samples over a year. Well, what are you supposed to do with that?”

Frankly, this was a problem even for those writing the ACMG guidelines, and one of the reasons the document has been two years in the making, according to Palomaki. Given the paucity of data, the group had to collect NT data from a half-dozen laboratories and monitor performance retrospectively.

Again declining to delve into details, Palomaki acknowledges one problem clearly emerged: Many sonographers simply can’t be assessed due to low volume. This may not be big news, but the high rate (which he declined to pinpoint) may surprise many. In the meantime, labs will need to start thinking about how to merge NT measurements, how to interpret crown-rump lengths, how to monitor sonographers, and even how to talk to them. Palomaki’s institution runs a first-trimester external proficiency testing program for labs designed to complement the CAP second-trimester PT program (FP Survey); in addition to providing samples for traditional assays, it sends out “samples” for NT assessment. “For example, we’ll send them an Excel file with three worksheets and say, ‘Here are between 50 and 100 observations from three new sonographers who are joining your practice. Tell us what you think of them.’”
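One way a lab might approach that kind of exercise is to group NT MoM values by sonographer, check whether each has sent enough measurements to judge, and then see whether the median MoM sits near 1.0. The Python sketch below assumes this approach. The minimum count of 30 and the acceptable band of 0.9 to 1.1 are hypothetical values picked for illustration, not figures from the ACMG draft or from any proficiency program.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def summarize_sonographers(
    observations: Iterable[Tuple[str, float]],
    min_count: int = 30,        # hypothetical minimum volume
    low: float = 0.9,           # hypothetical lower bound on median MoM
    high: float = 1.1,          # hypothetical upper bound on median MoM
) -> Dict[str, Tuple[int, float, str]]:
    """Return {sonographer_id: (count, median NT MoM, status)}."""
    by_sonographer = defaultdict(list)
    for sonographer_id, mom in observations:
        by_sonographer[sonographer_id].append(mom)

    report = {}
    for sonographer_id, moms in by_sonographer.items():
        count = len(moms)
        med = median(moms)
        if count < min_count:
            status = "too few to assess"
        elif low <= med <= high:
            status = "in range"
        else:
            status = "review"
        report[sonographer_id] = (count, med, status)
    return report


# Made-up data: one steady sonographer, one with six measurements,
# and one who consistently measures low.
data = [("A", 1.0)] * 40 + [("B", 1.2)] * 6 + [("C", 0.8)] * 40
for sonographer_id, (count, med, status) in summarize_sonographers(data).items():
    print(sonographer_id, count, med, status)
```

The low-volume check comes first on purpose: a median from six measurements says little either way, which is the problem Palomaki describes.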

The issues don’t stop with NT. The absence of the fetal nasal bone is another ultrasound marker for Down syndrome. If NT is hard, nasal bone is harder (so to speak), requiring the skill set needed for NT plus a host of other unknowns. Some centers suggest it should be used; others say that day will never arrive. Dr. Saller and his colleagues take note of the nasal bone measurement but do not include it in their numeric algorithm. The ACOG guidelines offer no definitive advice, merely acknowledging that it’s ripe for discourse and that no clear conclusions at present explain how or if it will fit into the picture.

The real bread-and-butter for labs, the biochemistry markers, should give labs a breather.

Dr. MacRae says any laboratory doing Down syndrome screening or considering it should offer the second-trimester quad test at a minimum. Labs already offering the second-trimester triple test should add the fourth marker, dimeric inhibin A. “And anybody who wants to offer more than one screening modality would be adding a PAPP-A and an hCG test in the first trimester if they wanted to support first-trimester screening and/or an integrated screen.”

Most labs are already running hCG in the second trimester, Palomaki says, and since hCG in the second trimester is lower than in the first, labs should have little difficulty making adjustments. He says there’s no shortage of good, relatively inexpensive hCG assays available.

That leaves the matter of obtaining an initial set of first-trimester samples for reference ranges. Palomaki calls it a bootstrap problem, since labs may need to amass samples from other labs, then establish their own ranges. His own laboratory faced no such difficulty—it handled the samples for the FASTER Trial.

The latest recommendations appear to break a deadlock between total, or intact, hCG and the free beta subunit of hCG. Patent issues in the United States have limited access to fbhCG, a matter of concern since some studies have indicated it carries an approximate two percent detection rate benefit over total hCG. The ACOG statement, however, supports the use of either. “I wouldn’t say it establishes equivalence,” Dr. MacRae says, “but it establishes the impunity of using either in your algorithms.” He predicts that will bring significant relief in labs that have been unable to access fbhCG. “Some people have actually been adding inhibin in the first trimester—at a considerable cost—to get a little performance boost, to protect themselves from the challenge that they weren’t using the most informative marker.”

Dr. Canick says the patent issues energized researchers to take a closer look at the two hCG options, with interesting results. It appears that when used in combination with NT and PAPP-A—the two better markers in first trimester—in weeks 11 to 13, the hCG component, whether it’s fbhCG or hCG itself, is a little like the dinging triangle and cymbal crash in Bruckner’s 7th: Though noteworthy, it makes very little difference in the grand scheme of things. This is important, Dr. Canick says, because this is the period when most first-trimester screening is done and the optimal time for NT, sonographers generally agree. Neither SURUSS nor FASTER uncovered differences between the two hCGs. “They both give almost exactly the same performance, meaning 85 percent detection rate for a five percent false-positive rate. And whether there’s a one percentage point difference between them is almost irrelevant,” Dr. Canick says. “At 13 weeks hCG was in fact a little bit better, at 11 weeks fbhCG maybe was a point better, and at 12 it was a wash completely.”

Something else to keep in mind: PAPP-A poses another problem for those labs wanting to add first-trimester markers—there’s no FDA-approved assay on the market, though it does have a CPT code. That means turning to in-house assays or using research-use only kits for clinical purposes.

The ACOG guidelines make manifest one more point: Down screening has made enormous advances. One need only look at the slightly sad state of Down screening 15 to 20 years ago. “It was not good,” Palomaki says. Age alone today—essentially the state of testing two decades ago—would detect the 50 percent of Down cases born to women 35 or older. With the integrated test, the detection rate bumps up to 85–90 percent, while the false-positive rate drops from 15 percent to less than two percent.

Things should get even better in the years ahead. Allan Bombard, MD, chief medical officer at Sharp Mary Birch Hospital for Women, San Diego, expresses a hope that as screening further improves, the need for invasive testing will fade. “I did amniocentesis and CVS for more than 20 years—I love doing it,” he says. “But less-invasive methods are much more attractive to patients.” Looking at the broad spectrum of current and potential markers, including cell-free fetal DNA in maternal serum, he predicts that physicians will perform fewer invasive tests on normal fetuses. “That is really the goal of all of this.” He envisions a day when, if a patient screens positive on an integrated test, the chance of it being a true positive is one in 20. “I’m very optimistic,” Dr. Bombard says, “but I’m kind of a glass-is-half-full person.”

Until then, however, the integrated test is an exceptionally good, reliable screening test, when it’s done correctly, Palomaki says.

That’s a hefty monition. The better test is more complex, and requires labs, sonographers, clinicians, and counselors to get things right. “Everyone is going to have to get brought up to date. No one’s working in isolation anymore,” Palomaki says. But that’s a good thing. Chinese food, after all, is best shared by many.

Karen Titus is CAP TODAY contributing editor and co-managing editor.