William Check, PhD
August 2013—In the May 20, 2013 issue of the New Yorker, Google co-founder Sergey Brin is quoted as saying of Internet security: “In big corporations people don’t understand what security people do, for the most part, and no one pays attention to them unless something goes wrong. Frankly, a lot of companies aren’t that interested in security.”
What Brin says of Internet security could also be said of quality assurance programs in laboratories: Health care systems aren’t that interested. “I feel that’s a fair comparison,” says Susan Butler-Wu, PhD, D(ABMM), assistant professor of laboratory medicine and associate director, clinical microbiology laboratory, University of Washington. All laboratories are doing quality assurance, she says, “but there are no best practices or any formulation of what goes beyond the regulations.”
To begin to address this deficiency, Amanda Harrington, PhD, D(ABMM), recently of the VA Puget Sound Health Care System and now director of microbiology at the University of Illinois at Chicago, conceived a symposium in which speakers and attendees could share their QA practices. She and Dr. Butler-Wu co-organized “Beyond the Basics: Modern Metrics for Clinical Microbiology,” which took place at the 2013 American Society for Microbiology meeting in May. “That’s what we wanted to do,” Dr. Butler-Wu says, “to go beyond minimum requirements.”

“It worked pretty well,” she adds.
Here is what the presenters—Drs. Harrington and Butler-Wu and three others—put out for discussion: when to trust the identification from an automated ID system, what effect tracking corrected microbiology reports can have, how to track data without IT support, using IT to monitor Clostridium difficile testing, and real-time monitoring and clinical impact of result reporting in microbiology.
Dr. Butler-Wu addressed the first topic: when to trust automated identification systems. Cumitech 31A recommendations on verification of microbial identification instruments (“Unmodified, FDA-Cleared Tests, automated, multi-analyte”) require 90 percent agreement with an existing system or reference method. “That’s still one in 10 organisms where you don’t believe the identification,” she told CAP TODAY. “What can we do about that last 10 percent?”
Certain organisms are known in the literature to be problematic for automated systems:
◆ Coagulase-negative Staphylococci misidentified as Kocuria spp.
◆ Brucella melitensis misidentified as Bergeyella zoohelcum and Ochrobactrum anthropi.
◆ Salmonella or Shigella misidentified as E. coli.
“This is true for the Vitek 2 GN card, which we were using, as well as for MicroScan and Phoenix,” Dr. Butler-Wu says. “There is a list of organisms they cover but we don’t have a sense of how good they are in practice.” Every lab that uses an identification system has an internal sense for some things it doesn’t fully trust, she says. “For that particular culture, is there a way we can triage so we can say what we trust or we don’t trust?”
To make the trust/no-trust lists, the laboratory took advantage of its ability to perform definitive identification by 16S ribosomal DNA (rDNA) sequencing, which the lab uses for indeterminate or suspicious calls. The focus was on a broad group that presents problems: Gram-negative rods from cystic fibrosis sputum cultures, which accounted for more than 40 percent of the isolates the laboratory was sequencing. “We were having issues in our CF patient population,” Dr. Butler-Wu explains. “It’s established in the literature that routine phenotypic identification is not that accurate for these organisms. Rather than send every one of these non-fermenting Gram-negative rods for 16S rDNA sequencing, we wanted to know if some things were really right and others, when we get an identification, were bogus.”
Dr. Butler-Wu reviewed the results from 2,012 isolates from University of Washington CF patients that had been sequenced from 2006 to 2012 and correlated them with Vitek results. She found most organisms were reliably identified. On the other hand, she showed a dozen organisms for which Vitek identification was not trustworthy. For example, “When Vitek gave us an identification of Bordetella bronchiseptica, it was only correct in two out of 15 specimens.” Excluding cultures with mixed genera, the correct rate was two of 11. Correct-call rates for other single identifications were 0/14 for Acinetobacter haemolyticus, 11/29 for Pseudomonas fluorescens, and 7/14 for P. putida.
Dr. Butler-Wu also showed what the incorrect calls were. For instance, the isolates incorrectly called Aeromonas salmonicida (seven calls) were actually Neisseria perflava (1/7) and Pseudomonas aeruginosa (6/7). Likewise, all three isolates called A. sobria were in fact Burkholderia cepacia complex. “Some of these calls have real clinical and management consequences, such as the patient being taken off the transplant list,” Dr. Butler-Wu says. “We have to look further with those isolates.”
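The tally behind a trust/no-trust list of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory’s actual procedure: the function names, the 90 percent cutoff (borrowed from the verification criterion mentioned earlier), and the minimum of three calls are all assumptions. The counts are the ones quoted above.

```python
from collections import Counter


def correct_call_rates(pairs):
    """Tally (correct, total) calls per automated identification.

    pairs: iterable of (automated_id, sequencing_id) tuples, where the
    sequencing result is treated as the reference identification.
    """
    totals, correct = Counter(), Counter()
    for automated_id, sequencing_id in pairs:
        totals[automated_id] += 1
        if automated_id == sequencing_id:
            correct[automated_id] += 1
    return {name: (correct[name], totals[name]) for name in totals}


def no_trust_list(rates, min_rate=0.90, min_calls=3):
    """Return identifications whose correct-call rate falls below min_rate.

    Both thresholds are hypothetical; a laboratory would choose its own.
    """
    return sorted(
        name
        for name, (n_correct, n_total) in rates.items()
        if n_total >= min_calls and n_correct / n_total < min_rate
    )


# (correct calls, total calls) as quoted in the article.
article_rates = {
    "Bordetella bronchiseptica": (2, 15),
    "Acinetobacter haemolyticus": (0, 14),
    "Pseudomonas fluorescens": (11, 29),
    "Pseudomonas putida": (7, 14),
}

# All three isolates called A. sobria sequenced as B. cepacia complex.
sobria_rates = correct_call_rates(
    [("Aeromonas sobria", "Burkholderia cepacia complex")] * 3
)

flagged = no_trust_list({**article_rates, **sobria_rates})
print(flagged)
```

Rerunning a tally like this as new sequencing results arrive is one way to keep such a list from going stale.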
About a year after the trust list was disseminated, a substantial drop occurred in the number of organisms submitted for 16S rDNA sequencing. Regarding the one-year delay, Dr. Butler-Wu says, “Maybe this change took a while to penetrate.” (It’s the technologists who decide whether to send for sequencing.) She acknowledges that her evaluation of the impact of the no-trust list on sequencing frequency is not a rigorous one.
“In our case this was a static process,” she says. “Ideally, we should be continually monitoring and putting organisms off and on the no-trust list.”
Further decrease in orders for sequencing occurred in mid-2012 when the laboratory adopted mass spectrometry for routine bacterial and yeast isolates. “For most patients mass spec has replaced Vitek,” she notes.
The presentation by Linoj Samuel, PhD, D(ABMM), division head of clinical microbiology, Henry Ford Health System, was “Stay on Target: What Metrics Can Tell You About Culture Complications and Your Laboratory.” At the outset, he pointed out the increase in volume microbiology laboratories are seeing. At Henry Ford, for one, four hospital laboratories have been consolidated into one core laboratory, with no proportionate increase in the number of technologists. “As lab directors and managers we end up overseeing microbiology labs for four different hospitals with different patient populations.” Many others are in the same situation. “What tools can we use to improve oversight?” he asks.
One required metric is blood culture contamination rate, a simple metric but one that might be misleading. “We need to dig deeper to see whether it accurately represents what is going on,” Dr. Samuel says. A recommended rate is less than three percent. “We found at one of our emergency departments that, even though the contamination rate was less than three percent, there was room for improvement,” he says. Nurses tended to draw multiple sets from single sticks or lines. Moreover, multiple sets were contaminated with coagulase-negative staphylococci, which the laboratory algorithm did not flag as a contaminant. “We went back and provided education to the nursing staff on the importance of blood culture collection practices,” Dr. Samuel says.
The clinical microbiology division has adopted four levels of metrics: global, process-specific, workload, and test-specific. Tracking corrected reports is a global metric. Examples are Gram stain interpretation errors, errors in organism identification, result entry errors, and specimen processing errors. “Our pathology colleagues on the anatomic side have been doing this for many years,” he notes, citing two publications from the surgical pathologists in his department (Nakhleh RE, Zarbo RJ. Arch Pathol Lab Med. 1998;122:303–309; Meier FA, et al. Adv Anat Pathol. 2011;18:406–413). Dr. Samuel also presented data from his department chair, Richard Zarbo, MD, who started the initiative for pathologists and showed, with this tracking program, a reduction in misinterpretations from 12.4 amended reports per 10,000 surgical pathology cases in 2005 to fewer than one in 2010 and 2011. “This is our challenge in medicine,” Dr. Zarbo told CAP TODAY—“how to understand what is occurring ‘in the shop’ in real-time, at the level of each piece of work.”
Corrected reports need to be tracked systematically. Says Dr. Samuel: “We do cultures on second shift and weekends. Neither I nor the manager is here when that corrected report is made. So we needed to put a process in place to capture them daily and bring them to our attention.” In microbiology the same verbiage is used on all corrected reports, so the IT staff wrote a program that identifies when the language is used. “We get a daily report broken down by technologist, so we can provide feedback the very next day.”
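A daily scan of this kind might look like the following Python sketch. It is only an illustration of the idea, not Henry Ford’s program: the marker phrase, the record fields (`accession`, `technologist`, `text`), and the sample reports are all hypothetical.

```python
from collections import defaultdict

# Hypothetical standard wording; a real laboratory would use the exact
# phrase its own corrected reports carry.
CORRECTION_MARKER = "corrected report"


def daily_corrected_report(reports, marker=CORRECTION_MARKER):
    """Group accession numbers of corrected reports by technologist.

    reports: iterable of dicts with 'accession', 'technologist', 'text'.
    The match is a case-insensitive substring search for the marker.
    """
    by_tech = defaultdict(list)
    for report in reports:
        if marker in report["text"].lower():
            by_tech[report["technologist"]].append(report["accession"])
    return dict(by_tech)


# Invented sample data for illustration only.
sample_reports = [
    {"accession": "M-001", "technologist": "Tech A",
     "text": "CORRECTED REPORT: Gram stain revised on review."},
    {"accession": "M-002", "technologist": "Tech B",
     "text": "No growth at 48 hours."},
    {"accession": "M-003", "technologist": "Tech A",
     "text": "Corrected report: organism identification revised."},
]

summary = daily_corrected_report(sample_reports)
print(summary)
```

Because the same wording appears on every corrected report, a plain substring match is enough here; free-text corrections would need something more elaborate.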
A systematic report with errors by category shows problems that would not otherwise be visible. For instance, in one period a technologist made a Gram stain error in each of three consecutive months. The laboratory’s leadership saw the trend and provided her with support and education, Dr. Samuel says. “For the rest of the year she never showed up on the report again. When you have 50-plus people in the lab and everyone is changing shifts and benches, it is hard to pin down who is making errors unless you follow them in a long-term report.”
The number of corrected reports per month fell in 2012 even as volume rose with no significant increase in staffing. “I think direct feedback helps a lot. It helps individuals correct their own way of doing things,” he says.
When working up colonies off a plate, for example, technologists are encouraged to do a Gram stain first. “Sometimes they think they can tell the identity from colony appearance and they work it up as such,” Dr. Samuel says. “So they tend to make errors. When they see their corrected error on this report it emphasizes to do the procedure as outlined.”
Get technical staff involved, he says. “When we first rolled out this program, it was not popular. Techs felt they were good workers and this was too much like Big Brother looking over their shoulder. But it was not punitive. No one was punished because of showing up on the report. We emphasized that almost everyone will show up at some time.” It’s not made part of their annual evaluation and it’s not on their record unless the error was significant enough to jeopardize patient safety. “We had to work hard to make the point that this was not to punish anyone but to improve lab performance and reduce the number of errors.” When it’s made an individual responsibility, he says, as opposed to quoting overall laboratory figures, that gets results.
Reporting time for positive blood culture Gram stains is an important process-specific metric. Dr. Samuel’s core laboratory gets all samples from affiliated Henry Ford hospitals. He targeted two time metrics in this process: how long it took from receipt in the local laboratory to the courier run, and how long from when the culture turned positive to when it was reported. “We call positive blood culture Gram stains to the floor,” he says. “It is one of our critical values.”
In 2008 only 37 percent of positive blood culture Gram stains were reported within two hours; 95 percent was the target rate. “Our technologists worked on this for a number of months,” Dr. Samuel says. Reporting improved but didn’t reach the goal. “Technologists felt they needed additional staffing but managers and supervisors felt there were workflow issues.” De-identified reporting times for individual technologists were then posted in the main laboratory area, with each technologist able to recognize only his or her own entry. “Without additional staffing, reporting times suddenly got better,” Dr. Samuel says. Ninety percent of positive blood culture Gram stains were reported in two hours or less. Currently, 95 percent of positive results are reported in one hour or less.
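The metric itself is simple arithmetic: the share of positive cultures whose Gram stain was reported within the target interval. A minimal Python sketch follows; the function name and the sample timestamps are invented for illustration, and only the two-hour and one-hour targets come from the article.

```python
from datetime import datetime, timedelta


def percent_within(events, target=timedelta(hours=2)):
    """Percent of events reported within the target interval.

    events: list of (positive_time, reported_time) datetime pairs.
    Returns None when there are no events, to avoid dividing by zero.
    """
    if not events:
        return None
    on_time = sum(1 for positive, reported in events
                  if reported - positive <= target)
    return 100.0 * on_time / len(events)


# Invented example: four cultures flagged positive at 08:00 and
# reported 45, 60, 110, and 150 minutes later.
flagged_at = datetime(2013, 4, 1, 8, 0)
events = [(flagged_at, flagged_at + timedelta(minutes=m))
          for m in (45, 60, 110, 150)]

print(percent_within(events))                              # two-hour target
print(percent_within(events, target=timedelta(hours=1)))   # one-hour target
```

Computing the same figure per technologist, under a de-identified code, would give the kind of posted list described above.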
Workflow is a workload metric that breaks down how many specimens are processed each hour of each day and what kind of specimens they are. When a workflow tracking metric is used, technologists coming in the next day can see how much work they will have to do and what kind, such as how many plates they will need to read out. “Supervisors can use nontraditional staffing schedules to solve workflow problems,” Dr. Samuel says.
Workflow tracking showed that second-shift technologists had to do too much work in the last two hours of their shift. “They needed to be able to close out their shift sooner and not leave work for the midnight shift.” Their goal was to reduce the number of specimens processed at the end of the shift and do more work earlier. “They took it on themselves to revise their own workflow using Lean principles,” Dr. Samuel says, “so there was lower specimen throughput in the last two hours.”
As an example of a test-specific metric, Dr. Samuel uses the QuantiFERON-TB Gold tuberculosis assay, which Henry Ford sends to a reference laboratory. “This test is very prone to indeterminate results if not handled correctly from the point of collection,” Dr. Samuel says. “This is something that should always be tracked, since an indeterminate result delays patient care.” Indeterminate results spiked in October 2010 to almost 90 percent, with a corresponding drop in negative results to 10 percent. A problem with the handling of specimens was found, after which the rate fell to near zero. Staff turnover can cause this kind of problem; tracking can target the responsible site.
In Dr. Samuel’s laboratory, four metrics are posted on a central bulletin board at any time. “Pick a couple of metrics, follow them, and then move on when you achieve your goal,” he told the ASM attendees. And update them frequently.
Dr. Amanda Harrington had a special challenge in setting up a QA metric: lack of IT support at Seattle’s VA Puget Sound Health Care System, where she was, until her recent move to UIC, assistant director of microbiology and molecular diagnostics, Department of Pathology and Laboratory Medicine. “Most ideas are proposed by vendors,” she says. “None of those solutions has been available to us.” Either the software wasn’t designed for her system or the resources weren’t available to implement the solutions. “This was also true for other audience members,” she says of the symposium. “A lot of what we can retrieve depends on our IT department. We are effectively held hostage by them. We know we’re sitting on a mountain of data; we just can’t get to it sometimes.”

Looking at one parameter, number of cultures reviewed per week, revealed a periodicity that Dr. Harrington hadn’t been aware of: “We saw a spike in workflow each week at the first of the month.” Corrected Gram stains were fairly consistent, but culture review showed one week with eight.
Most errors were clerical; some were technical or procedural. Such errors are identified and corrected. One important technical error is “Critical value—stat Gram stain not called.” On average there was fewer than one such error per week, so one week with two errors of this type demanded attention. “Rapid inquiry revealed low staffing levels on evening shift due to illness, with intermittent coverage in micro.” Microbiology and the evening shift supervisory staff resolved the problem quickly.
Occasionally a systemic error cropped up and was analyzed. Two examples: “How do MDRO organisms get reported to infection control? No clear procedure. JCAHO requirement”; “TMP/SMZ is not being reported for Stenotrophomonas maltophilia.” Resolutions were proposed, and the review notes when the resolutions were complete.
In addition to fixing the problem, “This provides us with a nice quality log to show an inspector our review process and that we changed policy,” Dr. Harrington says.
Implementing best practices with regard to repeat testing for C. difficile from stool culture was particularly challenging without the option for system-based control of ordering practices. A simple paper log was devised to prevent tests from being repeated within seven days. “At our specimen processing station a printer spits out two stickers—one is put on the specimen and the other pasted onto the log,” Dr. Harrington explains. “When a specimen comes in for C. difficile culture, the technologist can look back and see if that patient has another sticker on the log within seven days.
“Using a paper-based log, we were able to reduce the number of duplicate test requests by 50 percent.” And there was a drop in the number of patients with multiple duplicate test requests, from 32 percent to seven percent. “All of that happened without any systematic communication to clinicians,” Dr. Harrington says. “We just called them and said we are rejecting this request and told them why. Just this hard stop on ordering with a paper-based log was very effective.
“Creative, small-scale solutions can make an impact,” she says, but acknowledges, “This is not a great solution for high-volume, high-throughput labs.”
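For a laboratory that does have some scripting capacity, the same look-back can be expressed as a short Python sketch. This is an illustration of the seven-day rule only, not Dr. Harrington’s process: the class and method names are hypothetical, and it assumes a specimen is rejected when it arrives fewer than seven days after the last accepted one for that patient.

```python
from datetime import date, timedelta


class RepeatTestLog:
    """Electronic stand-in for the paper sticker log (hypothetical)."""

    def __init__(self, window_days=7):
        self.window = timedelta(days=window_days)
        self._last_accepted = {}  # patient_id -> date of last accepted test

    def check_in(self, patient_id, received):
        """Return True to accept the specimen, False to reject a repeat.

        Assumption: only accepted specimens reset the seven-day clock.
        """
        last = self._last_accepted.get(patient_id)
        if last is not None and received - last < self.window:
            return False
        self._last_accepted[patient_id] = received
        return True


log = RepeatTestLog()
print(log.check_in("P1", date(2013, 4, 1)))  # first specimen for P1
print(log.check_in("P1", date(2013, 4, 5)))  # four days later
print(log.check_in("P1", date(2013, 4, 8)))  # seven days after the first
```

A rejection here corresponds to the phone call described above: the laboratory tells the clinician the request is being declined and why.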
Niaz Banaei, MD, assistant professor of pathology and medicine and director of the clinical microbiology laboratory, Stanford University Medical Center, described the use of IT to restrict C. difficile testing. He notes two challenges with C. difficile testing: distinguishing colonization from infection and enforcing a seven-day interval between repeat tests. “The same percentage of hospitalized patients are colonized as are test positive,” Dr. Banaei says. Distinguishing colonization from infection is done by testing only patients who have severe diarrhea, defined as three or more loose stools per day. However, labs rarely have access to clinical criteria. “I asked the audience, and pretty much no one is enforcing this restriction.”
Regarding the second challenge, maintaining a seven-day interval between tests, Dr. Banaei notes that repeat testing was recommended in 2004 when only tests with low sensitivity (less than 70 percent) were available. With the introduction of qPCR for the bacterium’s cytotoxin, which has greater than 90 percent sensitivity, repeat testing is unnecessary. Yet clinicians continue to adhere to the old guideline. Dr. Banaei presented data from his investigation of 406 tests in 293 patients at Stanford Hospitals who had one or more repeat tests after a negative PCR (Luo RF, Banaei N. J Clin Microbiol. 2010;48:3738–3741). “Repeat testing within 7 days provided new information in only 2 (0.8%) out of 266 tests, or two (1.0%) out of 197 patients,” the authors concluded. Other investigators have found a similar outcome (Aichinger E, et al. J Clin Microbiol. 2008;46:3795–3797). Dr. Banaei presented the results of the study to the staff of the gastroenterology division, who agreed with the seven-day restriction.
To enforce the seven-day interval, Dr. Banaei set up alerts at the order entry and accessioning steps. “We used the hospital information system to look back for an order in the last seven days,” he says. “If there is an order, we let the clinician know that the test is not indicated.” If the clinician ignores the alert, they get one more warning. They are told the test is highly sensitive, and the data from the Stanford study are displayed. If the doctor persists in the order, an alert tells the doctor he or she is being audited and an e-mail is generated to the laboratory supervisor, who looks at the request more closely.
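The escalation described here amounts to a small decision rule, sketched below in Python. It is not Stanford’s implementation: the names are hypothetical, the hospital information system is reduced to a list of prior order dates, and the sketch assumes an order placed exactly seven days after the previous one is allowed.

```python
from datetime import date, timedelta
from enum import Enum


class Action(Enum):
    ACCEPT = "accept the order"
    FIRST_ALERT = "alert: test not indicated"
    SECOND_WARNING = "warning: show assay sensitivity and study data"
    AUDIT_AND_NOTIFY = "audit notice and e-mail to laboratory supervisor"


def next_action(order_date, prior_order_dates, overrides_so_far,
                window_days=7):
    """Pick the response to a C. difficile order.

    overrides_so_far counts how many alerts the clinician has already
    clicked past for this order (0, 1, or 2+).
    """
    window = timedelta(days=window_days)
    recent = any(timedelta(0) <= order_date - prior < window
                 for prior in prior_order_dates)
    if not recent:
        return Action.ACCEPT
    if overrides_so_far == 0:
        return Action.FIRST_ALERT
    if overrides_so_far == 1:
        return Action.SECOND_WARNING
    return Action.AUDIT_AND_NOTIFY


prior = [date(2013, 4, 6)]
for overrides in range(3):
    print(next_action(date(2013, 4, 10), prior, overrides).value)
```

Keeping the final step as a notification rather than a hard block matches the description above, in which the supervisor reviews the persistent request.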
“We wanted to find out how well this system worked, especially after we switched to a more sensitive assay,” Dr. Banaei says. (In 2012 they adopted the Cepheid GeneXpert, with 98 percent sensitivity.) During the 20 months after the restrictions were implemented, there was little repeat testing in the seven days after a negative test, with a sharp increase at seven days. Of the repeat tests in the first seven days, 100 percent remained negative.
Dr. Banaei is now analyzing the repeats in the second week. “Are some doctors accepting the negative and moving on,” he wonders, “which would mean we are reducing the absolute number of repeats? Or are we just delaying repeat ordering?” In any event, he concludes that IT tools can be used effectively to implement laboratory criteria for C. difficile testing.
Dr. Banaei showed how one could potentially use the hospital information system to apply clinical criteria. When qPCR for C. difficile toxin is ordered, the information system would search the electronic medical record to see if the patient has loose stool and if he or she has three or more episodes of loose stool per day. Dr. Banaei is working on implementing this program now.
Joan-Miquel Balada-Llasat, PharmD, PhD, D(ABMM), associate director of clinical microbiology and assistant professor of clinical pathology, Ohio State University Wexner Medical Center, described real-time monitoring and the clinical impact of result reporting in microbiology after incorporating QA metrics for preanalytical, analytical, and postanalytical stages of testing. Quantitative goals were set for all criteria.
In the preanalytical stage, the metric was requisition verification errors. The goal was 99 percent correct manual orders, which was being met already.
In the analytical phase, the program targeted four metrics (goals in parentheses):
◆ Gram stain correlation with final report (≥95 percent).
◆ AFB contamination rate (less than five percent) and blood culture contamination rate (less than three percent).
◆ Proficiency testing (100 percent correct).
◆ QC remedial actions (100 percent of QC failures documented).
For the month of April 2013, actual figures for these four metrics were as follows: 100 percent, 1.2 percent and 1.9 percent, 100 percent, and 100 percent, respectively. In that month, there were five QC failures, all documented. “We go over QC failures and double check what action was taken,” Dr. Balada-Llasat says. Three of the failures were low-control failures in molecular testing. “QC has to be signed by the lead technologist and by the director,” he says. “We cannot report any result if QC is out of range, and no result is reported until the problem is fixed.”
Postanalytical metrics are turnaround time for molecular viral testing and M. tuberculosis tests and corrected reports. Goals for TAT are more than 95 percent within four days for viruses and more than 95 percent in two days for M. tuberculosis. Achieved values for April 2013 were 100 percent for both metrics. For corrected reports, the target is fewer than two affecting patient care, fewer than four not affecting patient care, and fewer than six total. For April the numbers were one, four, and five, respectively. Educational talks or retraining is the corrective action for corrected reports.
One type of report error is clerical, such as using the code FUSP, which denotes Fusobacterium sp., instead of FUSPE, which indicates Fusarium sp. A processing error would be plating a routine throat culture only on BAP. “In this case, the technologist forgot to plate a chocolate plate, so there was no coverage for Haemophilus influenzae,” Dr. Balada-Llasat says.
He cites an example of a microscopic error involving a direct smear of cerebrospinal fluid, where the technologist reported no organisms seen, while Gram-negative bacillus was found on review. “When a physician orders Gram stain on CSF,” Dr. Balada-Llasat explains, “the sample may also be sent to anatomic pathology. They are looking for cells and sometimes use different stains. In this case the pathologist contacted me about cells in the CSF. That was a red flag.” Seeing neutrophils in the CSF is an alert that the patient might have an infection. “I reviewed the slide and agreed with the pathologist. It was an opportunity to make some changes.” Now, if neutrophils are seen but no organism is reported, a second technologist must confirm there are no Gram-negative or -positive organisms in the slide.
One of the more common corrected reports has been yeast preliminarily identified as Staphylococcus sp. based on morphology. “When you are dealing with immature colonies, yeast can be misidentified as staph,” Dr. Balada-Llasat says. “Our final identification is now based on mass spec and the preliminary morphology is called ‘yeast-like.’ Since we started doing Gram stain on a wet mount, that mistake has decreased.”
A Gram stain from a positive blood culture is verified in Dr. Balada-Llasat’s laboratory on the Verigene instrument, which identifies organisms by DNA hybridization and provides results in two hours. “We call the physician right away and don’t wait for confirmation by our other methods, which give results the next day.” The other methods are MALDI-TOF mass spectrometry and MicroScan. “In 99.9 percent of cases all methods match,” he says. “It is rare to have a discrepancy.” However, if the other test results don’t match those of Verigene, a corrected report is issued.
Dr. Balada-Llasat showed one case in which a positive blood culture stained as Gram-positive coccus and was identified as E. faecalis by Verigene. However, MicroScan identified the isolate as E. avium, confirmed by MS. “This revealed a weakness of the Verigene—cross-reactivity of E. avium with E. faecalis, which we have now noted,” Dr. Balada-Llasat says. In this situation there was no adverse patient effect.
Dr. Balada-Llasat started validating mass spectrometry two years ago and introduced it into the clinical laboratory more than a year ago. “It has been great for us. It’s quite impressive how sensitive it is. It has expedited identification. For bacteria, mycobacteria, yeast, and dimorphic fungi, you can use MALDI.” Right now they are only doing it on colonies from solid media and mycobacteria from liquid media and consider it their primary method for identifying organisms. However, it gives only the main organism in mixed infections. Also, clinicians and pharmacy want to know about the vanA, vanB, and mecA resistance genes. “MALDI can’t do that,” Dr. Balada-Llasat says. “Verigene gives us that in a couple of hours.”
Putting MALDI-TOF MS into clinical practice will introduce new QA challenges. “Mass spectrometry is increasingly being used in micro labs for routine identification,” says Dr. Butler-Wu of the University of Washington. Cumitech document 31A, “Verification and Validation of Procedures in the Clinical Microbiology Laboratory,” governs how a laboratory qualifies an instrument like MS for clinical application. A minimum of 200 isolates is required. The document says, “Whenever possible, these isolates should include all species identifiable by the new or revised test.”
“Those criteria might work for Vitek or Phoenix,” Dr. Butler-Wu says, “but it becomes impossible for something like MALDI,” for which “everyone is sort of reinventing the wheel.” Laboratories are validating independently and using identification score thresholds. “There are no best practices out there. I was just trying to start a dialogue about this,” she says of her remarks about mass spec in the symposium.
Mass spec is used primarily now in large medical centers. But two instruments—Bruker Biotyper and Vitek MS—are before the FDA. There will be FDA-approved databases, she says. For example, more than 3,000 organisms are in the Bruker Biotyper database, though it is predicted that IVD databases will be more limited.
In her own laboratory Dr. Butler-Wu has taken what she calls a “very conservative” approach, doing species-level identification of organisms they see in clinical practice. She has looked at 2,000 organisms. “That list is now becoming bigger and bigger,” she says. “In time it will become unmanageable.”
While Dr. Butler-Wu confidently concludes that integrating “trust lists” into the routine workflow of the clinical microbiology laboratory has the potential to reduce the number of organisms that require identification by sequencing, when it comes to mass spec, laboratories are back to square one. She says, “There is no consensus on the ideal way to validate MALDI-TOF MS for identification of routine isolates.”
William Check is a writer in Ft. Lauderdale, Fla.