
NGS bioinformatics pipeline—worries and wish lists: A look at the preanalytic, analytic, and postanalytic phases


William Check, PhD

October 2014—Last month the molecular and genomic pathology laboratory of the University of Pittsburgh Medical Center posted on the AMP listserv its requirements for a bioinformatics scientist to support next-generation sequencing for clinical testing. The requirements included, but were not limited to:

  • PhD in bioinformatics, computational biology, computer science, or biology with significant computational experiences; MS with the right combination of background and experiences also considered.
  • Basic knowledge about molecular biology and genomics.
  • Proficiency in Python and Linux/Unix/Mac environment.
  • Experience in analyzing NGS sequencing data strongly preferred.
  • Familiarity with commonly used databases and bioinformatics tools for NGS data analysis.

It would be hard to imagine a better illustration than this posting of the importance of bioinformatics to the successful execution of NGS in a clinical setting, and of the need for trained, experienced bioinformaticists to support clinical NGS.

Marina N. Nikiforova, MD, associate professor and director of the molecular and genomic pathology lab at UPMC, explained the stringent requirements in an email response to a CAP TODAY inquiry. “While technical issues of NGS can be handled by trained technologists, interpretation of NGS data involves highly specialized knowledge in both bioinformatics and biology. Therefore,” she wrote, “it is crucial in every NGS-based laboratory to have a bioinformatician on staff.”

Based on the experience in her laboratory, she added, having such a person is essential to building NGS pipelines and providing routine help with data interpretation and maintaining quality assurance in the NGS area.

These and other insights into bioinformatics for clinical NGS were the focus of a session at last year’s Association for Molecular Pathology annual meeting. Franklin R. Cockerill III, MD, of Mayo Clinic, who moderated, called it “intuitive” to have a session on bioinformatics for NGS at that time, which wouldn’t have been the case a few years earlier. The pace at which complex genomic analysis has entered the laboratory has outrun expectations.

One of the speakers, Federico A. Monzon, MD, then of Baylor College of Medicine and now medical director of oncology for Invitae, said, “A tsunami of genomic information is coming to us [pathologists]” from NGS. “It has taken me by surprise how fast it came to the clinical laboratory.”

Dr. Carter

Most health care institutions don’t develop their own laboratory software, another speaker, Alexis B. Carter, MD, of Emory University School of Medicine, said in a recent interview. “Because of the human resources needed to develop, test, and maintain software, most institutions prefer to purchase vendor-developed and vendor-supported software, but the analysis of NGS still requires support from trained people.” In her AMP talk, Dr. Carter, director of pathology informatics in the Department of Pathology and Laboratory Medicine and the Department of Biomedical Informatics, defined informatics as a science at the intersection of information, technology, and people.

“There are many kinds of informatics,” she tells CAP TODAY, “and NGS analysis involves both bioinformatics—information science at the molecular biology level—and clinical informatics—information science used in health care. A well-known informaticist said that informatics in general is 80 percent sociology, 10 percent information, and 10 percent technology.” This means, she says, that people are what you have to study to do informatics well.

“Managing people and implementing systems that enable humans to accurately and efficiently use computer systems to acquire, analyze, manage, and store information to improve patient care are central to informatics.”

Dr. Routbort

Mark Routbort, MD, PhD, of the University of Texas MD Anderson Cancer Center, another presenter, made a similar point in a recent interview. Three years ago, he said, his group’s typical approach to a new method was to put in a vendor system and validate it and create simple reports. “However, this approach would not scale to the complex data coming from NGS. We needed to annotate that data with clinical significance.” Dr. Routbort, director of computational and integrational pathology in the Division of Pathology and Laboratory Medicine, says the number of observable findings of known and unknown significance in NGS platforms “demands a quantum leap in terms of managing information.”

To handle this complexity, his department has one specialized bioinformatics person, and he himself has expertise in bioinformatics. “Basically, I got hooked on it when the laboratory was initially setting up clinical NGS and asked for my input,” he says. “It was quite illuminating—this is an area where pathology and informatics converge in a visceral way. To efficiently perform and report NGS testing for clinical and molecular diagnostics, you need a solid grounding in bioinformatics. You don’t have to program the pipelines, and I don’t. But I do program some downstream annotation and interpretation toolsets.”

Given the complexity of NGS and the need for expert bioinformaticists and a laboratory director who has a basic grasp of the software, is it reasonable for most laboratories to think about bringing in this technology?

“That depends a lot on your focus and what type of testing you are going to be doing,” Dr. Monzon tells CAP TODAY. Laboratories performing one or a few panels need bioinformatics resources but perhaps not full-time people. “However, you do need somebody in your department fully cognizant of the nuances of the process.” On the other end, laboratories doing many types of NGS need a larger informatics group.

There are already many off-the-shelf tools, he says, but users of those tools need to understand the complexity of the process and the algorithms that go into those tools. “When things go wrong or don’t work the way you expect, you need to know how to troubleshoot.”

At the AMP session, NGS bioinformatics was divided into the preanalytic, analytic, and postanalytic phases.

Dr. Carter defined the preanalytic phase as the right patient, the right test order, the right specimen, accessioning, aliquotting, and getting orders to instruments.

Most institutions use traditional methods of identifying the patient to whom a specimen belongs. However, some have suggested recently that DNA identification could be used to help the laboratory make sure it has the right patient. It is not rapid, with a turnaround time of 85 to 90 minutes, but its advantage over other biometrics is that it can be used to verify specimen identity when doing molecular testing. But “electronic health records have nowhere to put this data currently,” Dr. Carter said.

Getting the right order faces similar problems. In most EHRs, clinical decision support in computerized provider order-entry systems is not adequate. There is typically no place for informed genetic consent or genetic counseling, for example, or for avoiding duplicate ordering of germline genetic tests. Nor do CPOE systems help a provider choose the most appropriate test from complex tests like FISH, karyotyping, molecular, or an NGS panel.

What about getting orders to instruments? Dr. Carter introduced a survey she had taken among institutions with molecular information systems. There were 83 respondents, mostly academic and commercial reference laboratories. One question was: When your laboratory receives a specimen and logs it into your LIS, how do patient information and the test order get to the instrument that performs nucleic acid extraction? In 56 percent of responding laboratories, information is handwritten on a paper worksheet and manually typed into the instrument. In 26 percent, the barcode label on the specimen is scanned into the instrument. In 11 percent, the barcode label on a paper worksheet is scanned into the instrument, and in 8 percent the LIS automatically sends the patient’s information and test order to the instrument.

“How we manage information in molecular labs right now is not great,” Dr. Carter said. “We have computer systems that we could use to tailor workflow, yet the vast majority of labs are still walking pieces of paper around. People are carrying flash drives from an instrument to the LIS or from one instrument to another for NGS analysis.”

Answers to another question—how are you recording nucleic acid quantities when measured?—revealed a similar pattern. For 58 percent of respondents, the choice selected was “We write the information down on a paper worksheet.” Only 23 percent said, “We record it in our LIS in a specific field meant for that purpose.”

“Core labs use automation lines and HL7 interfaces to transport information between pieces of equipment involved in the analysis of the specimen. In the clinical NGS lab, we are just starting to move in that direction but it is really slow getting there,” Dr. Carter says. Because of the low volume relative to chemistry or the core lab, many molecular laboratorians have not been pushing for higher automation and support for HL7 interfaces from their instrument or LIS vendors, she says.

“I’m not aware of any molecular/genomic instruments that have a true real-time HL7 interface to the LIS. Given the complexity of the technology we are using, manual transcription of data between instruments and the LIS creates a real and sustained risk of error.”
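To make concrete what such an interface would replace, here is a minimal sketch of the kind of HL7 v2 order message an LIS could transmit to an extraction instrument instead of having a technologist retype patient data. The segment names (MSH, PID, ORC, OBR) are standard HL7 v2; the patient, order, and test values are invented for illustration, and a production interface would involve far more fields and acknowledgment handling.

```python
# Illustrative only: compose and parse a bare-bones HL7 v2 order (ORM^O01)
# message of the sort an LIS-to-instrument interface would carry, replacing
# manual transcription. All identifiers below are fictional.

def build_orm_message(patient_id, patient_name, order_id, test_code):
    """Compose an ORM^O01 order as pipe-delimited HL7 v2 segments."""
    segments = [
        "MSH|^~\\&|LIS|LAB|EXTRACTOR|LAB|202410010800||ORM^O01|MSG0001|P|2.5.1",
        f"PID|1||{patient_id}||{patient_name}",   # PID-3 = patient identifier
        f"ORC|NW|{order_id}",                     # NW = new order
        f"OBR|1|{order_id}||{test_code}",
    ]
    return "\r".join(segments)  # HL7 v2 separates segments with carriage returns

def parse_patient_id(message):
    """Pull the patient identifier (PID-3) back out of a received message."""
    for segment in message.split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID":
            return fields[3]
    return None

msg = build_orm_message("MRN12345", "DOE^JANE", "ORD789",
                        "NGSPANEL^Solid Tumor NGS Panel")
assert parse_patient_id(msg) == "MRN12345"
```

The point of the sketch is the round trip: once patient identity and the order travel as structured fields rather than handwriting, the receiving instrument can verify them programmatically instead of trusting a transcription step.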

In the future, Dr. Carter would like to see positive patient identification at the biometric level for all laboratory specimens prior to analysis and reporting, real-time HL7 interfaces to communicate molecular data between instruments and the LIS, electronic orders, no manual entry, and robust clinical decision support.

Information security, too, is critical, not only for patient privacy, but for another pressing reason: Under the federal HIPAA final security rule and HITECH, unauthorized disclosure, loss, or theft of protected health information can be prosecuted. Security breaches now require mandatory reporting by the institution or provider. Institutions with breaches involving more than 500 patients are now listed on what Dr. Carter calls the HHS “Wall of Shame” (http://j.mp/breachnotificationrule). The problem is that, by law, health data privacy is the responsibility of the provider and the institution and not the vendor who sold the LIS or instrument software, Dr. Carter points out. “This gets really interesting because some of the security requirements call for the software to be built with certain features. If those features are absent, the institution or provider cannot get them added unless the vendor agrees to incorporate them. If the vendor refuses, you could be stuck.”

For example, part of the HIPAA final security rule requires that the software have an audit trail so that the laboratory knows which users have looked at or manipulated a patient’s data in any electronic system containing these data. This includes instrument software, ancillary programs, and the LIS. “For any data since 2006, any patient can walk into any health care site and ask to see who has looked at their data. If the data are electronic, you have to be able to give them this information. Without an audit trail, you can’t.”
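The mechanism behind such an audit trail can be sketched very simply: every view, addition, or deletion of a patient record is appended to a log, which can later be queried to answer a patient’s disclosure request. This is an illustrative sketch, not a vendor implementation; the function and field names are invented, and real systems write to tamper-evident, append-only storage rather than an in-memory list.

```python
import time

# Illustrative audit trail: record every access to a patient record so the
# lab can answer "who has looked at this patient's data?" All names invented.

AUDIT_LOG = []

def audit(user, action, patient_id):
    """Append one audit entry; a real system writes to append-only storage."""
    AUDIT_LOG.append({
        "timestamp": time.time(),
        "user": user,
        "action": action,        # e.g. "view", "add", "delete"
        "patient_id": patient_id,
    })

def who_accessed(patient_id):
    """Answer a disclosure request: every user who touched this record."""
    return sorted({e["user"] for e in AUDIT_LOG if e["patient_id"] == patient_id})

audit("tech01", "view", "MRN12345")
audit("path02", "add", "MRN12345")
audit("tech01", "view", "MRN99999")
assert who_accessed("MRN12345") == ["path02", "tech01"]
```

Note that the log captures reads, not just modifications; as Dr. Carter’s example makes clear, merely viewing a record is an auditable event.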

In the survey, 55 percent of institutions said that at some time they had needed to know which employees added, deleted, or even just viewed a specific patient record. “Partner with your vendor to make sure you are getting what you need and that the software you are purchasing meets all of the federal requirements for health data security and privacy,” Dr. Carter advises.

With the massive amounts of data generated by sequencing, people are starting to look to cloud storage as a solution, and some cloud storage vendors are advertising themselves as HIPAA-compliant. Dr. Carter cautioned that three categories of security are required to make storage of patient health information HIPAA-compliant: administrative safeguards (policies and procedures), physical safeguards (safeguarding the hardware), and technical safeguards (what will you build into your software to keep unauthorized people from getting in? who has looked at which patient records?).

“Amazon Cloud started advertising itself as HIPAA-compliant,” and she has heard that some have started putting NGS data on the Amazon Cloud. “But Amazon only had their servers set up so that there were physical safeguards on the hardware,” she says. “To some extent, access to hardware via remote service met HIPAA rules.” But technical safeguards also require unique user identification, passwords, an audit trail, and permissions to ensure people can access only the part of the software they need to get their jobs done (so-called minimum necessary rule). “Cloud storage systems that advertise as being compliant with HIPAA may be compliant with only some of the requirements. Laboratories should verify that all security requirements are met before placing health data on the cloud.”
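The “minimum necessary” permissions Dr. Carter describes amount to mapping each unique user ID to a role, and granting each role only the software modules required for the job. The roles, users, and module names below are invented for illustration; the pattern, not the specifics, is the point.

```python
# Illustrative "minimum necessary" access control: each role is granted only
# the modules needed for its job. Role and module names are fictional.

ROLE_PERMISSIONS = {
    "extraction_tech": {"accessioning", "extraction"},
    "bioinformatician": {"pipeline", "variant_review"},
    "lab_director": {"accessioning", "extraction", "pipeline",
                     "variant_review", "sign_out"},
}

USERS = {"tech01": "extraction_tech", "binf01": "bioinformatician"}

def can_access(user_id, module):
    """True only if the user's role explicitly grants this module."""
    role = USERS.get(user_id)
    return module in ROLE_PERMISSIONS.get(role, set())

assert can_access("tech01", "extraction")
assert not can_access("tech01", "variant_review")  # denied: not needed for the job
```

A cloud vendor supplying physical safeguards alone leaves this entire technical layer, along with unique user IDs and the audit trail, as the laboratory’s problem to verify.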

Dr. Routbort began his AMP talk with the impact of the tissue sample on analysis and interpretation in cancer, for which tissue is often limited. Limited tissue, together with tumor purity, tumor heterogeneity, and the expected allelic frequency of variants, must be weighed in determining optimal read depth.
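The interplay of purity, allelic frequency, and read depth can be illustrated with a back-of-the-envelope calculation: for a heterozygous variant in a specimen that is only a fraction `purity` tumor, the expected variant allele fraction is roughly purity divided by two, and the probability of sampling enough variant-supporting reads follows a binomial model. The numbers and thresholds below are illustrative, not clinical guidance, and the model ignores sequencing error and sampling bias.

```python
from math import comb

# Illustrative binomial model of variant detection: each read is treated as an
# independent draw that carries the variant with probability allele_fraction.
# Thresholds below are examples, not clinical recommendations.

def detection_probability(depth, allele_fraction, min_variant_reads):
    """P(at least min_variant_reads of `depth` reads carry the variant)."""
    return sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction)**(depth - k)
        for k in range(min_variant_reads, depth + 1)
    )

purity = 0.20            # specimen is 20% tumor cells
vaf = purity / 2         # expected allele fraction for a heterozygous variant

# Requiring at least 5 supporting reads, deeper sequencing raises the chance
# of detection for the same low-purity specimen:
p_100x = detection_probability(100, vaf, 5)
p_500x = detection_probability(500, vaf, 5)
assert p_500x > p_100x
```

This is why a read depth that is ample for a high-purity resection specimen can be inadequate for a sliver of biopsy tissue with low tumor content: as purity falls, the expected allele fraction falls with it, and depth must rise to compensate.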
