In whole genome era, whole new challenges

CAP Today

June 2011
Feature Story

Innovation, digital consultation, pharmacogenetics, and national electronic pathology networks—these and other topics were the talk of Futurescape of Pathology, the fourth such CAP Foundation conference. For one presenter, pathologist Wayne W. Grody, MD, PhD, it was all about next-generation sequencing and whole genome analysis and their effect on pathology practice. Here is what Dr. Grody told those who sat before him April 17 at the InterContinental Hotel in Rosemont, Ill.

Dr. Grody is professor in the Divisions of Medical Genetics and Molecular Pathology in the Departments of Pathology and Laboratory Medicine, Pediatrics, and Human Genetics, University of California-Los Angeles School of Medicine. In March he became president of the American College of Medical Genetics, a position he called an “ultimate example of the transformation effort” and “a way to bridge the two worlds.”

I don’t like the term personalized medicine. I never use it in my own conversation. It’s demeaning in a way to the history of medicine to say we’re only now reaching personalized medicine just because we have DNA tools. The Hippocratic Oath pledges our allegiance to the patient in front of us, not to any other patients or conflicting influences. When I have my yearly checkup with my own internist, at least during the 12 minutes he’s spending with me, I’m hoping he’s thinking only of me and not about the other 40 patients he will see that day. We’ve always had personalized medicine, and we’ve always had personalized drug prescribing, and I would therefore use the personalized medicine term to mean simply molecular medicine. The latter is just one more tool to use in personalizing care. Is whole genome sequencing the ultimate in personalized medicine? You might say it is, because each of our genomes is unique unless we’re identical twins with someone, and so that would be the most personalized representation of the patient if we actually knew what to make of all the data.

I consider the difference between somatic and germline mutations to be an important distinction, biologically and, frankly, politically, because it’s where pathology and genetics have had their interface and occasional turf battles, the idea being that germline or inherited mutations are the domain of geneticists whereas somatic-acquired changes (namely, in tumors) would fall within pathology. This division is going to blur even more as we embark on whole genome analysis. If you look at the whole genome sequence of a tumor, you’ll find the somatic changes, but you’re also going to stumble on all kinds of inherited changes that you’ll have to deal with, whether it’s done in pathology or genetics.

The field of molecular pathology has reached at least late adolescence, if not maturity, over the past 20 years. Through all of that time, whatever the technique used, the paradigm has largely been one gene for one disease (or sometimes two genes such as BRCA1 and BRCA2, or limited gene panels), and that’s what was tested for. The techniques used were all designed to look at regions of a specific gene where a mutation might be lurking.

We’re moving now into the whole genomic era, and the new techniques theoretically are going to give us access to all genes and all diseases. Whether done by high-density microarray hybridization or whole genome sequencing, the notion of accessing all genes/all diseases at once is a radical paradigm shift.

But there is one form of whole genome analysis that’s been in clinical use for several years now, and I order it myself when I’m seeing children in the genetics clinic, and that is array comparative genomic hybridization. It’s essentially a molecular karyotype, measuring hybridization to over a million DNA probes spaced all across the genome. It has a resolution much finer than you can see under the light microscope looking at the banded chromosomes.

Fig. 1 would be the kind of readout you’d get from it. Each of the boxes shows the competitive hybridization signal for an individual human chromosome. If your patient’s genome doesn’t have any gains or losses, at least at the resolution you’re trying to look at, everything comes out along the midline, the one-to-one ratio. But if there is a deletion, such as the one causing the X-linked disorder glycerol kinase deficiency shown in Fig. 2, the patient’s DNA does not hybridize as well, and the signal falls below the line. Although this is a fairly large deletion, you can detect deletions as small as you’d like. It could be a few kilobases if you chose to get down to that level.
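As a rough illustration of the readout described above, the following Python sketch converts patient and reference probe intensities to log2 ratios (0 corresponds to the one-to-one midline) and flags runs of consecutive probes that fall below the line. The intensities, the -0.5 threshold, the three-probe minimum, and the function names are all assumptions made for the example, not the method used by any particular laboratory or array platform; real analysis software relies on more robust statistical segmentation.

```python
import math

def log2_ratios(patient, reference):
    """Log2 of patient/reference signal per probe; 0.0 is the one-to-one midline."""
    return [math.log2(p / r) for p, r in zip(patient, reference)]

def find_deletions(ratios, threshold=-0.5, min_probes=3):
    """Return (start, end) probe index ranges (end exclusive) where at least
    min_probes consecutive probes fall below the threshold.
    Both cutoffs are illustrative assumptions."""
    segments = []
    start = None
    for i, value in enumerate(ratios):
        if value < threshold:
            if start is None:
                start = i  # a candidate deleted segment begins here
        else:
            if start is not None and i - start >= min_probes:
                segments.append((start, i))
            start = None
    # Close a segment that runs to the last probe.
    if start is not None and len(ratios) - start >= min_probes:
        segments.append((start, len(ratios)))
    return segments

# Hypothetical intensities for ten probes along one chromosome.
reference = [100.0] * 10
patient = [100.0, 98.0, 102.0, 50.0, 52.0, 49.0, 51.0, 101.0, 99.0, 100.0]

ratios = log2_ratios(patient, reference)
deletions = find_deletions(ratios)
print(deletions)
```

Probes 3 through 6 carry about half the reference signal, a log2 ratio near -1, which is what losing one of two copies would produce, so they are reported as a single candidate deletion.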

This technique is in use now, and as the price comes down almost to what a karyotype is, it has begun to replace the karyotype. Late last year, the American College of Medical Genetics issued a consensus statement that said array CGH should now be the first-tier test instead of a karyotype on children with congenital malformations that do not seem to fit any recognizable syndrome, or with developmental delay or autism. In our genetics clinic, we see a lot of children in those categories. We used to automatically order the karyotype; now we just skip that and go right to the microarray.

But it is very tricky, and it’s just a taste of what we have to expect when we move on to whole genome sequencing. I’m also the medical director of the lab at UCLA that does this, and I can tell you, looking at it from either end, from the orderer of the test or the lab director signing it out, there’s a lot of speculation involved. You do the test on a child with autism, and you get a deletion on chromosome 14 that has 30 genes in it. You have to look up each of those genes, what’s known about them, the rare case reports in the literature describing phenotype, and decide whether that explains your patient’s disorder. But it’s all somewhat speculative.

As an example, recently I saw a young child with autism and no obvious cause for it, so we ordered this array test and found a novel deletion. There were two genes in it: one with unknown function and the other described in two families in the literature with dominantly inherited epilepsy. So is that the gene? The child has autism, not epilepsy, but they’re both neuronal migration disorders. They could be related, I suppose, but who can say for certain? Are we confident enough in such a finding that the child’s parents could use it as a marker for prenatal diagnosis in a subsequent pregnancy? Some people might say we shouldn’t even be doing it then, until we know more. But like most things in science, if you impose a moratorium, you’ll never accrue any more knowledge. So we have to just take the plunge and make sure we have enough disclaimers in the test report that patients are protected.

The other challenge that arises with this test is the finding of polymorphic copy number variants, which are present by the tens of thousands throughout all our genomes. Many of them are catalogued and known to be benign. But many others have not been reported yet, and you can find these in virtually every case, and you have to sort those out as well.

What about when we get to actual sequencing? Sequence variants at the nucleotide level are even more common, where you have a nucleotide change, two different nucleotides present at the same position, which would be heterozygous. Roughly one of every thousand nucleotides in our genomes is polymorphic in this way, so you will see millions of them whenever you do a whole genome sequence. You’re going to have to sort through them. These variants in aggregate have been dubbed the ‘incidentalome.’ In anatomic pathology, we have the ‘incidentaloma’; it’s pretty much the same thing—an unexpected finding you wish you didn’t see but now you have it and you have to learn somehow to deal with it.

To sequence the first human genome, which was the end product of the Human Genome Project, took 13 years and more than $3 billion. If we were still at that turnaround time and cost, we certainly would not be talking about its clinical use. Why did it take so long? It was all done with the traditional method of DNA sequencing called Sanger sequencing where we use the target DNA as a template to be replicated by DNA polymerase into a ladder of fragments that are then measured on gels or capillary electrophoresis after the sequencing is done. That’s the method that has been used in molecular diagnostic labs to date; it can sequence about 200-nucleotide lengths in a run or in a day. Obviously, that’s not going to be adequate for the human genome that’s 3 billion nucleotides long. And that’s just the haploid genome—the full diploid genome in our cells is 6 billion nucleotides long.

What has opened the door? The last few years have seen the advent of so-called next-generation or massively parallel sequencing that can do a billion or more nucleotides in one run. The various next-gen sequencing platforms work in different ways, but, in general, rather than just sequencing something off the patient’s template and then measuring it by size, they capture the sequence synthesis of millions of short reads (shotgun sequencing) in real time, adding each nucleotide with different luminescent colors at the speed of the polymerase, with the instrument taking snapshots of each microsecond addition and the different colors indicating what nucleotide has been inserted at that position. That’s how these next-generation sequencing instruments can have such high throughput, though they also require tremendous ancillary computer power and memory storage to reassemble all these millions of fragments into the entire genome sequence.

And now we are starting to see the so-called next-next-generation or third-generation platforms. These instruments further increase the throughput and lower the cost by using exotic detection methods (such as micro-pH instead of light) and single-molecule sequencing. There are companies claiming that in a few years they’ll be able to sequence the entire human genome in 15 minutes for $100. Just as array CGH took over when its price approached that of karyotype, it’s going to be harder and harder to justify spending thousands of dollars to sequence one or two genes when, for the same price, you can have the entire genome, even though you’re stuck with a lot of ‘incidentalome’ data you won’t know what to do with.

Yet most diseases are still at the one- or two-gene level as far as our current level of understanding. Is there any disease state for which I would want to know the status of 1,000 genes or 10,000 genes, let alone the entire genome? Maybe some day in the future, but we’re not there now.

The few laboratories that have started offering next-generation sequencing clinically are using it for circumscribed panels of related genes in a disease family, not the whole genome. Hypertrophic cardiomyopathy is an example. There are about 20 different genes that all cause the same phenotype. The echocardiogram looks the same, so there’s no way to tell clinically which gene is involved. It would be too much sequencing by Sanger to do all 20 of them the old-fashioned way, but with next-generation sequencing, you can do it. And the same is true with dilated cardiomyopathy, the conduction defects, long QT syndrome, retinitis pigmentosa, albinism, mental retardation, hearing loss, and any of the other disorders caused by a multitude of different genes. This seems to be the first practical use of next-generation sequencing for clinical purposes.

We are just starting to see this technology applied to neoplastic disorders. One theoretical advantage of targeting tumors as opposed to germline changes in hereditary disorders is that all the millions of variants, polymorphic or not, could cancel each other out in cancer because you can compare the whole genome sequence of the tumor with the whole genome sequence of DNA from blood cells (as long as it’s not a leukemia) or some other nonmalignant tissue, and then theoretically only the variants that are unique to the tumor might be the important ones. However, we should be careful not to underestimate the challenges in this setting as well. We all know that tumors have many DNA replication and repair defects causing all kinds of secondary mutations. Somehow we are going to have to sort through them and separate cause from effect.
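The tumor-versus-normal comparison described above amounts, in its simplest form, to a set difference. The Python sketch below assumes each variant is represented as a hypothetical (chromosome, position, reference allele, alternate allele) tuple; the coordinates are invented, and the sketch ignores sequencing error, coverage differences, and tumor heterogeneity, all of which a real pipeline would have to handle.

```python
def somatic_candidates(tumor_variants, normal_variants):
    """Variants seen in the tumor but not in the matched nonmalignant tissue.
    Inherited (germline) variants appear in both lists and cancel out."""
    return sorted(set(tumor_variants) - set(normal_variants))

# Hypothetical variant calls: (chromosome, position, ref allele, alt allele).
normal = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 5000, "C", "T"),
]
tumor = [
    ("chr1", 1000, "A", "G"),   # inherited, also present in blood
    ("chr2", 5000, "C", "T"),   # inherited, also present in blood
    ("chr17", 7500, "G", "A"),  # present only in the tumor
]

candidates = somatic_candidates(tumor, normal)
print(candidates)
```

The variants that survive the subtraction are only candidates: as noted above, many of them may be secondary mutations rather than causes of the tumor.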

It’s possible to limit analysis to the exome (the coding regions), and that makes sense, in my view. At least you get rid of the majority of variants in the human genome by not looking at the noncoding/intervening/repetitive/‘junk’ sequences, so to speak. Unfortunately, you will miss some things. We’re learning more and more about epigenetic effects, about enhancer regions of genes that are many megabases away from the main gene, and we won’t see those if we look only at the coding regions. Also, the current exon-capture techniques unavoidably miss some regions, so what we call ‘whole exome’ sequencing is not really ‘whole.’ But at least it’s a way for the data to be a little more manageable.
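To make the idea of restricting analysis to the coding regions concrete, here is a minimal Python sketch that keeps only the variants falling inside a list of exon intervals. The interval coordinates, the half-open (start, end) convention, and the function names are assumptions chosen for the example; the sketch also shows how a variant in a distant regulatory region would simply be dropped.

```python
def in_exons(position, exons):
    """True if position lies in any half-open [start, end) exon interval."""
    return any(start <= position < end for start, end in exons)

def filter_to_exome(variants, exons_by_chrom):
    """Keep only (chromosome, position) variants located inside an exon."""
    return [
        (chrom, pos)
        for chrom, pos in variants
        if in_exons(pos, exons_by_chrom.get(chrom, []))
    ]

# Hypothetical exon coordinates and variant positions.
exons_by_chrom = {"chr1": [(100, 200), (500, 650)]}
variants = [
    ("chr1", 150),  # inside the first exon
    ("chr1", 300),  # intronic, dropped
    ("chr1", 649),  # last base of the second exon
    ("chr1", 650),  # just past the second exon, dropped
    ("chr2", 150),  # no exons listed for this chromosome, dropped
]

exonic = filter_to_exome(variants, exons_by_chrom)
print(exonic)
```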

What should whole genome or whole exome sequencing be applied to as the price continues to come down? We have blood spots on every newborn that we use for biochemical tests for PKU. We can get DNA out of them. Should we screen every newborn in this way? Or would that bring us dangerously close to the world of ‘Gattaca’?

I would not go anywhere near prenatal diagnosis at this point. The idea of a couple making an irreversible decision about pregnancy termination based on a variant whose clinical significance we’re not sure about is too scary at the moment. What about couple screening? We’re starting to look into this in our group. All couples now are offered screening for cystic fibrosis mutations, and certain genetic diseases based on ethnicity, but there are about 10,000 to 12,000 other rare recessive disorders that you wouldn’t screen for, and a couple wouldn’t know they’re each a carrier of the mutation until they have a baby with one of these unfortunate diseases. What if we did whole exome sequencing of the mother and father, and for any one gene for which each had a variant, we could then do the prenatal diagnosis, and if the baby inherited both parental variants or mutations, maybe that’s the disease of interest? In some ways this application overlaps with gene discovery, and until those variants have been correlated with a phenotype, it remains somewhat speculative, so we have a ways to go before such information becomes actionable.
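The couple-screening idea above can be sketched as an intersection of two gene lists. In this Python example the gene lists are invented (GENE_A and GENE_B are placeholder names), and the risk figure assumes the simplest case: an autosomal recessive disorder with each parent heterozygous for a single variant in the shared gene, so that each parent passes the variant on with probability one half.

```python
def shared_carrier_genes(mother_genes, father_genes):
    """Genes in which both parents carry a variant."""
    return sorted(set(mother_genes) & set(father_genes))

# Probability a child inherits both parental variants, assuming each
# parent is a heterozygous carrier of an autosomal recessive variant.
RISK_BOTH_VARIANTS = 0.5 * 0.5

# Hypothetical exome results: genes with a variant in each parent.
mother_genes = ["CFTR", "HEXA", "GENE_A"]
father_genes = ["CFTR", "GENE_B"]

shared = shared_carrier_genes(mother_genes, father_genes)
for gene in shared:
    print(f"{gene}: chance of inheriting both variants = {RISK_BOTH_VARIANTS:.0%}")
```

The intersection only identifies genes worth following up. Whether the two variants are actually disease-causing is the speculative part discussed above.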

What about screening the entire population? Some people think we will be doing that before long. I would be wary of it again for the ‘Gattaca’-type risks of employment discrimination, stigmatization, and other negative repercussions. In principle, genetic discrimination should now be illegal in the United States, thanks to GINA, the Genetic Information Nondiscrimination Act, which says you cannot raise insurance rates or bar someone from employment or drop them from insurance coverage because of a genetic test result. But it has yet to be tested in court.

We also have to worry about the potential for whole genome sequencing in children to reveal mutations associated with adult-onset diseases. I served as the CAP’s representative on the NIH Task Force on Genetic Testing. One of our recommendations was that genetic testing for adult-onset diseases should not be done in healthy children unless there’s an intervention you would need to do in childhood to prevent it. For example, we wouldn’t do a BRCA test on a five-year-old girl because she’s not going to get breast cancer at that age. A prophylactic bilateral mastectomy or oophorectomy wouldn’t be done on a prepubertal youngster, and we need to be cautious about doing the whole genome scan in children for the same reason.

If you do a whole genome sequence on someone, what unexpected (incidental) findings might you get? There are the many novel missense variants, and we see these even in targeted sequencing tests on well-known genes. You can also see them in unknown genes, as well as variants that look like they could be deleterious but are found in a gene whose function we don’t know, so the clinical significance is similarly up in the air. I worry most about the so-called off-target results. Let’s say we’re doing a whole genome sequence on a baby with developmental delay, trying to explain it, and we don’t find anything that explains the delay but we do discover a mutation in the BRCA1 gene. Do you report that? It’s not going to affect the baby or child, and I’ve already said we wouldn’t do such a predictive test in children; on the other hand, one could argue that such a discovery could be valuable to the extended family in other ways. So I think we are on the cusp of a real sea change in test ordering and reporting, where pathologists and geneticists will need to be the gatekeepers for appropriate use and filtering of what genomic findings are reported.

I don’t know the right answer at this point, but I suspect patients are probably going to be offered a ‘Chinese restaurant’ menu of which results they want to be informed about and which they don’t, or which ones could perhaps be saved in secure storage for the future. For the first time in the history of laboratory medicine, they’ll have to choose from among these nuances in advance. What would be the options? Receive all the genomic information? We can’t provide it on a piece of paper like other lab reports; we would need a DVD or portable hard-drive to hold it all. I don’t know what purpose it would serve other than that the person could carry it around all the time, like their list of drug allergies. They could choose to receive only the relevant or targeted information, the genes we think are related to the particular phenotype. They could receive off-target results if they’re medically actionable for that person’s age. But what about reporting a BRCA mutation finding for the benefit of relatives, even though it is of no use to the child being tested? I would have ethical concerns about that, but we do need to begin to think about it.

The problem goes beyond simply revealing off-target mutations. For many of the mutated genes we find, the disorders won’t have any prevention or treatment. And because we also obtain a sequence-based DNA fingerprint in the process, it may reveal false paternity, if we also test the parents. Or government agencies could demand the fingerprint for forensic use. The technique can also reveal long stretches of homozygosity, which would be a hint that you are dealing with a consanguineous mating, or even worse, incest—and then how do you handle the legal and medical implications of that? How are we going to manage the counseling for all these results? Genetic counselors already spend one to two hours discussing some of the complex single-gene tests. You just can’t multiply that time by 25,000 for the whole genome.
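The long stretches of homozygosity mentioned above can be illustrated with a short Python sketch that scans genotypes ordered along a chromosome and reports runs of consecutive homozygous markers. The genotypes and the five-marker minimum are assumptions for the example; real analyses measure such runs in megabases across many thousands of markers and allow for genotyping error.

```python
from itertools import groupby

def runs_of_homozygosity(genotypes, min_markers=5):
    """Return (start, end) index ranges (end exclusive) of consecutive
    homozygous markers at least min_markers long.
    genotypes is a list of (allele1, allele2) tuples in chromosomal order."""
    runs = []
    index = 0
    for is_homozygous, group in groupby(genotypes, key=lambda g: g[0] == g[1]):
        length = len(list(group))
        if is_homozygous and length >= min_markers:
            runs.append((index, index + length))
        index += length
    return runs

# Hypothetical genotypes at nine markers along one chromosome.
genotypes = [
    ("A", "G"),
    ("C", "C"), ("T", "T"), ("G", "G"), ("A", "A"), ("C", "C"), ("T", "T"),
    ("A", "T"),
    ("G", "G"),
]

runs = runs_of_homozygosity(genotypes)
print(runs)
```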

And how are we going to store all the data?

In principle, this test needs to be done only once because the results shouldn’t change, but we will need some way to keep the information from it manageable and accessible, maybe in the developing cyberspace of cloud computing. And how do you deal with the huge number of novel missense variants that have never before been seen? The CAP’s molecular pathology inspection checklist explicitly states that it is the responsibility of the laboratory director to try to figure out those variants and put in the lab report what you think they may be (are they more or less likely to be deleterious?), so you can’t just punt that to the clinician.

What will be the tipping point?

At a UCLA symposium a few years ago we asked people to imagine that the whole genome could be sequenced for $1,000. Most people think that’s when it’ll become routine. With our current platforms, it’s probably about $6,000, but the price is dropping fast. I have no doubt that the technology will make it very cheap, certainly in the $1,000 to $1,500 range. But I’m concerned about one other factor, not related to technology, that would keep the price much higher: We don’t have free access to the whole genome. Much of it is owned by private interests, biotech companies and others, who may or may not allow you to look at that part of the genome. We’re talking here about gene patents, which would come with their own royalties, and if there are hundreds or thousands of royalties, you’re never going to have a $1,000 test. There are countless disease genes that are now part of intellectual property. These patents are written very broadly; they don’t care what method you use, even if it’s one you developed in your own lab. It’s the genetic information of that gene itself that you are barred from looking at. So essentially you end up with a company that has a monopoly on the gene, and I think this would extend even to gene therapy because it owns the sequence information that would need to be used for gene replacement.

In my own lab, we or our university lawyers have received about 10 different letters over the years ordering us to cease and desist from testing. For each one, the gene has an exclusive patent-holder and they don’t want anyone else doing the test, or they’re charging such outrageous patent royalties that we couldn’t afford it anyway.

I don’t have to tell you the problems this causes us. Monopolies inevitably lead to higher prices than necessary. There’s no way to know for sure if those labs are doing a good job if there’s no peer comparison. The CAP is not going to develop proficiency tests for just one customer, so it affects quality assurance. And how are we ever going to do whole genome sequencing if we don’t have access to large parts of the genome? Do we not sequence those parts? Do we tell the computer to mask them and not give us the results? Then we can’t really tell the patient in good conscience that they’re getting a whole genome sequence.

The CAP and its sister organizations (especially ACMG and AMP) have been fighting for many years to try to get around these patents or live in some workable symbiosis with them, and we’ve always come up against a brick wall. Recently, though, there’s been encouraging news. Out of the blue, the ACLU decided to get interested in gene patents. This is highly unusual for them. The ACLU is a Bill of Rights organization, whereas patent law is in Article I of the Constitution, so they don’t normally deal with those parts. The ACLU had never before dealt with any biotech case or patent case, but it decided there was a freedom-of-access element to it for patients needing genetic tests. The lawsuit it sponsored is officially known as Association for Molecular Pathology et al. v. Myriad Genetics, U.S. Patent and Trademark Office, et al.

The plaintiffs consist of several organizations, and the CAP has been prominent. There are also some academic laboratory directors who had to stop doing this testing because of the patent, and some breast cancer survivors who claim they wanted a second opinion and couldn’t get it. These are the plaintiffs’ key arguments: Genes are products of nature, not inventions. It’s unconstitutional to patent a person’s individuality. Patients are prevented from seeking second opinions. Gene patents are overly broad. And Article I of the Constitution bars patenting of laws of nature, products of nature, and abstract ideas. So, for example, Einstein could not have patented E=mc², even if he had thought to do so. It’s a concept, it’s a law of nature, not an invention.

The suit was initially filed in May 2009 in the New York Southern District Federal Court. The judge of that court said he would hear the case, and after many months, he issued a ruling on March 29, 2010, that went in favor of the plaintiffs’ position that genes are products of nature and therefore ‘are deemed unpatentable subject matter.’

The defendants appealed the decision and the case is now in the Court of Appeals for the Federal Circuit. The oral arguments are being heard as we speak, and it probably doesn’t matter what that court decides because whichever side loses is going to appeal, and there’s only one stop left, and that’s the U.S. Supreme Court.

The case gets to the heart of who owns a disease, the patient or a vendor. More specifically, it deals with who has access to the sequence information contained in a genetic disease gene—and, by extension, who can access the sequence information contained in any or all disease genes and the whole genome. And on that question hinges much of the future of whole genome sequencing.