Editor: Deborah Sesok-Pizzini, MD, MBA, adjunct professor, Department of Clinical Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia.
Laboratory considerations for releasing next-generation sequencing data to patients
April 2025—Regulatory requirements in title 45, section 164.524 of the Code of Federal Regulations state that covered entities must provide patients or their designated representatives with patient health care records upon request. This is true for all laboratory testing, including such complex testing as next-generation sequencing (NGS). The protected health information that can be requested includes billing and payment records and clinic notes. Exceptions to the requirement to provide protected health information are very limited. However, questions arise with regard to complex laboratory testing, such as what information related to genomic testing should be included in the data set and what should be taken into consideration for the release and receipt of this information. In accordance with federal guidance, an organization must provide, within 30 days, a “copy of the complete test report, the full gene variant information generated by the test, as well as any other information in the designated record set concerning the test.” The American College of Medical Genetics and Genomics interprets this statement to include even raw sequencing data as part of the full gene variant information. However, which NGS files legally must be a part of the designated record has not been definitively established. The authors conducted a study to describe the laboratory implications of releasing different NGS data files and the limitations for use of the data. They reviewed pertinent literature, including title 45 of the CFR and the Department of Health and Human Services’ guidance on individuals’ rights to access their health information. The authors noted that NGS testing includes all phases of testing: preanalytic, analytic, and postanalytic. According to accreditation standards, this end-to-end testing from the wet bench portion to the data processing (bioinformatics) portion requires the validation and verification of all components.
With regard to multigene panels, the steps of variant classification, interpretation, and reporting, also known as genomic sequencing procedures, are typically performed together. However, variants may be reclassified or reinterpreted separately at a later time. In undiagnosed hereditary disorders, the laboratory that performed the initial study may be asked to reanalyze the exome or genome testing based on new medical information. After careful review, the authors confirmed that from an accreditation standpoint, validation of NGS includes the wet bench and bioinformatics portions. Appropriate validation of this testing is needed to ensure quality results. The intermediate files that are generated from NGS but have not undergone full validation are often kept by the laboratory. Although patients may request these intermediate files, uses of these data are limited, and patients may not be aware that the data have not undergone full validation. The authors recommend that laboratories support patients’ decisions to obtain their health data while educating patients about the content and limitations of laboratory data. They emphasized that nonvalidated genomic data should not be used for clinical purposes without being confirmed using validated methods.
Moyer A, Loo E, Cadoff E, et al. Laboratory considerations for releasing next-generation sequencing data to patients. Arch Pathol Lab Med. 2025;149(2):152–158.
Correspondence: Dr. Sophia Yohe at [email protected]
A deep learning approach to predicting blood group antigens from genomic data
Deep learning techniques have revolutionized biology and medicine by making it possible to analyze large data sets, including next-generation sequencing and digitized biological data. Interest in applying newer deep learning techniques to large biological data sets has grown. One area of interest is using deep learning with the ABO blood system. Forty-four blood groups that cover 354 antigens have been discovered. Testing for ABO and RhD is performed routinely due to the significance of such testing to transfusion. However, the consistency of testing for the remaining blood antigens varies by geographic location, or patients are only tested in special situations, such as for transfusions for sickle cell disease. The significant cost of performing additional antigen typing and individualized polymerase chain reaction testing for each blood group is a barrier to widespread testing. However, having more comprehensive blood group antigen profiling can help in matching donors and recipients, especially for rare blood types. The use of deep learning techniques in large data cohorts may also enable additional correlation studies that may predict more disease associations. Typing many blood groups simultaneously will make such analysis much more cost-effective. The authors conducted a study to adapt deep neural network techniques to the prediction of blood group phenotypes based on genotypes and phenotypes available using screening array genotype platforms. They combined blood types from blood banks and imputed screening array genotypes from approximately 111,000 Danish and 1,168 Finnish blood donors. Genotype imputation is a statistical technique that infers unknown genotypes in one person based on information from other people. The quality and certainty of this analysis are measured with an associated information score. Donors suspected of having erroneous or mismatched genotypes were removed from the analysis.
The genomic region used to train the deep learning model was narrowed down to all available variants within the genes associated with a given RBC antigen. The resulting validated blood type prediction models covered 36 antigens in 15 blood group systems. A denoising autoencoder was used as the initial step, followed by use of a convolutional neural network blood type classifier to account for any missing genotypes. Two-thirds of the trained blood phenotype prediction models demonstrated an F1 score above 99 percent. Deep learning models for antigens with low or high frequencies or complicated antigens such as RhD were the most challenging with regard to achieving an F1 score of more than 99 percent. In the Danish cohort, only four of the 36 antigen models (Cob, Cw, D-weak, and Kpa) did not achieve a prediction F1 score above 97 percent. The models showed the same high predictive performance in the Finnish cohort. However, while the models trained on the Danish cohort worked in the Finnish cohort, this may not apply to genetically distant cohorts. The authors concluded that when using deep learning models and array chip genotypes, a variety of blood groups demonstrated a high degree of accuracy for predicting blood phenotypes. These results were shown in the majority of blood groups, even those with more complex genetic underpinnings. This study suggests that these deep learning techniques can help identify donors in a donor pool to narrow down the number that require confirmation testing.
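The F1 figures quoted above combine precision and recall into a single number, which helps explain why antigens with very low or very high frequencies are harder to score well. The following is a minimal Python sketch of that calculation, not the authors' code. The function name `f1_score` and the toy donor data are hypothetical, made up here only to illustrate the arithmetic for a rare antigen.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data (not from the study): 1,000 donors,
# 10 of whom carry a rare antigen (label 1).
y_true = [1] * 10 + [0] * 990
# Assumed predictions: 8 carriers found, 2 carriers missed,
# 2 non-carriers wrongly flagged, 988 non-carriers correctly cleared.
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 988

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)

print(f"accuracy = {accuracy:.3f}, F1 = {f1:.3f}")
```

In this made-up example, plain accuracy is 99.6 percent because the many antigen-negative donors dominate the count, while the F1 score for the rare antigen is only 0.80. A handful of errors on a rare phenotype therefore lowers F1 far more than it lowers accuracy.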
Moslemi C, Saekmose S, Larsen R, et al. A deep learning approach to prediction of blood group antigens from genomic data. Transfusion. 2024;64:2179–2195.
Correspondence: Dr. Camous Moslemi at [email protected]