As genomic tests grow more complex, clinical labs are under pressure to expand their bioinformatics capabilities and pipelines while scaling their operations for higher testing volumes and faster turnaround times. The variant interpretation process alone requires a robust ability to query numerous disparate and dynamic databases, including the literature, clinical trials, drug labels, and prior cases. Advanced clinical genomics labs must also be able to establish and maintain their own proprietary knowledgebase of variants and annotations that accrue over time from their historical testing volume. This knowledge becomes critical for interpreting and reporting test results according to guidelines from professional societies such as the American College of Medical Genetics and Genomics (ACMG).
Lab informatics used to be merely a convenient way to track samples and manage results. In the genome era, however, this function is an essential part of expanding test menus, scaling throughput, and interpreting data. This article will explore the informatics challenges associated with genetic testing with an eye toward how they will be overcome in the future.
Challenges and opportunities
Gathering information. Analyzing results from gene panel, exome, or genome tests involves extensive searching and curation to collect as much clinically pertinent information as possible for each variant discovered. This is a time-consuming and tedious process, requiring clinical geneticists and variant scientists to query peer-reviewed literature sources, model downstream effects of variants on disease causation, and search public databases such as ClinVar.1 Key areas of inefficiency include the many separate searches that must be run against disparate sources, the often incomplete information those sources contain, and the difficulty of accessing content behind paywalls.2 Ultimately, this scattered information could be accessed through APIs so that one query could gather most or all relevant data, much as Google’s search engine made it possible to find answers from anywhere on the previously unruly internet. For now, however, clinical lab professionals must slog through with suboptimal tools.3
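To make the "one query" idea concrete, here is a minimal Python sketch of a single call fanning out to several sources and collecting whatever comes back. Everything in it is an assumption for illustration: the three source functions are stand-ins that return canned records rather than calling any real API, and the variant identifier is a made-up placeholder.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Source = Callable[[str], List[Record]]

# Hypothetical stand-ins for real sources. A production version would call
# each source's actual API here instead of returning canned records.
def query_public_database(variant: str) -> List[Record]:
    return [{"variant": variant, "note": "example public-database entry"}]

def query_literature(variant: str) -> List[Record]:
    return [{"variant": variant, "note": "example literature citation"}]

def query_paywalled_journal(variant: str) -> List[Record]:
    # Simulates content the lab cannot reach without a subscription.
    raise PermissionError("subscription required")

SOURCES: Dict[str, Source] = {
    "public_db": query_public_database,
    "literature": query_literature,
    "paywalled_journal": query_paywalled_journal,
}

def gather_evidence(
    variant: str, sources: Dict[str, Source]
) -> Tuple[Dict[str, List[Record]], Dict[str, str]]:
    """Query every source once; return (evidence, unavailable sources)."""
    evidence: Dict[str, List[Record]] = {}
    unavailable: Dict[str, str] = {}
    for name, source in sources.items():
        try:
            evidence[name] = source(variant)
        except Exception as exc:  # one failing source should not stop the rest
            unavailable[name] = str(exc)
    return evidence, unavailable

# "GENE1:c.100A>G" is a placeholder, not a real variant.
evidence, unavailable = gather_evidence("GENE1:c.100A>G", SOURCES)
```

Returning the unavailable sources alongside the evidence keeps gaps visible to the analyst, which matters when an inaccessible source might have changed the interpretation.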
Internal knowledgebase. A significant competitive advantage for any lab is its history of running a certain test and interpreting its results. Labs that routinely test for, say, Lynch syndrome have deep expertise that allows analysts to spot important variants quickly whenever this test is performed. Many labs have found that developing proprietary databases of their own variant interpretations can accelerate the analysis and reporting of future tests. These knowledgebases, as they’re known, provide a solid foundation for an individual lab that helps streamline interpretation, allowing geneticists to process more test results, and to do so more reliably, in a given period of time. However, for all the benefit they bring, these data repositories add another layer of querying to the interpretation process, and without deep bioinformatics expertise it can be very difficult for labs to integrate them into traditional workflows. The most successful labs will be those that not only build out an extensive knowledgebase but also incorporate it seamlessly into the pipeline for a truly streamlined analysis process.
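As a rough illustration of what incorporating a knowledgebase into the pipeline can mean in practice, the Python sketch below stores prior interpretations in a small SQLite table and checks it before any external search. The table layout, function names, and example values are all assumptions made for this sketch, not a description of any particular lab's system.

```python
import sqlite3
from typing import Dict, Optional

def open_knowledgebase(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical store of prior interpretations."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS interpretations (
               variant TEXT PRIMARY KEY,
               classification TEXT NOT NULL,
               reviewed_on TEXT NOT NULL
           )"""
    )
    return conn

def record_interpretation(
    conn: sqlite3.Connection, variant: str, classification: str, reviewed_on: str
) -> None:
    """Save the lab's call for a variant, replacing any earlier call."""
    conn.execute(
        "INSERT OR REPLACE INTO interpretations VALUES (?, ?, ?)",
        (variant, classification, reviewed_on),
    )
    conn.commit()

def lookup(conn: sqlite3.Connection, variant: str) -> Optional[Dict[str, str]]:
    """Return the lab's prior call for a variant, or None if it is new."""
    row = conn.execute(
        "SELECT classification, reviewed_on FROM interpretations WHERE variant = ?",
        (variant,),
    ).fetchone()
    if row is None:
        return None
    return {"classification": row[0], "reviewed_on": row[1]}

# Placeholder variant and date, for illustration only.
kb = open_knowledgebase()
record_interpretation(kb, "GENE1:c.100A>G", "Likely benign", "2016-01-15")
prior = lookup(kb, "GENE1:c.100A>G")   # found: reuse or re-review the prior call
novel = lookup(kb, "GENE1:c.200C>T")   # None: fall through to external sources
```

Storing the review date with each call lets the pipeline flag older interpretations for re-review as public evidence changes.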
External guidelines. Professional organizations such as ACMG have contributed significantly to the variant analysis field with guidelines about which genes should be tested, which variants to report, and more.4 In addition to being useful information, these guidelines help set standards so analysts across labs can feel confident in the results they return and physicians can be assured that lab reports meet consistent criteria. But the guidelines also introduce their own informatics challenge, requiring geneticists to calculate results from a range of unstructured, disparate data sources for rules that may not have an easy-to-follow formula. For optimal and reproducible results, lab professionals will need tools that can automatically compute results according to ACMG and other guidelines.
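The final combining step of such a tool might look something like the Python sketch below, which turns a list of already-assigned evidence codes into one of the five classes. It is a simplified reading of the combining rules in the 2015 ACMG/AMP guideline (reference 4), offered as an illustration rather than a validated implementation. The function name and the approach of counting codes by prefix are assumptions, and the harder task of assigning evidence codes from unstructured sources is not shown.

```python
import re
from collections import Counter
from typing import Iterable

# Evidence-strength prefixes used in the guideline's code names
# (e.g. PVS1, PS3, PM2, PP3 for pathogenic; BA1, BS1, BP4 for benign).
_CODE = re.compile(r"(PVS|PS|PM|PP|BA|BS|BP)\d+")

def classify(codes: Iterable[str]) -> str:
    """Combine assigned evidence codes into a five-tier classification."""
    n = Counter()
    for code in set(codes):  # count each criterion once
        match = _CODE.fullmatch(code)
        if match is None:
            raise ValueError(f"unrecognized evidence code: {code!r}")
        n[match.group(1)] += 1
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory evidence, or too little evidence, stays uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"
```

Encoding the rules once, in code, is what makes the result reproducible: two analysts who assign the same evidence codes get the same classification.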
Variants of unknown significance. After clinical laboratorians interpret variants with all these sources (peer-reviewed literature, public databases, an internal knowledgebase, and recommended guidelines), they often experience the frustration of having to accept that many variants will still resist classification into clinically useful categories. Variants of unknown significance are a fact of life for lab professionals and will remain so for the foreseeable future. They represent one of the toughest informatics challenges, because it may take sophisticated pattern-recognition tools or other advanced algorithms to finally pin down these variants, and these are the tools that are hardest for people without deep computational expertise to implement. It is hoped that the new spate of massive population studies around the world will eventually shed light on many of these ambiguous variants.
The interpretation bake-off
Recently, a study was conducted to determine the importance of using multiple sources of information for interpreting variants. Nearly 280 variants were randomly selected and then interpreted in two ways: once using only public databases, and again using the same databases plus a curated repository of peer-reviewed literature.5
Results showed that with the additional information sources, more variants were classified into clinically meaningful groups (pathogenic/likely pathogenic or benign/likely benign) according to ACMG guidelines. For the 180 variants analyzed that were associated with Lynch syndrome, the number of variants reported to have unknown significance dropped by 27 percent with the addition of primary literature in the interpretation process. Similarly, the analysis of 99 variants associated with heart diseases showed a 33 percent reduction in the number of variants that fell into the unknown significance category.
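The percentages above are relative reductions: the drop in the number of variants of unknown significance, divided by the number before the literature was added. The article does not give the underlying counts, so the numbers in this short Python sketch are hypothetical and serve only to show the arithmetic.

```python
def relative_reduction(before: int, after: int) -> float:
    """Percent drop in a count, relative to its starting value."""
    if before <= 0:
        raise ValueError("the starting count must be positive")
    return 100.0 * (before - after) / before

# Hypothetical counts, not taken from the study: if 100 variants were of
# unknown significance with public databases alone and 73 remained after
# adding curated literature, the relative reduction would be 27 percent.
example = relative_reduction(100, 73)
```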
The study demonstrated not only the need to use multiple sources for interpretation, but also the challenge of analyzing variants without peer-reviewed literature, which may not always be accessible to clinical labs due to subscription requirements. Extrapolating from this, it makes sense that variant interpretation will get even better with the integration of more data sources and advanced curation, but this will require an informatics-powered workflow that makes the process much faster and simpler than it is now.
Going forward
Finding ways to streamline variant interpretation through bioinformatics represents a major opportunity to improve the clinical decision-making process. As technologies get better and move us toward an era of automated interpretation for all but the most complex variants, clinical labs will be able to launch new tests more quickly, scale operations with increasing demand, and support more physicians. The most successful solutions will be those that supply the bioinformatics advances without requiring user expertise, enabling clinical lab professionals to focus their skill on testing and reporting to help deliver the best possible patient care.
REFERENCES
- Hadjisavis M, Felciano R. Soaring demand for genetic testing highlights need for streamlined data interpretation. MLO. 2016;48(2):40-41.
- GenomeWeb. In tackling the VUS challenge, are public databases the solution or a liability for labs? 2014. https://www.genomeweb.com/clinical-genomics/tackling-vus-challenge-are-public-databases-solution-or-liability-labs.
- Huser V, Sincan M, Cimino JJ. Developing genomic knowledge bases and databases to support clinical management: current perspectives. Pharmgenomics Pers Med. 2014;7:275–283.
- Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine. Advance online publication. https://www.acmg.net/docs/standards_guidelines_for_the_interpretation_of_sequence_variants.pdf.
- Bioinformatics for Clinical Oncology Testing. QIAGEN. http://pages.ingenuity.com/rs/202-RSH-885/images/Clinical_Oncology_Testing%20Brochure.pdf.
Sohela Shah, PhD, serves as the principal scientist in the clinical program at QIAGEN Bioinformatics, where she works with clinical laboratories to implement the QIAGEN Advanced Testing Solution for Clinical Genome and Exome Sequencing and QIAGEN Clinical Insight automated solution for variant interpretation. She has an extensive background in genetics, genomics, and next-generation sequencing.