Tackling the Bottleneck of Variant Interpretation
11 October 2019

The first high-quality sequence of the entire human genome was completed in 2003 by the Human Genome Project, which lasted 13 years and cost more than $2 billion. Today, thanks to Next Generation Sequencing (NGS), the state-of-the-art sequencing technology, sequencing an entire genome takes less than 24 hours and costs approximately $1000. As a consequence, NGS is increasingly used in clinical settings: it is now possible to analyze many more genes within the same assay, and as a result large gene panel tests (>100 genes) and whole exome sequencing are routinely available in clinical diagnostic laboratories.

The complexity of genetic testing no longer lies in the sequencing process; it has shifted towards the processing and interpretation of the data generated. Currently, the key challenges are comprehensive variant discovery and accurate variant interpretation [1].

Reasonably comprehensive variant discovery would be enabled by the development of long-molecule sequencing technologies, high-quality haplotype catalogs and pan-genome analysis methods, while the correct evaluation of genomic variants would require automated and streamlined bioinformatics solutions as well as the adoption of standards and guidelines for variant interpretation.

The interpretation of the genomic variants that characterize a patient is an intricate process, usually conducted by clinical geneticists with considerable skill and expertise in their field, who weigh the evidence and make a diagnosis accordingly.

Such a process requires the integration and evaluation of multilevel data derived from a diverse set of omics resources in order to identify the variant(s) likely to cause a disease. The genomic variants of an individual are first enriched with information about their predicted effect on protein structure or function and their evolutionary conservation (using in silico tools such as SIFT, PolyPhen-2, PhyloP or SiPhy), their frequency across populations of healthy individuals (e.g., gnomAD, ESP, ExAC, dbSNP, 1000 Genomes Project) and their presence in archives that report relationships between human variation and phenotypes (e.g., ClinVar). This first step, known as annotation, can have a strong influence on the final interpretation, because deficient or inappropriate annotation can cause essential findings to be overlooked.
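
As a purely illustrative sketch of the annotation step (it is not a description of how any particular tool works), the snippet below merges three kinds of evidence into one record per variant and keeps track of which annotations could not be found. The in-memory dictionaries stand in for real resources such as a population frequency database, a clinical archive and an in silico predictor; the variant key, the values and all names (`AnnotatedVariant`, `annotate`) are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnnotatedVariant:
    """One variant plus the annotations gathered for it (hypothetical schema)."""
    key: str
    population_frequency: Optional[float] = None
    clinical_significance: Optional[str] = None
    in_silico_score: Optional[float] = None
    missing: List[str] = field(default_factory=list)


def annotate(
    key: str,
    frequencies: Dict[str, float],
    clinical_assertions: Dict[str, str],
    in_silico_scores: Dict[str, float],
) -> AnnotatedVariant:
    """Look the variant up in each source and record what was not found."""
    variant = AnnotatedVariant(key=key)
    lookups = {
        "population_frequency": frequencies,
        "clinical_significance": clinical_assertions,
        "in_silico_score": in_silico_scores,
    }
    for field_name, table in lookups.items():
        if key in table:
            setattr(variant, field_name, table[key])
        else:
            # An absent annotation is recorded explicitly rather than ignored,
            # so a reviewer can see that evidence is incomplete.
            variant.missing.append(field_name)
    return variant


# Placeholder data, invented for illustration only.
frequencies = {"chr1:12345:A>G": 0.0002}
clinical_assertions = {"chr1:12345:A>G": "likely pathogenic"}
in_silico_scores = {}  # no prediction available for this variant

annotated = annotate(
    "chr1:12345:A>G", frequencies, clinical_assertions, in_silico_scores
)
print(annotated.missing)
```

Recording the missing fields explicitly reflects the point made above: an incomplete annotation should be visible to the reviewer rather than silently treated as an absence of evidence.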

Downstream of the annotation process, clinical geneticists weigh and interpret the evidence and report the final results in terms of a genetic diagnosis. The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) responded to the critical demand for an updated and structured variant interpretation framework by publishing a milestone guidance document [2], which has since been applied in many laboratories all over the world.

The aim of ACMG and AMP was to create universally applicable recommendations for the classification of variants related to Mendelian diseases and identified in a wide range of genetic tests performed in clinical laboratories (genotyping, single genes, panels, exomes and genomes). The guidelines propose a five-class classification system (“pathogenic”, “likely pathogenic”, “uncertain significance”, “likely benign”, and “benign”) based on 28 rules that evaluate different types of evidence at the variant level (functional, segregation, population, in silico, etc.). They represent a significant achievement in the effort to improve the consistency of variant interpretation among laboratories.
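
To make the idea of rule-based classification concrete, here is a minimal sketch of how triggered criteria could be combined into one of the five classes. It is not eVai's implementation. The thresholds paraphrase the combining table of the ACMG/AMP guidance (Richards et al., 2015) as we read it and should be checked against the published document before any real use; the function names and the convention of inferring evidence strength from the criterion prefix (PVS, PS, PM, PP, BA, BS, BP) are assumptions made for this example.

```python
from collections import Counter
from typing import Iterable

KNOWN_PREFIXES = {"PVS", "PS", "PM", "PP", "BA", "BS", "BP"}


def count_evidence(codes: Iterable[str]) -> Counter:
    """Count triggered criteria per strength level, e.g. 'PM2' -> 'PM'."""
    counts = Counter()
    for code in codes:
        prefix = code.rstrip("0123456789")
        if prefix not in KNOWN_PREFIXES:
            raise ValueError(f"unknown criterion code: {code}")
        counts[prefix] += 1
    return counts


def classify(codes: Iterable[str]) -> str:
    """Combine criteria into a five-class label (simplified paraphrase)."""
    c = count_evidence(codes)
    pvs, ps, pm, pp = c["PVS"], c["PS"], c["PM"], c["PP"]
    ba, bs, bp = c["BA"], c["BS"], c["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    pathogenic_side = (
        "pathogenic" if pathogenic
        else "likely pathogenic" if likely_pathogenic
        else None
    )
    benign_side = (
        "benign" if benign
        else "likely benign" if likely_benign
        else None
    )

    # Contradictory or insufficient evidence both fall back to "uncertain".
    if pathogenic_side and benign_side:
        return "uncertain significance"
    return pathogenic_side or benign_side or "uncertain significance"


print(classify(["PVS1", "PM2"]))
```

Note that a variant with evidence on both sides is deliberately returned as “uncertain significance”, which is exactly the kind of case where expert review of the underlying evidence remains necessary.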

In many laboratories, however, variant annotation, curation and interpretation are still largely manual, and geneticists usually have to juggle several computer monitors showing spreadsheets, omics databases, genome browsers and the ACMG/AMP guidelines document. It is a difficult, laborious, time-consuming and ultimately unscalable process. Furthermore, the analysis can take several hours for variants of unknown significance with conflicting evidence in the literature. Unless this process is automated and streamlined, it will likely become a bottleneck as NGS becomes more mainstream.

To tackle the challenges of variant interpretation we have built eVai (the expert Variant interpreter), a bioinformatics tool that brings all of the above together and makes the complicated and meticulous process of interpretation easier, much faster and more accurate. eVai significantly facilitates the variant annotation process by incorporating more than 20 omics resources, and it reduces the time needed for the final interpretation by accurately prioritizing the genomic variants of a patient and suggesting possible related genetic diagnoses.

Our system lets the user focus on a small but exhaustive set of variants of interest and, thanks to an appropriate use of artificial intelligence (AI), automatically classifies each variant according to the ACMG/AMP standards. Moreover, it enables geneticists to revise the assumptions made by the system by adding external evidence.

We strongly believe that a well-integrated combination of omics data sources, standardized variant classification and AI can speed the integration of genomic sequencing into clinical care and accelerate the pace of research and scientific discovery in the field of clinical genetics.

1. Lappalainen, T., Scott, A. J., Brandt, M. & Hall, I. M. Genomic Analysis in the Age of Human Genome Sequencing. Cell 177, 70–84 (2019).

2. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–24 (2015).