Searching for Structural Variants of Interest in the 1000 Genomes Project

Gus Gustafson from the University of Washington spoke at the Nanopore Community Meeting in Houston about “Understanding normal patterns of human structural variation with nanopore sequencing.” Gustafson is a second-year graduate student. They opened with background statistics: more than 50% of suspected Mendelian conditions remain undiagnosed after clinical testing, and up to two-thirds of structural variants (SVs) are not detected with short-read approaches. Long-read sequencing (LRS) can span larger, often more complex regions of the genome. Gustafson’s project aims to use LRS to establish a catalog of common SVs in a healthy, diverse population from the 1000 Genomes Project. They have already analyzed 100 genomes, with overall statistics of >25x coverage and read N50s >40 kb! One challenge Gustafson has encountered is merging SVs across samples, that is, deciding when calls from different genomes represent the same variant. They shared a graph depicting the cumulative increase in SVs detected as samples are added. Gustafson created a Shiny app to browse relative SV allele frequencies in the first 100 samples. Analyses with the browser could include studying repeats and filtering for rare, potentially disease-causing SVs. I love how Gustafson used Big Bird, Grover, and the Count graphics to depict the number of SVs at each filtering stage. Gustafson concluded that they could use LRS samples from the 1000 Genomes Project to search for and catalog disease variants. I appreciate how the project uses publicly available genomes and creates open-access tools!
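
To make the merging challenge more concrete, here is a minimal Python sketch of one way SV calls from several samples might be collapsed into shared events and then filtered by how often they appear. This is not Gustafson's pipeline: the `SV` record, the `max_dist` and `min_len_ratio` thresholds, the greedy clustering, and the toy calls are all my own assumptions for illustration. It also reports a carrier frequency (the fraction of samples with the SV) rather than a true allele frequency, which would need genotypes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical, simplified SV call (not a real VCF record)."""
    sample: str
    chrom: str
    pos: int
    svtype: str  # e.g. "DEL" or "INS"
    length: int  # assumed to be a positive number of base pairs


def same_event(a, b, max_dist=500, min_len_ratio=0.8):
    """Treat two calls as one event if type, position, and size agree.

    The thresholds are illustrative assumptions, not published values.
    """
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    return min(a.length, b.length) / max(a.length, b.length) >= min_len_ratio


def merge_calls(calls, **thresholds):
    """Greedily cluster calls, comparing each to a cluster's first member."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.pos)):
        for cluster in clusters:
            if same_event(cluster[0], call, **thresholds):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return clusters


def carrier_frequency(cluster, n_samples):
    """Fraction of samples carrying the event (not an allele frequency)."""
    return len({c.sample for c in cluster}) / n_samples


# Toy calls from three made-up samples.
calls = [
    SV("S1", "chr1", 10_000, "DEL", 1_200),
    SV("S2", "chr1", 10_120, "DEL", 1_150),
    SV("S3", "chr1", 10_050, "INS", 300),
    SV("S1", "chr2", 55_000, "DEL", 5_000),
]
clusters = merge_calls(calls)

# Keep events seen in fewer than half of the samples as "rare" candidates.
rare = [c for c in clusters if carrier_frequency(c, n_samples=3) < 0.5]

for cluster in clusters:
    first = cluster[0]
    freq = carrier_frequency(cluster, n_samples=3)
    print(first.chrom, first.pos, first.svtype, f"{freq:.2f}")
```

Even in this toy version, the result depends on the distance and size thresholds, which hints at why merging SVs across 100 genomes is harder than it sounds.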

How can long-read sequencing data from the 1000 Genomes Project help identify structural variants and patterns of human structural variation? Photo by Digital Buggu on Pexels.com