Phenotype Insight through 1000 Genomes

After an endoscopy and a very drowsy day, I watched the Knowledge Exchange session entitled “Nanopore from 100 to 1000 genomes: towards a better understanding of phenotypes.” Fritz Sedlazeck and Philipp Rescheneder were the speakers. Sedlazeck has presented with Oxford Nanopore Technologies (ONT) before and is an expert in structural variation analyses. This session was recorded July 18, 2019. I appreciate how they started with the Central Dogma and included how, in the early 2000s, the dogma was that SNPs account for most human variation. Sedlazeck spoke about the evolution of sequencing and microarray technologies for learning about genetic variation.
They described single nucleotide variants (SNVs) as 1-20 bp changes in the genome, structural variants (SVs) as 50 bp+ changes in the genome, phasing information as variants on the same molecule, and methylation information as changes in methylation status. Clair is a neural network program used to analyze Oxford Nanopore reads and improve SNV calling performance.
Sedlazeck explained that SVs are genomic differences larger than 50 bp, including rearrangements, gains, and losses of nucleotides. They explained that we typically have ~20,000 structural variants per individual, accounting for the largest number of modified base pairs. SVs have impacts on evolution, genomic disorders, regulation, and phenotypes. Rescheneder and Sedlazeck developed NGMLR and Sniffles. They were able to compare Illumina and long-read sequencing technologies (PacBio and Nanopore) to discover translocations and repeat expansions in the NA12878 reference genome. Importantly, long reads have helped identify more complex SV types.
Sedlazeck then spoke about phasing: determining whether genomic variations are on the same or different DNA molecules. Phasing can be used to study the impact of genes involved in the metabolism of therapeutics, for example.
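To make the size categories from the session concrete for myself, here is a minimal Python sketch that sorts a variant into a size class from its reference and alternate alleles. The function name `classify_variant`, the VCF-style allele strings, and the "unclassified" label for the 21-49 bp gap are my own assumptions for illustration; the thresholds (1-20 bp and 50 bp+) are the ones mentioned in the talk. This is not how Clair or Sniffles call variants.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant by the number of base pairs it changes.

    Thresholds follow the session summary: 1-20 bp changes are treated
    as small variants and changes of 50 bp or more as structural variants.
    Sizes in between (21-49 bp) were not assigned a category in the talk,
    so they are labeled "unclassified" here (an assumption of this sketch).
    """
    if len(ref) == len(alt):
        # Substitution: the change spans the whole allele.
        size = len(ref)
    else:
        # Insertion or deletion: size is the length difference.
        size = abs(len(alt) - len(ref))

    if size <= 20:
        return "small variant"
    if size >= 50:
        return "structural variant"
    return "unclassified"


if __name__ == "__main__":
    print(classify_variant("A", "G"))             # single-base substitution
    print(classify_variant("A", "A" + "T" * 60))  # 60 bp insertion
    print(classify_variant("A" * 31, "A"))        # 30 bp deletion
```

A real pipeline would read these alleles from a VCF file and would also need to handle symbolic SV records, which this sketch ignores.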
Methylation can be studied using Nanopolish and fast5 files to examine the methylation states of maternal and paternal samples. Sedlazeck and collaborators are analyzing thousands of samples to obtain the statistical power to identify SVs of interest. Medhat Mahmoud is developing the Princess tool to detect and verify all forms of variants. Sedlazeck and team have sequenced 100 tomato genomes in 100 days using the PromethION and MinION. They have identified insertions that would have been missed with short reads. At the time of recording, they were developing the Paragraph tool to analyze this information. The end of the session was devoted to answering several questions. This was a fun recording to watch!

How can 1000s of genomes and long-read sequencing be helpful in learning about variation? Photo by Wendy Wei on Pexels.com