EPI2ME Workflows for ONT Data Analyses

Dilrini De Silva is a Field Applications Scientist in Bioinformatics for the EMEAI Region with Oxford Nanopore Technologies. De Silva presented a Masterclass at London Calling 2023 entitled “How to take your data analysis further.” This Masterclass focused on typical downstream analysis for several applications, including genome assembly, variant calling, methylation, transcriptomics, and metagenome analysis. De Silva described a bioinformatics workflow as a way of transforming data through file formats: raw reads go through quality control assessment, reads are filtered and aligned to produce a BAM file, sequences are detected, and a result file is produced. However, this process is often not unidirectional, as improvements can be made throughout the data analysis. De Silva described two tools, EPI2ME and EPI2ME Labs. EPI2ME Labs runs locally and can be used to download and run code from GitHub. De Silva noted that there are production and research release software options on GitHub, and that the research releases are often undergoing changes.
Sequence assembly has the goal of producing a FASTA-formatted consensus sequence. De Silva explained how de novo assembly benefits from long reads. Depth of coverage and polishing steps are important considerations. ONT recommends polishing with the Medaka tool when Flye is used for assembly, though De Silva mentioned that this process may change. For microbial genomes, ONT recommends taking FASTQ data, assembling with Flye, and then polishing with Medaka. This workflow is available in EPI2ME Labs. For metagenomic assembly, the process is similar, except that the assembler used is metaFlye, followed by polishing with Medaka to produce FASTA output files. De Silva explained that metaFlye is an option within the Flye package. For plasmid assembly and analysis, EPI2ME has a Clone Validation workflow that generates a report, consensus FASTA, and annotated sequences.
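As a rough illustration of the depth-of-coverage consideration mentioned for assembly, here is a minimal Python sketch that estimates mean depth and read N50 from read lengths. It is not part of any EPI2ME workflow; the function names are hypothetical, the genome size is something you would supply yourself, and the FASTQ helper assumes plain four-line records.

```python
def fastq_read_lengths(lines):
    """Yield read lengths from FASTQ lines, assuming four lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            yield len(line.strip())


def read_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no reads supplied


def mean_depth(lengths, genome_size):
    """Return total sequenced bases divided by the expected genome size."""
    return sum(lengths) / genome_size


# Toy FASTQ content, invented for illustration only.
toy_fastq = ["@r1", "ACGT", "+", "!!!!", "@r2", "ACGTAC", "+", "!!!!!!"]
toy_lengths = list(fastq_read_lengths(toy_fastq))

# Hypothetical read lengths and genome size, chosen so the arithmetic is easy to follow.
example_lengths = [1000, 2000, 3000, 4000]
example_n50 = read_n50(example_lengths)
example_depth = mean_depth(example_lengths, genome_size=5000)
```

With the hypothetical lengths above, the two largest reads already hold more than half of the 10,000 bases, so the N50 is 3000, and the mean depth over a 5,000-base genome is 2x.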
The Clone Validation workflow has the option for multiplexing/barcode analysis. For assemblies of large genomes, the tools used are Flye and Medaka. An example provided was the sequencing of a human genome with the Ligation Sequencing Kit v14 and the advantages of ultra-long DNA sequencing and ultra-long assemblies with Pore-C data.
De Silva listed options for variant calling. Workflows for variant analysis start by taking raw reads, optionally filtering them, and aligning to a reference to produce a BAM file. The standard output format is the Variant Call Format (VCF), with data arranged in columns, including chromosome and position. For calling and phasing SNVs, ONT recommends aligning FASTQ reads with Minimap2. Single nucleotide variants can then be called with Clair3, and phasing is done with WhatsHap. The output is a VCF file. For the analysis of structural variants, ONT recommends Sniffles2. This tool uses information about coverage. An SNV workflow is available in EPI2ME Labs.
For methylation detection, MinKNOW is used with Guppy/Dorado with Remora and Minimap2. The output BAM file can be analyzed with modkit. A comprehensive human variation workflow is available on EPI2ME: wf-human-variation. This workflow can also identify short tandem repeats. De Silva emphasized that variant detection workflows can be built for other organisms to automate the process of data analysis and documentation.
For transcriptomic analysis, EPI2ME has a workflow for cDNA or native RNA. The output is a comprehensive report. A reference annotation in GFF/GFF2 format is needed. A single-cell analysis workflow for 10x Genomics kits is also available on EPI2ME.
De Silva concluded by sharing resources from the Nanopore Community website. To summarize, De Silva emphasized “starting with the end” and performing QC to ensure appropriate read length and depth of coverage. The descriptions and comparisons of workflows were useful, as I have only used a couple!
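To make the column layout of a VCF file more concrete, here is a minimal Python sketch that reads the eight fixed VCF columns from tab-separated data lines and flags simple SNVs. It is an illustration only, not how Clair3 or any EPI2ME workflow processes its output; the function names and the example records are made up, and a dedicated VCF library would be a better choice for real data.

```python
# The eight fixed columns defined by the VCF specification.
VCF_FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Turn one tab-separated VCF data line into a dict of its fixed columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_FIXED_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])        # 1-based position on the chromosome
    record["ALT"] = record["ALT"].split(",")  # several ALT alleles are comma-separated
    return record


def iter_vcf_records(lines):
    """Yield parsed records, skipping header lines (starting with '#') and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


def is_snv(record):
    """Return True when REF and every ALT allele are single bases."""
    return len(record["REF"]) == 1 and all(
        len(alt) == 1 and alt in "ACGTN" for alt in record["ALT"]
    )


# Invented example lines: one substitution and one single-base deletion.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\t.\tA\tG\t30.5\tPASS\tDP=42",
    "chr1\t20000\t.\tAT\tA\t18.2\tPASS\tDP=35",
]
records = list(iter_vcf_records(example_vcf))
snv_flags = [is_snv(r) for r in records]
```

In this toy input, the first record is an A-to-G substitution and counts as an SNV, while the second has a two-base REF allele and does not.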

Which data analysis workflows are available from EPI2ME? Photo by Karolina Grabowska on Pexels.com