Nanopore Sequence Analysis and Methylation Sequencing

Bryant Catano, a Product Support Scientist with Oxford Nanopore Technologies, led a Masterclass at London Calling 2023 entitled “How to get started with data analysis.” I have watched several of the other courses. This session focused on basecalling, file formats, MinKNOW, and EPI2ME, and Catano also discussed methylation tools. Catano started by emphasizing that there are several user-friendly tools available to make it “easy for anyone to analyze data anywhere.”

As DNA or RNA passes through the pore, it disrupts the current, and that disruption is recorded as an electrical signal. Basecalling uses sophisticated models to interpret the raw signal, stored in POD5 files, as sequences that are saved as FASTQ files. Simplex reads from one strand of dsDNA produce Q20+ qualities, and duplex reads produce Q30+. POD5 is the new format for raw data and can be used directly in some analyses, such as some methylation workflows. It was developed to increase read/write performance compared to the legacy FAST5 format. FASTQ is the “de facto format for most sequence data types.”

A typical bioinformatics workflow, noted Catano, starts with raw data and ends with an answer. Along the way, it often requires QC assessment, read filtering, alignment, detection, generation of result files, further filtering, and analysis of the results. Catano described the different options for nanopore data analysis, including MinKNOW, EPI2ME, and EPI2ME Labs. MinKNOW can basecall reads live with one of three basecalling models: fast, high accuracy (HAC), and super accuracy (SUP). As part of live basecalling in MinKNOW, base modification models can be used, with Remora running in parallel. Catano mentioned that MinKNOW can be monitored remotely with the mobile app… if the devices are on the same network. This may explain why it hasn’t worked for me! EPI2ME can be used in three ways: in the cloud, locally, or with tools from GitHub.
EPI2ME can obtain reads as they are produced, upload them to the cloud, and analyze them. The EPI2ME Agent monitors a folder for new reads and uploads them to the cloud for analysis using one of several available workflows.

Catano then explained how methylation can be detected by PCR-free nanopore sequencing. In MinKNOW, 5mC and 5hmC can be detected. The underlying tool, Remora, can also be run manually. IGV and JBrowse can be used to visualize BAM files and methylation differences. Haplotypes, the chromosomal portions inherited from each parent, can be identified and separated with Nanopore technologies.

Catano noted that regions of interest can be sequenced with adaptive sampling: a BED file listing the targets to enrich for can be loaded into MinKNOW. Catano shared several links to additional resources. Throughout the session, Catano emphasized that there are user-friendly resources for analyzing data from Nanopore devices.
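
To make the Q20+ and Q30+ figures mentioned for simplex and duplex reads concrete, here is a minimal Python sketch of how a Phred quality score maps to an error probability, and how FASTQ reads might be filtered by mean quality. It is not taken from the session. It assumes simple four-line FASTQ records with the standard Phred+33 encoding, and it averages error probabilities before converting back to a Q score, which is one common convention. The function names and example reads are made up for illustration.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability (Q20 -> 0.01)."""
    return 10 ** (-q / 10)


def mean_read_quality(quality_string, offset=33):
    """Mean Q score of a read, computed by averaging per-base error probabilities.

    Assumes Phred+33 encoded quality characters.
    """
    probs = [phred_to_error_prob(ord(ch) - offset) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def parse_fastq(text):
    """Yield (read_id, sequence, quality) from simple four-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near line {i + 1}")
        yield header[1:], seq, qual


def filter_by_quality(text, min_q):
    """Return the IDs of reads whose mean quality is at least min_q."""
    return [rid for rid, _, qual in parse_fastq(text)
            if mean_read_quality(qual) >= min_q]


# Hypothetical example: '?' encodes Q30 and '+' encodes Q10 in Phred+33.
EXAMPLE_FASTQ = """@read_high
ACGT
+
????
@read_low
ACGT
+
++++
"""

if __name__ == "__main__":
    print(filter_by_quality(EXAMPLE_FASTQ, min_q=20))
```

In other words, Q20 corresponds to roughly one error in 100 bases and Q30 to roughly one in 1,000.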

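The BED file of adaptive sampling targets can be pictured with a short Python sketch. BED is a tab-separated format whose first three columns are chromosome, start, and end, with 0-based, half-open coordinates. The session summary does not say which optional columns MinKNOW expects, so this sketch sticks to the three standard columns plus an optional name. The helper names, target names, and coordinates are all hypothetical.

```python
def region_to_bed_line(chrom, start, end, name=None):
    """Format a 1-based, inclusive region as a BED line (0-based, half-open)."""
    if start < 1 or end < start:
        raise ValueError("Expected 1 <= start <= end")
    fields = [chrom, str(start - 1), str(end)]
    if name:
        fields.append(name)
    return "\t".join(fields)


def parse_bed_line(line):
    """Parse the first three BED columns into (chrom, start, end)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("A BED line needs at least three tab-separated columns")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end <= start:
        raise ValueError("Expected 0 <= start < end")
    return chrom, start, end


def total_target_bases(bed_lines):
    """Sum the lengths of all target regions (end - start in BED coordinates)."""
    return sum(end - start for _, start, end in map(parse_bed_line, bed_lines))


# Hypothetical targets given as 1-based, inclusive coordinates.
TARGETS = [
    ("chr1", 1001, 2000, "targetA"),
    ("chr2", 5001, 5500, "targetB"),
]

if __name__ == "__main__":
    bed_lines = [region_to_bed_line(*target) for target in TARGETS]
    print("\n".join(bed_lines))
    print("Total target bases:", total_target_bases(bed_lines))
```

Because BED coordinates are half-open, the length of each region is simply end minus start, which makes it easy to check how much of the genome a target list covers.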