Update on Modified Base Detection and Methylation in Nanopore

Marcus Stoiber, Principal Algorithms Researcher with Oxford Nanopore Technologies, presented an update on modified bases at the Nanopore Community Meeting in Houston. Stoiber began by describing how modified bases are detected from nanopore sequencing data. Remora detects modified bases on top of base calling. The name is an analogy to the remora, a fish that latches on to bigger fish and helps them. The Remora algorithm uses neural networks to identify modified bases. Accuracy is reported per read and per site, and is measured using synthetic molecules.

The production models available include all-context 5mC + 5hmC. Stoiber noted that recent gains have improved per-read accuracy for modified bases. The all-context 6mA model is another production model with high accuracy. Stoiber noted that they also have research models for 5mC + 4mC (for bacterial work!) and that they are working on models for modified bases that result from DNA damage. Models for modified bases in RNA are also being developed.

Modkit is a postprocessing tool that can be used with EPI2ME. The tool can produce a pileup that identifies modified bases. Differential methylation tools are also in development. The upcoming Modkit release will be able to create pileups that identify hemi-methylation sites. Remora 3.0 upgrades will include training set support. I am curious about the 5mC + 4mC detection and may try it with some of the work we will do this semester.
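
To make the pileup idea concrete for myself, here is a minimal Python sketch of how per-read, per-site modification calls could be summarized into per-site fractions, and how hemi-methylated CpG sites (one strand methylated, the other not) might then be flagged. This is not Modkit's implementation or API: the reads, positions, probabilities, function names, and thresholds below are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-read calls: (read_id, position, strand, probability of 5mC).
# In a CpG, the + strand C sits at `pos` and the - strand C at `pos + 1`.
calls = [
    ("read1", 100, "+", 0.97),
    ("read2", 100, "+", 0.91),
    ("read3", 100, "+", 0.08),
    ("read4", 101, "-", 0.05),
    ("read5", 101, "-", 0.10),
    ("read1", 250, "+", 0.95),
    ("read6", 251, "-", 0.93),
]


def pileup(calls, threshold=0.5):
    """Return {(position, strand): fraction of reads called modified}."""
    counts = defaultdict(lambda: [0, 0])  # [modified reads, total reads]
    for _read_id, pos, strand, prob in calls:
        entry = counts[(pos, strand)]
        if prob >= threshold:  # arbitrary cutoff, assumed for this sketch
            entry[0] += 1
        entry[1] += 1
    return {site: mod / total for site, (mod, total) in counts.items()}


def hemi_methylated(fractions, high=0.6, low=0.2):
    """Return + strand CpG positions where only one strand looks methylated."""
    hits = []
    for (pos, strand), frac in fractions.items():
        if strand != "+":
            continue
        other = fractions.get((pos + 1, "-"))
        if other is None:
            continue  # no coverage on the opposite strand
        if (frac >= high and other <= low) or (frac <= low and other >= high):
            hits.append(pos)
    return sorted(hits)


fractions = pileup(calls)
print(fractions)
print(hemi_methylated(fractions))
```

With this toy data, the CpG at position 100 is mostly methylated on the + strand and unmethylated on the - strand, so it is the one flagged. The CpG at 250 is methylated on both strands and is not.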

How can we train software to detect modified bases? Photo by cottonbro studio on Pexels.com