Mile Sikic from the Genome Institute of Singapore spoke at London Calling 2024. The session's title was "HERRO: haplotype-aware error correction of ultra-long nanopore reads." Sikic noted that HERRO can be used for any type of nanopore read, not only ultra-long ones. Sikic and colleagues developed Racon seven years ago. Now that phased genomes are available, the challenge is to preserve haplotype information during correction. The standard approach to error correction is to map reads and then correct positions window by window. The HERRO framework starts with read preprocessing (trimming, filtering, and adapter removal). It then performs all-vs-all alignment with minimap2. Finally, an AI model recognizes the important, informative positions so that they are not over-corrected. Reads must be longer than 10 kb, noted Sikic. For each position in a read, HERRO uses the alignments to find informative positions, for example those where two different variants are supported. For all other positions, the majority rule is used. The model was trained on five chromosomes. Sikic shared results before and after error correction. HERRO can be used for assembly along with Verkko because Verkko resolved the X and Y chromosomes. Sikic tested the performance of HERRO on well-characterized human, D. rerio, and Arabidopsis genomes. Sikic noted that some of the corrections are in challenging homopolymer regions. Sikic shared that HERRO is described in a preprint and emphasized the importance of reference genomes and their impact on HERRO's performance. I admire the designers of these complex and powerful tools!
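
To make the split between informative positions and majority-rule positions concrete, here is a minimal sketch in Python. It is not HERRO's implementation: the function names, the `min_support` threshold, and the `.` no-coverage marker are all assumptions made for illustration, and the AI model is replaced by a placeholder that simply keeps the target read's own base at informative positions.

```python
from collections import Counter

NO_COVERAGE = "."  # assumed marker: this overlapping read does not cover the column


def column_counts(target_base, overlap_bases):
    """Count bases in one alignment column, listing the target read's base first."""
    bases = [target_base] + [b for b in overlap_bases if b != NO_COVERAGE]
    return Counter(bases)


def is_informative(counts, min_support=2):
    """Assumed rule: a column is informative if at least two different
    bases are each supported by min_support or more reads."""
    return sum(1 for n in counts.values() if n >= min_support) >= 2


def correct_window(target, overlaps, min_support=2):
    """Correct one window of a target read against its aligned overlaps.

    target:   bases of the read being corrected
    overlaps: equal-length strings, one per overlapping read, already aligned
    Returns the corrected window and the indices of informative positions.
    """
    corrected = []
    informative = []
    for i, target_base in enumerate(target):
        counts = column_counts(target_base, [row[i] for row in overlaps])
        if is_informative(counts, min_support):
            # Placeholder for the model's prediction: keep the target's base
            # so a possible haplotype difference is not voted away.
            informative.append(i)
            corrected.append(target_base)
        else:
            # Majority rule; on a tie the target's base wins because
            # it was counted first.
            corrected.append(counts.most_common(1)[0][0])
    return "".join(corrected), informative


if __name__ == "__main__":
    target = "ACGTAG"  # toy read with a sequencing error at the last base
    overlaps = ["ACGTAC", "ACCTAC", "ACCTAC", "AAGTAC", "ACGTTC"]
    print(correct_window(target, overlaps))
```

In this toy window, the third column has two well-supported bases (G and C), so it is treated as informative and left alone, while the isolated G at the end is replaced by the majority base. A plain majority vote over every column would risk erasing the minority haplotype at the informative position, which is the over-correction the talk described.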
