The last session of the Oxford Nanopore Technologies Nanopore Learning Metagenomics course covers additional metagenomic concepts. Tim Walker spoke about metagenomic assembly: “the computational process which seeks to reconstruct the microbial genomes within a metagenomic mixture.” The assembled sequences could be genomes or plasmids. Metagenome-assembled genomes (MAGs) can be used for several analyses. Coverage depth refers to the number of reads that map to a specific position of the template, Walker explained. More reads mapping to a reference increase the overlaps and help improve assembly confidence. Single bacterial genomes should have a coverage depth of 30X to 100X for analyses. For metagenomes, the expected size is hard to predict, so the required sequencing depth has to be estimated. One approach is to review the literature for what is known about similar samples. If in doubt, Walker noted, sequence as deep as your budget will allow. Binning sorts metagenomic reads by certain characteristics. The goal is to classify reads for easier and more accurate analysis. This approach can be based on taxonomy, differential coverage, or genomic features. Post-assembly binning seeks to identify contigs and assign them to an individual genome. Both reference-based and reference-free tools exist. MEGAN is a reference-based system that can align sequences against protein databases. MetaBAT2 calculates tetranucleotide frequencies and groups related contigs. Other tools can detect distinctive methylation patterns. De-replication is a downstream analysis that identifies MAGs of the same reference from a collection of replicate datasets and selects the highest-quality representative. It can be used to find the most accurate representative when comparing taxa abundance across different variables. dRep is an example tool that Walker reviewed. This session explained some tools I use, along with important concepts that will help me explain their purpose to others.
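
To make the coverage depth idea concrete for myself, here is a minimal Python sketch of the arithmetic. It was not shown in the course: the function names, the 5 Mb genome size, the 1% relative abundance, and the simplifying assumption that sequenced bases are shared out in proportion to abundance are all my own illustrative choices.

```python
def depth_at(position, alignments):
    """Count the reads covering one position.

    `alignments` is a list of (start, end) tuples, half-open,
    in reference coordinates (a hypothetical input format).
    """
    return sum(start <= position < end for start, end in alignments)


def mean_coverage(total_bases, genome_size):
    """Average depth: sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def bases_needed(target_depth, genome_size, abundance=1.0):
    """Total bases to sequence so one organism reaches `target_depth`.

    `abundance` is the fraction of sequenced bases assumed to come
    from that organism (1.0 for an isolate, lower in a metagenome).
    """
    if not 0 < abundance <= 1:
        raise ValueError("abundance must be in (0, 1]")
    return target_depth * genome_size / abundance


if __name__ == "__main__":
    genome_size = 5_000_000  # assumed 5 Mb bacterial genome

    # Isolate sequenced to the lower bound of 30X.
    isolate = bases_needed(30, genome_size)
    # Same genome when it makes up only 1% of the metagenome.
    rare = bases_needed(30, genome_size, abundance=0.01)

    print(f"isolate: {isolate / 1e9:.2f} Gb")
    print(f"1% member of a metagenome: {rare / 1e9:.2f} Gb")
```

Under these assumptions, the output needed grows in inverse proportion to abundance, which fits Walker's advice to sequence as deep as the budget allows when the community composition is unknown.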
