Tonight I watched Katrina Kalantar present a ten-minute session at the Nanopore Community Meeting 2022 entitled “CZ ID: an open-source cloud-based pipeline and analysis service for metagenomic pathogen detection and monitoring.” Kalantar is a computational biologist at the Chan Zuckerberg Initiative (CZI). I did not know about this pipeline! CZ ID is open-source and enables simple data upload for cloud-based processing. Kalantar also mentioned that it provides sample reports of microbial abundances and multi-sample analyses. The platform also allows for downloading reports for offline analyses. It accepts both RNA and DNA sequencing data, a feature I haven’t seen too often! Kalantar explained that the CZ ID team has been developing a pipeline for Oxford Nanopore Technologies sequence data. The pipeline consists of host filtering and QC, assembly-based alignment, and taxonomic reporting. For the quality control step, the pipeline filters reads by quality with fastp, and host reads can be filtered out with minimap2. They included subsampling to 100,000 reads. I wonder if this parameter is adjustable? For assembly, CZ ID uses metaFlye and then aligns reads to contigs with minimap2. Contigs and non-contig reads are then aligned to NCBI NT, again using minimap2. Finally, contigs are aligned to NCBI NR using DIAMOND. For taxonomic reporting, the tally of hits and the aggregate total bases per species are reported. Kalantar explained that the workflows run automatically in the cloud using Amazon S3. An example CZ ID sample report was shown. The CZ ID team has been benchmarking with spiked-in organisms, including Human Coronavirus OC43, the ZymoBIOMICS reference benchmark, and finally orthogonally characterized mosquito samples. Kalantar shared results that included the single-mosquito metatranscriptomics dataset… from five mosquitoes that were evaluated. The pipeline was able to detect all expected viruses. The CZ ID team continues to validate and improve the pipeline.
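To make the subsampling and taxonomic reporting steps concrete for myself, here is a minimal Python sketch. It is not CZ ID's code: the function names, the fixed random seed, and the `(species, aligned_bases)` input format are all my own assumptions, and the 100,000 default only mirrors the number mentioned in the talk.

```python
import random
from collections import defaultdict


def subsample_reads(reads, max_reads=100_000, seed=0):
    """Return at most max_reads reads, drawn at random without replacement.

    Hypothetical helper: max_reads defaults to the 100,000 mentioned in the
    talk; the seed is an assumption added only for reproducibility.
    """
    reads = list(reads)
    if len(reads) <= max_reads:
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, max_reads)


def tally_by_species(hits):
    """Count hits and sum aligned bases per species.

    `hits` is an iterable of (species, aligned_bases) pairs, an assumed
    format for illustration rather than the pipeline's real output.
    """
    report = defaultdict(lambda: {"hits": 0, "bases": 0})
    for species, aligned_bases in hits:
        report[species]["hits"] += 1
        report[species]["bases"] += aligned_bases
    return dict(report)
```

With a handful of made-up alignment records, `tally_by_species` returns one entry per species holding the number of hits and the summed aligned bases, which is roughly the shape of the per-species summary described in the talk.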
As of the recording date, the pipeline was only available in beta mode. I checked out the website and requested access. There is a great video about the pipeline and initiative that is about 3 min long.
