Bambu for Transcript Discovery

Andre Sim from the Genome Institute of Singapore (A*STAR) spoke at the Nanopore Community Meeting 2021 about “Bambu – generating context-aware transcriptomes with Oxford Nanopore long-reads.” What a cool name! Sim explained that RNA sequencing technologies have evolved, but the reference annotations used for transcript analysis have remained relatively unchanged. Sim noted that a fragment can be found in multiple transcripts, which makes assigning it to a single transcript challenging. Sim and team developed Bambu, an R package that provides a two-module transcriptomics workflow for long-read data (.bam). It uses a genome (.fa) and reference annotations (.gff) to discover and annotate transcripts, quantify them, and perform context-aware isoform abundance measurements. Bambu determines read classes and builds a model using a machine learning algorithm. Sim explained that “transcript discovery is strongly affected by sequencing depth” and that “Bambu stabilizes sample variance.” The team compared Bambu's transcript discovery performance on long-read data with that of FLAIR, StringTie, and TALON. Sim concluded that Bambu is an easy-to-use, one-command tool that learns from your datasets and provides robust quantification and transcript discovery features. I wonder if we could train and use Bambu for transcript discovery?

Bamboo sticks in gray buckets
How can long-read transcriptome datasets be used for transcript discovery? Photo by Toni Cuenca on Pexels.com