Understanding Hybrid Assemblies in Genomics Research

Tonight I watched the introduction to day 3 of the LISA workshop. Lauren Liu from Lawrence Berkeley National Laboratory spoke about how genomics research can be limited by incomplete genomes. Liu noted that “genomes are hypotheses about what microbes are doing… but with environmental sequencing we often don’t have complete genomes.” Liu explained that assembly graphs represent the possible structures of a genome. She described how combining long and short reads helps resolve genome assembly issues (“hybrid assemblies”) and how long reads can help capture plasmid sequences. Liu also discussed what a “complete prokaryotic genome assembly” is: prokaryotic genomes are circular, but assembling them from metagenomic data will likely not produce circularized genomes. CheckM and machine learning algorithms are used to evaluate completeness. Liu explained that second-generation Illumina sequencing has an error rate of about one error per prokaryotic gene, and that these errors are often resolved through coverage and error correction. Oxford Nanopore Technologies (ONT) sequencing has higher error rates. However, these are decreasing with updates to basecalling algorithms, training datasets, and pore improvements. Liu then explained how assemblies are polished with short reads. There are two approaches to hybrid assembly: assemble the long reads first and then polish with short reads, or assemble the short reads first and then use long reads to find paths through the assembly graph. Liu said that now that long-read sequencing costs have decreased, more people are sequencing and assembling long reads first and then polishing with short reads. According to Liu, the typical workflow for prokaryotic genome assembly is sequencing, basecalling, assembly, and error correction/polishing.
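To make the point about coverage resolving sequencing errors more concrete for myself, here is a minimal Python sketch of a majority-vote consensus. It is not what Liu showed and not how real polishing tools work internally; it assumes the reads are already aligned and of equal length, and the function name `consensus` and the toy reads are made up for illustration.

```python
from collections import Counter


def consensus(aligned_reads):
    """Return the majority base at each position of pre-aligned reads.

    Assumption: every read covers the same region and has the same
    length, so position i in one read matches position i in the others.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")

    bases = []
    # zip(*reads) walks the reads column by column, like a pileup.
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        bases.append(base)
    return "".join(bases)


# Toy example: five reads over the same region, two with a single error.
reads = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",  # hypothetical error at index 3 (T read as A)
    "ACGTACGTAC",
    "ACGTACCTAC",  # hypothetical error at index 6 (G read as C)
]
print(consensus(reads))  # prints ACGTACGTAC
```

Because each error appears in only one of the five reads, the other four outvote it at that position. With higher coverage, random errors become even less likely to win the vote, which is the intuition behind polishing an assembly with accurate short reads.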

What options are available with prokaryotic genome assembly? AI-generated image.

Credits

Website images were purchased from and edited in Canva.com. Blog post images are from the WordPress free image library powered by Pexels. Gallery images used were taken or created by Carlos C. Goller or otherwise attribution is stated. Blog posts represent my reflections and reference relevant sources of information, including conferences, podcasts, books, and workshops when applicable. I strive for proper attribution of sources and accessibility of content. I am still early in the journey. I appreciate feedback!
