Gbases and Basecalling

Continuing with the Human genome sequencing and analysis Nanopore Learning course, I watched the session entitled “MinKNOW: Live basecalling and output folder structure.” Marta Verdugo, a member of the Technical Services Team at Oxford Nanopore Technologies, introduced the different data types and how MinKNOW processes signals into reads. Each read corresponds to the signal from a single DNA or RNA molecule. Basecalling starts by detecting changes in ionic current, which are processed using models to produce the sequence of each read. MinKNOW may ‘fall behind’ with live basecalling depending on the GPU and CPU settings and the basecalling model chosen. Basecalling uses the flip-flop model to deconvolute the signal and produce basecalls. The fast basecalling model is less computationally intensive. FAST5 and, more recently, POD5 files store the raw signal, so they can be re-basecalled with newer models. FASTQ files are standard text files in which each read is a block of four lines: a header, the sequence, a separator line beginning with a plus sign, and the quality scores. Disk space requirements depend on the number of reads and the Gbases of data produced. The table shared was useful, as it suggested that ~20 Gbases of data take up over 380 Gbytes of disk space. Storage has become a consideration as we continue sequencing!
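To make the four-line FASTQ layout and the storage arithmetic concrete for myself, here is a minimal Python sketch. It is not how MinKNOW reads or writes files. The names `FastqRecord`, `read_fastq`, `phred_scores`, `total_gbases`, and `estimate_disk_gbytes` are my own and hypothetical. The quality decoding assumes the common Phred+33 encoding. The default of about 19 Gbytes per Gbase is only the ratio implied by the two numbers I noted from the table (380 / 20), and treating it as linear is my assumption, not something stated in the session.

```python
import io
from typing import Iterable, Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    header: str    # line 1 of the block, without the leading "@"
    sequence: str  # line 2, the basecalled bases
    quality: str   # line 4, one ASCII character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream laid out as four lines per read."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            return
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str) -> List[int]:
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality]


def total_gbases(records: Iterable[FastqRecord]) -> float:
    """Total number of bases across all records, in Gbases (1e9 bases)."""
    return sum(len(record.sequence) for record in records) / 1e9


def estimate_disk_gbytes(gbases: float, gbytes_per_gbase: float = 19.0) -> float:
    """Rough disk estimate. The default ratio is an assumption (380 / 20)."""
    return gbases * gbytes_per_gbase


# A tiny in-memory example with two made-up reads.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!5\n")
records = list(read_fastq(example))
for record in records:
    print(record.header, record.sequence, phred_scores(record.quality))
print(estimate_disk_gbytes(20))
```

For a real run, the same `read_fastq` function could be pointed at an open FASTQ file instead of the in-memory example, and `total_gbases` would give the number to plug into the rough disk estimate.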




