Utah's Foremost Platform for Undergraduate Research Presentation
2024 Abstracts

Updated Third Generation Sequencing: Assembly Insights

Authors: Danyon Gedris, Paul Frandsen
Mentors: Paul Frandsen
Institution: Brigham Young University

Whole genome assembly has improved rapidly as third-generation sequencing technologies such as PacBio HiFi and Oxford Nanopore (ONT) have bridged the gaps in complex genomes by providing high-accuracy, long-read data. Improvements in these technologies have produced long average read lengths (>15 kbp) and read accuracies above 99% (>Q20). They are particularly well suited to assembling long, repetitive regions of the genome. Current assembly techniques merge reads that share identical sequences to form longer, contiguous sections. In repetitive regions, this process tends to collapse the repeated sequences into one shorter sequence instead of preserving the full, continuous run of repeats. Long reads avoid this issue by spanning the repeats in a single continuous read. Heavy chain fibroin (h-fibroin), the gene that encodes the primary silk protein in Trichoptera and Lepidoptera, is long (often >20 kbp) and repetitive. Recent work showed that PacBio HiFi sequencing provided higher-quality assemblies of h-fibroin than the previous generation of ONT pores (R9.4.1) and chemistry, despite HiFi reads having a shorter average length. Recent advances in ONT chemistry and nanopores (R10.4.1) have led to higher quality scores, which may allow successful assembly of this gene region. To better understand the advances in ONT sequencing and its ability to provide high-quality, contiguous genome assemblies of complex organisms, we assess the quality of assemblies of the h-fibroin silk gene for the Trichoptera species Arctopsyche grandis and Parapsyche elsis using the newest ONT chemistry.
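The abstract pairs a 99% figure with Q20. That correspondence follows from the standard Phred scale, Q = -10 log10(p), where p is the per-base error probability. The sketch below is only an illustration of that conversion; the function names are hypothetical and are not taken from the authors' analysis.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by a Phred quality score Q."""
    # Error probability p = 10^(-Q/10); accuracy is 1 - p.
    return 1.0 - 10 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred quality score implied by a per-base accuracy in [0, 1)."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    # Q = -10 * log10(p), with p = 1 - accuracy.
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.4%}")
```

On this scale each additional 10 points of Q means a tenfold drop in error rate, so Q20 is one error per 100 bases and Q30 is one per 1,000.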