Authors: Silvi Rouskin, Matthew Allan, Dragui Salazar
Mentors: Merrill Halling
Institution: Utah Valley University
A 3’ untranslated region (UTR) is the non-coding region between the stop codon and the 3’ end of an mRNA. 3’ UTRs can regulate post-transcriptional gene expression by influencing mRNA stability, translation, and localization. They fold into complex structures containing elements and binding sites that interact with various molecules, including proteins and microRNAs (miRNAs). Despite the recognized importance of 3’ UTRs and their structural features, the vast majority of human 3’ UTR structures remain unknown. Indeed, the structures of long RNAs in general have been difficult to solve because of their heterogeneity and the paucity of known, ground-truth RNA structures for training and validating models. This project aims to circumvent these limitations by characterizing the structures of 3,000 to 4,000 human 3’ UTRs. The workflow is as follows. The cDNA is received and prepared for PCR with the required primers. After PCR, the amplified DNA is transcribed into RNA, which is then probed using dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq). The probed RNA is then reverse transcribed and prepared for sequencing. The project uses thousands of primers to facilitate the comprehensive identification of genes. The resulting large dataset of structure profiles will be used to develop a machine learning algorithm that predicts first the DMS-MaPseq results and eventually the structure of an RNA solely from its sequence. Preliminary results show that DMS-MaPseq can be used to determine hundreds, even thousands, of 3’ UTR structures and to create accurate models of those structures. The results also reveal druggable pockets that could be used in RNA-based therapeutics in the near future.
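As a rough illustration of how DMS mutational profiling data become a structure profile, the sketch below computes a per-position mutation rate from sequencing reads. It is a minimal example under stated assumptions, not the project's actual pipeline: the function name `dms_mutation_rates` is hypothetical, reads are assumed to be already aligned to the reference and of equal length, and `-` is assumed to mark an uncovered position. Because DMS modifies unpaired adenines and cytosines, only A and C positions are reported; a higher rate suggests an unpaired base.

```python
from typing import List, Optional


def dms_mutation_rates(reference: str, reads: List[str]) -> List[Optional[float]]:
    """Return the mutation rate at each A/C position of the reference.

    Hypothetical helper for illustration. Positions that are G/T or have
    no coverage are reported as None.
    """
    ref = reference.upper()
    covered = [0] * len(ref)   # reads covering each position
    mutated = [0] * len(ref)   # reads with a mismatch at each position

    for read in reads:
        if len(read) != len(ref):
            raise ValueError("reads must be pre-aligned to the reference length")
        for i, base in enumerate(read.upper()):
            if base == "-":    # assumed marker: position not covered by this read
                continue
            covered[i] += 1
            if base != ref[i]:
                mutated[i] += 1

    rates: List[Optional[float]] = []
    for i, ref_base in enumerate(ref):
        if ref_base in "AC" and covered[i] > 0:
            rates.append(mutated[i] / covered[i])
        else:
            rates.append(None)
    return rates


# Toy data, invented purely for the example (cDNA alphabet, so T not U).
example_rates = dms_mutation_rates("ACGT", ["ACGT", "GCGT", "ATGT", "A-GT"])
print(example_rates)
```

Real analyses also handle insertions and deletions, quality filtering, and normalization of the rates, all of which are omitted here.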