Utah's Foremost Platform for Undergraduate Research Presentation
2021 Abstracts

PRSKB: A Web-based Polygenic Risk Score Calculator and Knowledge Base

Presenters: Ed Ringger, College of Life Sciences, Biology
Authors: Madeline Page, Matthew Cloward, Elizabeth Vance, Louisa Dayton, Ed Ringger, Justin B. Miller, John S.K. Kauwe
Faculty Advisors: Justin Miller, College of Life Sciences, Biology
Institution: Brigham Young University

Large genetic cohorts have led to established databases consisting of thousands of whole genome sequences linked to specific phenotypes. Genome-wide association (GWA) studies take advantage of these databases by calculating associations (odds ratios) between single nucleotide polymorphisms and specific phenotypes of interest. The aggregate of these odds ratios for a given trait constitutes a polygenic risk score for that trait. Although these scores can be powerful tools for determining the genetic factors contributing to human disease, identifying reliable GWA studies is a decentralized and time-consuming manual process. Here we introduce the Polygenic Risk Score Knowledge Base (PRSKB), a web server that automates the end-to-end process of calculating polygenic risk scores for an individual or group of individuals. Given a variant call format (VCF) file, PRSKB can calculate polygenic risk scores for an individual using 2634 studies across 785 traits. Scores can be calculated via the PRSKB command line interface or the PRSKB website (https://prs.byu.edu). GWA study associations housed in the NHGRI-EBI GWAS Catalog were parsed for odds ratios and risk alleles on autosomal chromosomes and stored in our database. To demonstrate the effectiveness of our tool and provide approximate reference ranges for common scores, we calculated risk scores for all traits for over 400,000 individuals from the UK Biobank. PRSKB’s large database and user-friendly interface make it unique among polygenic risk score calculators. The ability to calculate polygenic risk scores for thousands of individuals across more than two thousand studies at once, from a single centralized database, reduces the time and effort required to conduct analyses using risk scores. In addition, our tool can help limit potential confounding factors in datasets by identifying individuals with high polygenic risk scores for phenotypes not directly studied.
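To make the idea of aggregating odds ratios concrete, the following is a minimal sketch of one common way to compute a polygenic risk score: summing the natural log of each odds ratio, weighted by how many copies of the risk allele an individual carries. This is an illustrative assumption, not necessarily the formula PRSKB uses. The function name, the rsIDs, and the odds ratios are all hypothetical.

```python
import math


def polygenic_risk_score(genotypes, associations):
    """Sum ln(odds ratio) weighted by risk-allele count (an assumed, common formulation).

    genotypes:    dict mapping SNP ID -> tuple of the individual's two alleles
    associations: dict mapping SNP ID -> (risk_allele, odds_ratio) from one GWA study
    """
    score = 0.0
    for snp, (risk_allele, odds_ratio) in associations.items():
        alleles = genotypes.get(snp)
        if alleles is None:
            # SNP not present in the individual's data; skip it.
            continue
        # Count copies of the risk allele: 0, 1, or 2 for a diploid genotype.
        dosage = sum(1 for allele in alleles if allele == risk_allele)
        score += dosage * math.log(odds_ratio)
    return score


# Hypothetical study associations: SNP ID -> (risk allele, odds ratio).
associations = {
    "rs0000001": ("A", 1.20),
    "rs0000002": ("T", 0.85),
    "rs0000003": ("G", 1.05),
}

# Hypothetical individual genotypes, as might be extracted from a VCF file.
genotypes = {
    "rs0000001": ("A", "G"),  # one copy of the risk allele
    "rs0000002": ("T", "T"),  # two copies of the risk allele
}

score = polygenic_risk_score(genotypes, associations)
print(f"Polygenic risk score (log-odds scale): {score:.4f}")
```

Working on the log scale turns the product of per-variant odds ratios into a sum, so a positive score suggests elevated risk relative to carrying no risk alleles and a negative score suggests reduced risk. SNPs missing from the individual's data are simply skipped here; a real tool would need an explicit policy for them.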