Defense Date

7-26-2024

Graduation Date

Fall 12-20-2024

Availability

Immediate Access

Submission Type

thesis

Degree Name

MS

Department

Computational Mathematics

School

School of Science and Engineering

Committee Chair

Lauren Sugden

Committee Member

John Fleming

Keywords

hidden Markov model, population genetics, positive selection, Viterbi, Python

Abstract

Identifying adaptive mutations in genetic data is challenging because such events are rare, and because signatures of selection are intertwined with the footprints of the other evolutionary forces that shape our genomes. Even when a larger region appears to be under selection, genomic sites linked to an adaptive mutation show similar statistical signals and can therefore obscure the actual adaptive mutation. The new method described here uses a Hidden Markov Model to classify genomic sites as neutral, linked, or sweep (adaptive mutation). The model is general and can be scaled to an arbitrary number of classes. Site-specific selection statistics computed from simulated genetic data are taken as input, and per-site probabilities and classifications are the outputs. The Viterbi algorithm is used to identify the most likely path through the classes along the sequence. A stochastic backtrace method allows for the identification of multiple possible paths. These methods, in combination with enforcing sweep events, help to identify regions under selection and allow for better localization of adaptive mutations.

By Scott McCallum, August 2024
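
The Viterbi step mentioned in the abstract can be illustrated with a minimal sketch in Python. It is not the thesis implementation: the three state names follow the abstract, but the initial, transition, and emission values are toy numbers chosen for illustration, and the per-site emission likelihoods are assumed to have been computed already from whatever selection statistics the model uses.

```python
import math

STATES = ["neutral", "linked", "sweep"]  # class names taken from the abstract


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state path and its log probability.

    log_init[k]     : log P(first site is in state k)
    log_trans[j][k] : log P(state k at the next site | state j at this site)
    log_emit[t][k]  : log P(statistics observed at site t | state k)
    """
    n_sites, n_states = len(log_emit), len(log_init)
    # Best log score of any path ending in each state at the first site.
    score = [log_init[k] + log_emit[0][k] for k in range(n_states)]
    back = []  # back[t-1][k] = best predecessor of state k at site t
    for t in range(1, n_sites):
        prev, score, ptr = score, [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: prev[j] + log_trans[j][k])
            ptr.append(best_j)
            score.append(prev[best_j] + log_trans[best_j][k] + log_emit[t][k])
        back.append(ptr)
    # Trace back from the best final state to recover the path.
    last = max(range(n_states), key=lambda k: score[k])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


def to_log(rows):
    """Convert a nested list of positive probabilities to logs."""
    return [[math.log(p) for p in row] for row in rows]


# Toy parameters (assumptions for illustration only, not values from the thesis).
log_init = [math.log(p) for p in (0.98, 0.01, 0.01)]
log_trans = to_log([
    [0.90, 0.09, 0.01],  # from neutral
    [0.10, 0.80, 0.10],  # from linked
    [0.05, 0.90, 0.05],  # from sweep
])
# One row per site: likelihood of that site's statistics under each state.
log_emit = to_log([
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.01, 0.09, 0.90],
    [0.05, 0.90, 0.05],
    [0.90, 0.08, 0.02],
])

path, best_logp = viterbi(log_init, log_trans, log_emit)
print([STATES[k] for k in path], best_logp)
```

Working in log space avoids numerical underflow on long sequences. A stochastic backtrace would replace the deterministic traceback with sampling of predecessors, which is not shown here.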

Language

English

Available for download on Friday, January 31, 2025
