Degree Type

Creative Component

Semester of Graduation

Spring 2021

Department

Statistics

First Major Professor

Karin Dorman

Degree(s)

Master of Science (MS)

Major(s)

Statistics

Abstract

Peanut is an important food crop and an allotetraploid, which means its genome contains four copies of each chromosome, derived from two distinct ancestral diploid species. Genotyping is the process of detecting the alleles present at the variable sites in a genome, and phasing is the process of linking those alleles into haplotypes, of which an allotetraploid individual has four. When jointly genotyping and phasing allotetraploid individuals from noisy short-read data, the four copies of each chromosome mean that the number of possible haplotypes grows quickly and overwhelms computational resources. Here, we introduce a novel, efficient strategy to genotype and phase allotetraploid individuals by prescreening the data to detect the truly variable sites. We then derive, implement, and apply an EM algorithm to estimate the haplotypes only at the loci with evidence of variation.

Copyright Owner

Hongxu Wang

File Format

PDF

Embargo Period (admin only)

5-7-2022

Available for download on Saturday, May 07, 2022
