Genome-wide prediction of breeding values and mapping of quantitative trait loci in stratified and admixed populations

ShaarbafToosi, Ali

File

ShaarbafToosi_iastate_0097E_13016.pdf (4.55 MB)

Date

2012-01-01

Authors

ShaarbafToosi, Ali

Advisor

Rohan L. Fernando

Department

Animal Science

Abstract

Ideally, genome-wide association studies require homogeneous samples originating from randomly mating populations with minimal pedigree relationship. In reality, such samples are very hard to collect. Non-random mating combined with artificial selection has created complex patterns of population structure and relatedness in commercial crop and livestock populations. This makes proper modeling of population structure and kinship a necessary step in all genome-wide association studies. Otherwise, the risk of both false positives (declaring a marker significant when it is not linked to a QTL) and false negatives (declaring a marker non-significant when it is linked to a QTL) increases dramatically.

In this thesis, we first applied the genomic selection (GS) approach to develop equations for predicting the breeding values of purebred candidates from a model trained on an admixed or crossbred population. In this approach, all marker effects are treated as random and are fitted simultaneously. It was hypothesized that, given high-density marker data and the GS approach, training in a crossbred or admixed population could be as accurate as training in the purebred population that is the target of selection. A stochastic simulation study showed that both crossbred and admixed populations could be used to predict the breeding values of a purebred population without explicitly modeling breed composition and pedigree relationship. However, the accuracy of GS was greatly reduced when genes from the target pure breed were not represented in the admixed or crossbred training population. In addition, the accuracy of GS depended on the genetic distance between the training and validation populations: the closer the relationship between the two, the higher the prediction accuracy. Further, increasing marker density improved the accuracy of prediction, especially when a crossbred population was used as the training dataset. For haplotypes with weak linkage disequilibrium (LD), the crossbreds showed extensive LD, whereas LD in the purebreds was confined to smaller segments. In contrast, haplotypes with strong LD were much shorter in crossbreds than in purebreds. Our results also showed that crossbred populations have fewer haplotypes with strong LD than purebred populations. These findings suggest that crossbred populations are more suitable for QTL fine mapping than purebreds.
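To make the idea of fitting all marker effects simultaneously as random effects concrete, the following is a minimal sketch and not the thesis's own method or simulation design. It swaps in plain ridge regression on marker genotypes as a stand-in for the GS model, simulates independent markers in a single population (no breeds, no LD), and uses hypothetical names throughout (`simulate_genotypes`, `fit_marker_effects`, `predict`), an arbitrary shrinkage value `lam`, and arbitrary sample sizes.

```python
import numpy as np

def simulate_genotypes(rng, n_animals, allele_freq):
    """Draw 0/1/2 genotype codes for independent markers (assumption: no LD)."""
    size = (n_animals, allele_freq.size)
    return rng.binomial(2, allele_freq, size=size).astype(float)

def fit_marker_effects(X, y, lam):
    """Ridge regression: every marker effect is estimated jointly and shrunk."""
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means  # center genotypes so the intercept is not shrunk
    lhs = Xc.T @ Xc + lam * np.eye(X.shape[1])
    effects = np.linalg.solve(lhs, Xc.T @ (y - mu))
    return mu, col_means, effects

def predict(X, mu, col_means, effects):
    """Predicted genetic merit for animals with genotypes X."""
    return mu + (X - col_means) @ effects

rng = np.random.default_rng(1)
n_markers, n_qtl = 200, 20
freq = rng.uniform(0.1, 0.9, n_markers)

# Hypothetical genetic architecture: 20 of the 200 markers carry an effect.
true_effects = np.zeros(n_markers)
qtl = rng.choice(n_markers, n_qtl, replace=False)
true_effects[qtl] = rng.normal(0.0, 1.0, n_qtl)

X_train = simulate_genotypes(rng, 1000, freq)
X_valid = simulate_genotypes(rng, 300, freq)
g_train = X_train @ true_effects
g_valid = X_valid @ true_effects  # true breeding values of validation animals

# Residual variance equal to genetic variance, i.e. heritability near 0.5.
y_train = g_train + rng.normal(0.0, g_train.std(), g_train.size)

lam = 100.0  # arbitrary shrinkage parameter chosen for illustration
mu, col_means, effects = fit_marker_effects(X_train, y_train, lam)

# Accuracy: correlation between predicted and true breeding values.
accuracy = np.corrcoef(predict(X_valid, mu, col_means, effects), g_valid)[0, 1]
print(f"prediction accuracy in validation animals: {accuracy:.2f}")
```

Accuracy here is the correlation between predicted and true breeding values in animals that were not used for training. Reproducing the crossbred and admixed comparisons described above would additionally require simulating breeds, LD, and training and validation sets drawn from different populations.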

In another simulation study, we compared the power, false-positive rate, accuracy, and positive predictive value of QTL mapping in an admixed population with and without modeling of breed composition. The performance of the ordinary least squares (OLS) and mixed linear model (MLM) methods, both of which fit one marker at a time, was compared with that of a Bayesian multiple-regression (BMR) method that fits all markers simultaneously. The OLS method showed the highest false-positive rate because it ignores breed composition and pedigree relationship. The MLM approach produced spurious false positives when breed composition was not accounted for. The BMR outperformed both the OLS and MLM approaches: it mitigated the confounding effects of breed composition and relatedness without compromising power. Fitting breed composition in the MLM reduced both its power and its false-positive rate, whereas fitting it in the BMR reduced power without changing the false-positive rate. It was concluded that the BMR is able to self-correct for the effects of population structure and relatedness.
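The confounding described above can be illustrated with a small sketch of single-marker OLS in an admixed population. This is not the thesis's simulation: it assumes two hypothetical parental breeds with different allele frequencies, a phenotype driven only by breed composition (no QTL at all, so every significant marker is a false positive), and a normal-approximation threshold of 1.96 on the t-statistic. The function name `single_marker_t` and all parameter values are illustrative.

```python
import numpy as np

def single_marker_t(y, marker, covariate=None):
    """t-statistic of one marker in an OLS fit, optionally with a covariate."""
    cols = [np.ones(y.size), marker]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (y.size - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(7)
n_animals, n_markers = 600, 300

# Breed composition: fraction of each animal's genome from breed A.
breed_a = rng.uniform(0.0, 1.0, n_animals)

# Allele frequencies differ between the two hypothetical parental breeds.
freq_a = rng.uniform(0.1, 0.9, n_markers)
freq_b = rng.uniform(0.1, 0.9, n_markers)
p = np.outer(breed_a, freq_a) + np.outer(1.0 - breed_a, freq_b)
genotypes = rng.binomial(2, p).astype(float)

# Phenotype depends on breed composition only: no marker is linked to a QTL.
y = 2.0 * breed_a + rng.normal(0.0, 1.0, n_animals)

threshold = 1.96  # two-sided 5% level under a normal approximation
t_naive = np.array([single_marker_t(y, genotypes[:, j])
                    for j in range(n_markers)])
t_adjusted = np.array([single_marker_t(y, genotypes[:, j], breed_a)
                       for j in range(n_markers)])

fpr_naive = np.mean(np.abs(t_naive) > threshold)
fpr_adjusted = np.mean(np.abs(t_adjusted) > threshold)
print(f"false-positive rate, breed composition ignored: {fpr_naive:.2f}")
print(f"false-positive rate, breed composition fitted:  {fpr_adjusted:.2f}")
```

In this setup, markers whose allele frequencies differ between the breeds act as proxies for breed composition, so ignoring breed composition inflates the false-positive rate, while fitting it as a covariate brings the rate back toward the nominal level. The sketch covers only the OLS case; the MLM and BMR comparisons would need a relationship matrix and a multi-marker Bayesian model, respectively.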

Copyright

2012-01-01

Collections

Theses and Dissertations