Date of Award
Doctor of Philosophy
Animal Breeding and Genetics (Quantitative Genetics)
Dorian J. Garrick
The major task of animal breeding is to achieve genetic improvement for traits of economic importance in livestock species. Genetic improvement of a specific trait can be measured as genetic gain per year within a population, which is determined by the selection intensity, the standard deviation of true breeding values, the accuracy of estimated breeding values (EBV), and the generation interval. One main strategy for increasing genetic gain is to improve the accuracy of EBV. Improving the selection accuracy of EBV has therefore been of great interest in livestock breeding for decades.
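As a minimal sketch of how the four factors named above combine, the standard breeder's equation gives genetic gain per year as selection intensity times accuracy times the standard deviation of true breeding values, divided by the generation interval. The function name and the numeric inputs below are hypothetical and chosen only for illustration; they are not values from the thesis.

```python
def genetic_gain_per_year(intensity, accuracy, sd_true_bv, generation_interval):
    """Expected genetic gain per year: (i * r * sigma_A) / L."""
    if generation_interval <= 0:
        raise ValueError("generation_interval must be positive")
    return intensity * accuracy * sd_true_bv / generation_interval


# Hypothetical inputs: only the accuracy of EBV differs between the two cases.
gain_low_accuracy = genetic_gain_per_year(1.4, 0.5, 10.0, 2.5)
gain_high_accuracy = genetic_gain_per_year(1.4, 0.7, 10.0, 2.5)
print(gain_low_accuracy, gain_high_accuracy)
```

With the other three factors held fixed, gain scales linearly with accuracy, which is why improving the accuracy of EBV is an attractive lever.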
The advent of genomic prediction has enhanced the prediction accuracy of EBV and revolutionized selection strategies in livestock. Genomic prediction uses statistical models to predict genomic estimated breeding values (GEBV) based on genome-wide single nucleotide polymorphisms (SNPs). The accuracy of genomic prediction is affected by various factors, such as the size of the reference population, the linkage disequilibrium (LD) between markers and quantitative trait loci (QTL), and the genetic properties of the traits. Bayesian hierarchical models, which account for all unknown parameters and estimate all SNP effects simultaneously, are widely used in whole genome analyses.
Due to LD between SNPs and QTL, genomic prediction can capture the effects of QTL for the traits of interest. Alleles at different loci cluster together in haplotype blocks, which are passed from parents to progeny. Compared to a SNP effect model, a haplotype effect model reduces the number of parameters and exploits multi-locus LD. Fitting haplotype alleles is likely to capture QTL effects better than fitting SNP genotypes. However, recombination during meiosis breaks down haplotype blocks, which influences the accuracy of both population-wide and family-wide haplotype reconstruction and erodes the LD between SNPs and QTL. Meiotic recombination events are distributed non-randomly along the genome in many species. They occur more frequently in recombination hotspots, which are defined as short intervals with significantly higher recombination rates than the average recombination rate on each chromosome. Characterizing recombination events will enable the identification of haplotype diversity, elucidate genetic variation along the genome, and eventually improve genomic prediction. The first objective of this thesis was to evaluate the relationship between recombination and haplotype reconstruction, to investigate factors affecting recombination, to recognize recombination hotspots, and to locate QTL influencing genome-wide recombination rate in beef cattle and layer chickens.
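The hotspot definition above (intervals whose recombination rate is well above the chromosome average) can be illustrated with a rough sketch. The abstract does not state which significance criterion was used, so the rule below, flagging windows that exceed the chromosome mean by a chosen number of standard deviations, is an assumption, as are the function name and the default cutoff of 2.5.

```python
from statistics import mean, stdev


def flag_hotspots(window_rates, n_sd=2.5):
    """Return indices of windows whose rate exceeds mean + n_sd * SD.

    window_rates: recombination rates for consecutive windows on ONE chromosome.
    n_sd: assumed cutoff in standard deviations (not taken from the thesis).
    """
    if len(window_rates) < 2:
        return []  # a standard deviation needs at least two windows
    threshold = mean(window_rates) + n_sd * stdev(window_rates)
    return [i for i, rate in enumerate(window_rates) if rate > threshold]


# Hypothetical chromosome: 20 ordinary windows followed by one very active window.
rates = [1.0] * 20 + [10.0]
print(flag_hotspots(rates))
```

Because the threshold is computed per chromosome, the same absolute rate may be flagged on one chromosome and not on another, which matches the per-chromosome wording of the definition.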
Genome-wide association studies are performed to identify the QTL controlling traits of interest in livestock species. The availability of dense SNP genotypes along the genome aids the detection of causal genetic variants. The accuracy of genomic prediction can be enhanced by integrating dense SNP genotypes with the genotypes of causal QTL. The second objective of this thesis was to identify causal QTL associated with growth and body composition traits in Brangus beef cattle.
Simulation studies have shown that using distant ancestors to predict the EBV or GEBV of young animals is less informative than using close relatives. The effect on prediction accuracy of including distant ancestral generations in the training set has not been well studied in a real population under selection. The third objective of this thesis was to assess the optimal number of training generations needed to yield the maximum genetic prediction accuracy for various traits in layer chickens.
Chapter 2 studied 2775 Angus and 1485 Limousin cattle genotyped with the Illumina BovineSNP50 chip. Haplotype phasing was performed using DAGPHASE and BEAGLE based on the UMD3.1 assembly. Recombination events were identified by comparing the reconstructed chromosomal haplotypes between sire-offspring pairs. The average genome-wide number of recombination events for each sire was recorded as its phenotype. The BayesB approach was used to identify QTL influencing genome-wide recombination events. Genotype imputation from a 7K subset to the 50K chip in the Angus population was conducted with BEAGLE. Because it uses linkage information from relatives, DAGPHASE was superior to BEAGLE in haplotype phasing. In Angus, 427 1-Mb windows containing recombination hotspots were detected, and 348 recombination hotspots were identified in Limousin. Regions with high recombination rates had low accuracy of haplotype phasing and genotype imputation. Limited population sizes and half-sib family sizes, along with the occurrence of gene conversion, genotyping errors, and map errors, hinder the identification of recombination events. Different QTL regions influencing genome-wide recombination were identified in the two breeds.
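The idea of identifying recombination events by comparing phased haplotypes between a sire and its offspring can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure implemented in DAGPHASE or BEAGLE: alleles are coded 0/1, phasing is taken as given, the paternal haplotype of the offspring is already known, and the function name is hypothetical.

```python
def count_crossovers(sire_hap_a, sire_hap_b, transmitted):
    """Count switches between the sire's two haplotypes along one chromosome.

    sire_hap_a, sire_hap_b: the sire's two phased haplotypes (alleles coded 0/1).
    transmitted: the paternal haplotype carried by the offspring.
    Only markers where the sire is heterozygous are informative.
    """
    if not (len(sire_hap_a) == len(sire_hap_b) == len(transmitted)):
        raise ValueError("haplotypes must have equal length")
    origins = []
    for a, b, t in zip(sire_hap_a, sire_hap_b, transmitted):
        if a == b:
            continue  # sire homozygous: marker cannot reveal the origin
        if t == a:
            origins.append("A")
        elif t == b:
            origins.append("B")
        # an allele matching neither haplotype (e.g. a genotyping error) is skipped
    # each change of origin between consecutive informative markers is one switch
    return sum(1 for prev, cur in zip(origins, origins[1:]) if prev != cur)


# Hypothetical six-marker chromosome with one switch between markers 3 and 4.
hap_a = [0, 1, 0, 1, 0, 1]
hap_b = [1, 0, 1, 0, 1, 0]
print(count_crossovers(hap_a, hap_b, [0, 1, 0, 0, 1, 0]))
```

In this naive count, a single mis-phased or mis-genotyped marker appears as two adjacent switches, which illustrates why gene conversion, genotyping errors, and map errors complicate the identification of true recombination events.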
A total of 1200 white layers genotyped with a 580K SNP panel and 5108 brown layers genotyped with a 42K SNP panel were studied in Chapter 3. Recombination events were identified using LINKPHASE within half-sib families. The BayesB approach was used to identify QTL influencing genome-wide recombination in each line. The numbers of recombination hotspots detected in white and brown layers were 190 and 199, respectively. Only 28 of them were common to both lines. Recombination rates differed between lines and sexes. Family structure, marker density, inbreeding level, and haplotype structure could influence the identification of recombination events. Chromosome size, GC content, and CpG island density showed negative correlations with recombination rate. Several significant QTL windows, which harbor candidate genes, were identified in the two lines. In general, recombination rate is a complex, breed-specific, polygenic trait. Identifying recombination events provides opportunities to improve map assembly and enhance haplotype reconstruction. Implementing recombination information will help to improve genomic prediction in livestock breeding.
In Chapter 4, a total of 1537 Brangus beef cattle were genotyped with the Bovine50K, GGPHD77K, or BovineHD770K SNP chip. FImpute was used to impute Bovine50K or GGPHD77K genotypes to BovineHD770K. The BayesB approach with weighting factors was used to map QTL for each trait. A total of 18 different QTL regions were detected across the 9 traits studied. Among these QTL, 11 were trait-specific, while the others were pleiotropic. Five large-effect QTL were also found to segregate in other breeds, including Angus and Nellore. The identified QTL will aid our understanding of the biological processes underlying growth and body composition traits in beef cattle and ultimately help to enhance genomic prediction across multiple breeds.
Phenotypic records for 16 traits on 17 793 birds over 9 non-overlapping generations were analyzed in Chapter 5. Among these birds, 5108 had genotypes for 23 098 segregating SNPs. Two prediction methods, best linear unbiased prediction with pedigree relationships and BayesB, were applied to predict EBV or GEBV in each validation generation based on varying numbers of ancestral training generations. The optimal number of training generations, defined as the number that resulted in the highest prediction accuracy of GEBV, was obtained for each trait. The relationship between the optimal number of training generations and the heritability of the traits was evaluated. Prediction accuracy of EBV and GEBV increased when close ancestral generations were included, but either reached an asymptote or decreased slightly when distant ancestral generations were included in training. The optimal number of training generations increased with heritability. Based on the dataset studied, 4 or 5 training generations is optimal for most polygenic traits.
In summary, this thesis investigated genome-wide recombination mechanisms in beef cattle and layer chickens, identified positional candidate genes for growth and meat production traits in Brangus beef cattle, and assessed the effect of distant ancestral generations on genomic prediction accuracy in layer chickens. The locations of recombination hotspots and of QTL that control genome-wide recombination rates were identified in both beef cattle and layer chickens. Identifying recombination patterns along the genome will aid in better defining haplotype blocks and improving genomic prediction accuracy. Selection on recombination rate is challenging, but it has the potential to increase genetic gain in livestock breeding systems. In Brangus beef cattle, 7 pleiotropic QTL and 11 trait-specific QTL were identified, which will help us better understand the biological processes accounting for variation in growth and body composition traits. Utilizing genotypes of identified causal QTL will enhance genomic prediction. Since the effect on prediction accuracy of adding distant ancestral generations to training differed between traits, different prediction strategies should be applied based on the importance of the selected traits in a specific breeding population. Implementing these findings in livestock breeding programs will help to improve genomic prediction accuracy and ultimately to increase short-term and long-term genetic gain for selected traits.
Weng, Ziqing, "Whole genome analyses in cattle and chickens using Bayesian methods" (2015). Graduate Theses and Dissertations. 14874.