Degree Type


Date of Award


Degree Name

Doctor of Philosophy




Plant Breeding (Predictive Plant Phenomics)

First Advisor

Asheesh K Singh


Commercial soybean varieties in the United States can be traced back to a limited number of ancestral lines, with a clear separation of ancestors between the northern and southern states. This narrow genetic base offers an opportunity to adapt genomic selection to capture the ancestral source of each allele, increasing accuracy compared to standard marker-based methods. To validate this approach, a novel chromosome segment tracing and genomic selection method was designed and compared to traditional genomic selection approaches. Results based on data from SoyNAM indicate an additional ~16% accuracy in genomic selection through the use of allele tracing.
Identifying the genes responsible for observed variation in soybean phenotypes allows rapid identification of lines containing the preferred allele for use in breeding, and improves genomic prediction results through the use of a priori allele effect information. To rapidly identify candidate genes for a broad spectrum of traits, we performed a combined GWAS and meta-GWAS of NPGS germplasm characterization trials across the United States. In addition to previously reported genes, we identified candidate genes for traits such as soybean cyst nematode resistance, amino acid composition, and pod shattering. By identifying candidate genes for a broad group of traits, along with new findings indicating pleiotropic effects of major genes on additional traits, we expand the understanding of genetic pathways in soybean.
Soybean yield performance varies considerably across environments. To determine the underlying environmental causes of yield variability, we developed a machine learning approach to predict performance based on weekly weather parameters. Because the timing of weather events determines their effect on soybean performance, we used a method called Long Short-Term Memory (LSTM), which can learn the relative importance of these weather parameters at different timepoints to accurately predict performance. Results from this experiment suggested that the relative importance of weather parameters differed from that commonly taught in agricultural production classes.
The timing of soybean flowering and maturity varies across latitudes based on daylength and genetic influences, driving the broad adaptation of soybean varieties to latitudinal bands. While the molecular control of variation in flowering and maturity timing has been well characterized, the timing of intermediate reproductive stages has received little attention from geneticists. To better understand the influence of previously identified genes on these intermediate stages, and to identify any previously unidentified genes controlling their timing, we conducted a concurrent GWAS and maturity isoline study. Models based on previously identified genes captured ~70% of the genetic variation for each of the eight soybean reproductive stages, indicating high overlap between the genetic regulation of the intermediate growth stages and that of flowering and/or maturity timing.


Copyright Owner

Johnathon Michael Shook



File Format


File Size

124 pages (2505 kB)

Available for download on Tuesday, December 07, 2021