The paleopolyploid nature of the soybean genome: duplicate gene identification, regional sequence characterization and expression studies

Date
2006-01-01
Authors
Schlueter, Jessica
Advisor
Randy C. Shoemaker
Volker Brendel
Department
Theses & dissertations (Interdisciplinary)
Abstract

The paleopolyploid nature of the soybean genome was investigated through bioinformatic analysis of ESTs, sequencing of homeologous BAC clones and functional analysis of retained duplicate genes. From ESTs, 294 soybean contig pairs were identified representing retained transcribed duplicate genes. Clustering of synonymous distances between each gene pair identified two mixtures of normal distributions corresponding to two rounds of genome duplication approximately 14.5 and 45 million years ago. Ratios of nonsynonymous to synonymous distances showed that most duplicate gene pairs are under purifying selection. Pearson correlation coefficients of EST-based expression patterns between duplicate pairs revealed both retained and uncorrelated expression. Homeologous soybean BAC clones were sequenced to better understand structural divergence in paleopolyploid regions. Annotation of these BACs, which are anchored by N-hydroxycinnamoyl/benzoyltransferase (HCBT) genes, showed that gene conservation in both order and orientation is surprisingly strong. An extended comparison to Medicago truncatula and Arabidopsis thaliana demonstrated a network of synteny with conserved genes interrupted by blocks with no synteny. Another four BACs corresponding to five ω-6 fatty acid desaturase (FAD2) genes were sequenced. These desaturases are responsible for the conversion of oleic acid to linoleic acid. Sequence comparisons between the regions showed that the soybean genome is a mosaic, with some regions retaining high sequence conservation in both the genic and intergenic regions while others have only FAD2 genes in common. Genetic linkage analysis of all sequenced BACs showed that most mapped to linkage groups with previously identified syntenic markers.
Reverse transcriptase-PCR analysis of the retained homeologs showed that, in the tissues sampled, some homeologs have not diverged greatly in their expression profiles, while others provide excellent examples of potential sub- or neofunctionalization. Reverse transcriptase-PCR analysis of the five FAD2 genes showed that the FAD2-2B and FAD2-2C copies are the best candidates for temperature-dependent expression changes in developing pod tissue. Semi-quantitative RT-PCR confirmed these results, with FAD2-2C showing upwards of an eight-fold increase in expression in developing pods grown in cooler conditions relative to those grown in warm conditions. These results suggest a candidate gene for decreasing the levels of linoleic acid in developing pods grown in cooler climates.

Copyright
Sun Jan 01 00:00:00 UTC 2006