Insights into the rice and Arabidopsis genomes: intron fates, paralogs, and lineage-specific genes
Abstract
With the availability of a near-complete rice genome sequence,
high-quality annotation data, and large expression profile datasets, we examined
segmental duplication, intron turnover, and paralogous protein family
composition in rice. These data suggest that a large percentage of the rice
genome was involved in segmental duplication, creating a large number of
paralogous families. We found that singleton and paralogous family genes differed
substantially not only in their likelihood of encoding a protein of known or
putative function but also in the distribution of specific gene functions. We
showed that a significant portion of the duplicated genes in rice show divergent
expression, although a relationship between sequence divergence and expression
correlation could be seen in very young genes. We observed that intron evolution
within the rice genome following segmental duplication is dominated by intron
loss rather than intron gain. In addition, with the availability of more
complete or near-complete plant genomes and transcriptomes across a wide range
of species, we identified and characterized conserved Brassicaceae-specific
genes and Arabidopsis lineage-specific genes. Lineage-specific genes in the
Brassicaceae and within Arabidopsis were enriched in genes of no known function
and appeared to be fast-evolving at the protein sequence level.