Degree Type


Date of Award


Degree Name

Doctor of Philosophy


Genetics, Development and Cell Biology


Bioinformatics and Computational Biology

First Advisor

Carol R. Buell

Second Advisor

Xun Gu


With the availability of a near-complete rice genome sequence, high-quality annotation data, and large expression profile datasets, we examined segmental duplication, intron turnover, and paralogous protein family composition in rice. These data suggest that a large percentage of the rice genome was involved in segmental duplication, creating a large number of paralogous families. We found that singleton and paralogous family genes differed substantially not only in their likelihood of encoding a protein of known or putative function but also in the distribution of specific gene functions. We showed that a significant portion of the duplicated genes in rice show divergent expression, although a correlation between sequence divergence and expression correlation could be seen in very young genes. We observed that intron evolution within the rice genome following segmental duplication is dominated by intron loss rather than intron gain. In addition, with the availability of more complete or near-complete plant genomes and transcriptomes across a wide range of species, we identified and characterized conserved Brassicaceae-specific genes and Arabidopsis lineage-specific genes. Lineage-specific genes in the Brassicaceae and within Arabidopsis were enriched in genes of no known function and appear to be fast evolving at the protein sequence level.


Copyright Owner

Haining Lin



Date Available


File Format


File Size

144 pages