Date of Award
Doctor of Philosophy
Genetics, Development and Cell Biology
Bioinformatics and Computational Biology
Volker P. Brendel
Amy L. Toth
Recent advances in DNA sequencing technology and a proliferation of new algorithms for assembling, annotating, and analyzing genomes have made genome-scale sequencing more accessible than ever. As a result, the last several years have seen a dramatic increase in the number of published draft genomes. Many important research problems revolve around interpretation of these draft genomes: What are the contents of a genome? How many genes are there? Are there any conspicuous losses of genes of interest? Is the genome compact, with genes clustered very tightly, or are genes separated by large intergenic spaces? Are intergenic spaces distributed evenly throughout the genome? Which characteristics of genome composition and organization are well conserved, and which appear to be unique, warranting further investigation?
In this dissertation, I investigate the interpretation of draft genomes in multiple contexts. First, I present a draft genome of the paper wasp Polistes dominula, a model species for the study of the evolution of social behavior. The genome of Polistes is similar to those of other social insects in many respects, but has an extremely biased nucleotide composition and shows some evidence of a reduction in genome size. Analysis of transcriptome and methylome data from queen and worker wasps reveals evidence of caste-related differences in gene expression, as well as a tremendous reduction in DNA methylation, a mechanism previously thought to be an important factor in caste differentiation.
Second, I investigate questions of genome composition and organization more generally. Given a new genome assembly and annotation, what can we determine quickly about the genome’s contents? What can be said about the distribution of genes and the overall “compactness” of the genome? How should these characteristics be compared to previously published results for related species? I present a framework (and related tools) that provides precise answers to these questions, and discuss insights gained by applying these tools to study various model organism genomes.
Daniel Scott Standage
Standage, Daniel Scott, "Scalable and reproducible genome analysis in the age of next-generation genome sequencing" (2016). Graduate Theses and Dissertations. 15225.