Analysis of G-quadruplexes as environmental sensors: Novel statistical models and computational algorithms enable interpretation of complex gene expression patterns for maize under salt stress conditions
Date of Award
Doctor of Philosophy
Genetics, Development and Cell Biology
Bioinformatics and Computational Biology
Carolyn J. Lawrence-Dill
Matthew B. Hufford
The occurrence of G-quadruplex (G4) structures in both genic and non-genic sequences has been well documented. However, even in genic regions, the biological functions of these motifs remain poorly understood, though their potential to act in a regulatory fashion has been hypothesized. With the recent development of next-generation sequencing technology, we have accumulated genomic and transcriptomic sequences from various species and tissues. With these data and with pattern recognition software that can identify putative G4 sequences, the time is right to tackle the question of whether and how G4s are involved in regulating gene expression. Previous studies suggested that G4 conformation can depend on cation type and concentration, as well as on differences in G4 motif pattern (e.g., number of consecutive guanines). It has also been shown that G4 function may be associated with the motif’s location relative to a given gene’s structural elements (transcription start site [TSS], exon/intron boundaries, etc.).
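To make the idea of pattern-based G4 identification concrete, the following is a minimal sketch of a putative-G4 scanner. It assumes the widely used motif definition of four G-runs separated by short loops; the function name `find_putative_g4` and the defaults `min_run=3` and `max_loop=7` are illustrative assumptions, not the software or parameters used in this dissertation. Matches made of C-runs on the given strand are reported as G4s on the opposite strand.

```python
import re

def find_putative_g4(seq, min_run=3, max_loop=7):
    """Return sorted (start, end, strand) tuples for putative G4 motifs.

    Assumed motif definition (not taken from the dissertation): four runs
    of at least `min_run` guanines separated by loops of 1..`max_loop`
    nucleotides. A C-rich match on the given strand implies a G4 on the
    opposite strand and is reported with strand "-".
    """
    seq = seq.upper()
    hits = []
    for strand, base in (("+", "G"), ("-", "C")):
        # Three (run + loop) units followed by a final run.
        pattern = re.compile(
            rf"(?:{base}{{{min_run},}}[ACGTN]{{1,{max_loop}}}){{3}}{base}{{{min_run},}}"
        )
        # finditer yields non-overlapping matches, scanning left to right.
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)
```

Raising `min_run` restricts the search to motifs with longer G-runs, which is the sequence feature that the analysis described below associates with expression changes.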
My project focused on the expression of G4-containing genes from maize tissues under various abiotic stress conditions, including salt stress, which is likely to change physiological cation concentrations. I quantified, compared, and visualized expression of G4-containing gene groups by developing and applying novel computational algorithms and statistical models. These methods were packaged into a software program I released on a web server called C-REx (http://c-rex.dill-picl.org/). I found that under salt stress conditions, transcription factors (TFs) with a G4 on the anti-sense strand upstream of the TSS are 455% more likely to be up-regulated than non-G4 genes. Likewise, TFs with a G4 on the anti-sense strand just downstream of the TSS are 259% more likely to be up-regulated. In addition, among G4 TFs that are up-regulated, heat shock factors are significantly enriched. On the other hand, under salt stress conditions, non-TF genes with a G4 on the anti-sense strand upstream of the TSS are 157% more likely to be down-regulated, and those with a G4 on the anti-sense strand downstream of the TSS are 124% more likely to be down-regulated. Through G4 sequence feature analysis, I found that the length of G-runs was significantly associated with whether genes were switched ‘on’ or ‘off’ in salt stress conditions. The shortest G-runs were associated with G4 motifs in TF genes that were switched ‘on’, and the longest G-runs were associated with G4s in non-TF genes that were switched ‘off’. These findings suggest that salt stress resilience could potentially be improved in maize by selecting for natural gene variants with specific G4 constitutions or by introducing specific G4 motifs of varying lengths into TF and non-TF genes involved in response to salt stress.
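Statements of the form "X% more likely" can be read as a relative increase in the proportion of genes showing a response. The sketch below shows that reading only; the abstract does not specify the statistical model behind the reported percentages, so the function name `percent_more_likely` and its count-based inputs are assumptions made for illustration.

```python
def percent_more_likely(hits_a, total_a, hits_b, total_b):
    """Relative increase (in percent) of group A's response rate over group B's.

    Assumed reading, not the dissertation's stated model: compare the
    fraction of responding genes in group A (e.g. G4-containing genes)
    with the fraction in group B (e.g. non-G4 genes).
    """
    if total_a <= 0 or total_b <= 0 or hits_b <= 0:
        raise ValueError("totals and the reference hit count must be positive")
    rate_a = hits_a / total_a
    rate_b = hits_b / total_b
    return (rate_a / rate_b - 1.0) * 100.0
```

Under this reading, a value of 100% means the response rate in group A is twice that of group B.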
He, Mingze, "Analysis of G-quadruplexes as environmental sensors: Novel statistical models and computational algorithms enable interpretation of complex gene expression patterns for maize under salt stress conditions" (2018). Graduate Theses and Dissertations. 16815.