Degree Name

Doctor of Philosophy





First Advisor

Chong Wang

Second Advisor

Peng Liu


This thesis consists of three projects motivated by biological problems: (i) detecting differentially abundant taxa in multiple metagenomic samples (chapter 2), (ii) developing a two-stage causal mediation model to identify taxa that mediate the effect of environmental conditions on an outcome in microbiome data (chapter 3), and (iii) analyzing temporal changes in antimicrobial susceptibility (chapter 4).

Although the emerging field of metagenomics has revolutionized our understanding of the microbial world, the analysis of metagenomic data raises statistical challenges, including the modeling of high-dimensional, overdispersed count data with excess zeros. In the first project (chapter 2), we propose a hypothesis testing framework based on a Poisson hurdle hierarchical model to address the excess zeros in metagenomic data, and we perform fully Bayesian inference to identify the taxa that are differentially abundant among multiple treatment groups. Simulation studies demonstrate that our model outperforms existing approaches in both statistical power and false discovery rate control at the desired significance level.

In the second project (chapter 3), we develop a causal mediation model to investigate the effect of a treatment on an outcome as transmitted through microbes. Because metagenomic data are sparse, high-dimensional, overdispersed counts, we propose a novel screening procedure to reduce the dimension to a moderate size. A Bayesian variable selection strategy with a shrink and diffuse prior is then used to select the key taxa with mediation effects. The performance of the proposed method is illustrated via simulation studies.
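As a rough illustration of the hurdle idea mentioned for the first project, the sketch below evaluates the log-probability of a single count under a basic Poisson hurdle distribution: a zero occurs with probability `pi_zero`, and positive counts follow a zero-truncated Poisson with rate `lam`. The function name and both parameter names are hypothetical, and the sketch omits the hierarchical priors and the testing framework that the thesis builds on top of this distribution.

```python
import math


def hurdle_poisson_logpmf(y, pi_zero, lam):
    """Log-probability of a count y under a basic Poisson hurdle model.

    pi_zero: probability that the count is zero (assumed 0 < pi_zero < 1)
    lam:     rate of the zero-truncated Poisson for positive counts (lam > 0)
    """
    if y < 0:
        raise ValueError("counts must be non-negative")
    if y == 0:
        return math.log(pi_zero)
    # Ordinary Poisson log-pmf at y.
    log_pois = y * math.log(lam) - lam - math.lgamma(y + 1)
    # Normalizer of the zero-truncated Poisson: log(1 - exp(-lam)).
    log_trunc = math.log(-math.expm1(-lam))
    return math.log1p(-pi_zero) + log_pois - log_trunc
```

Separating the zero probability from the positive-count part lets the proportion of zeros vary freely, which is what makes the hurdle form convenient for count tables dominated by zeros.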

In the third project (chapter 4), we present a hierarchical Bayesian latent class mixture model to detect temporal changes in antibiotic resistance using minimum inhibitory concentration (MIC) values. By taking censoring into account, the proposed approach achieves less bias in estimating the mean MIC. We also apply the method to data from the CDC NARMS program and show that evidence of temporal changes in mean MIC exists even when the proportion of resistance shows no change or changes in the opposite direction.
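To show what taking censoring into account can look like, the sketch below computes a log-likelihood in which each MIC observation is known only to lie in an interval on the log2 scale, with left- and right-censored values represented by infinite bounds. It assumes a single normal distribution for log2 MIC. That assumption, along with the function and parameter names, is illustrative only and is much simpler than the latent class mixture model described in the thesis.

```python
import math


def norm_cdf(x, mu, sigma):
    """Normal CDF; handles x = -inf and x = +inf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def censored_loglik(intervals, mu, sigma):
    """Log-likelihood of interval-censored log2 MIC values.

    intervals: list of (lower, upper) bounds on the log2 scale.
               Use -inf as lower for left-censored values and
               +inf as upper for right-censored values.
    mu, sigma: mean and standard deviation of the assumed normal model.
    """
    total = 0.0
    for lower, upper in intervals:
        # Probability that the latent log2 MIC falls in (lower, upper].
        p = norm_cdf(upper, mu, sigma) - norm_cdf(lower, mu, sigma)
        total += math.log(p)
    return total
```

For example, a reading of MIC = 2 on a two-fold dilution series could be encoded as the interval `(0.0, 1.0)` on the log2 scale, and a reading of "> 8" as `(3.0, float("inf"))`. Treating such readings as exact values instead is one way bias in the estimated mean can arise.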

Copyright Owner

Chaohui Yuan



File Size

126 pages