Degree Type

Creative Component

Semester of Graduation

Fall 2019

Department

Statistics

First Major Professor

Peng Liu

Second Major Professor

Yumou Qiu

Degree(s)

Master of Science (MS)

Major(s)

Statistics

Abstract

A key aim in systems biology is to understand the structural and functional processes of molecules in a living cell. With the development of high-throughput technologies, quantitative methods can be applied to large-scale ‘omics’ datasets. Because the molecules in a cell are intricately related, network-based methods have become a popular approach to reconstructing gene-gene, gene-protein, and protein-protein interactions. Among the different network approaches, the Gaussian graphical model has advantages in reconstructing gene co-expression networks because it captures direct associations between genes through partial correlations. However, estimating partial correlations and performing inference on them in the high-dimensional setting is very challenging. A method based on penalized partial correlations, called exact hypothesis testing for shrinkage-based Gaussian graphical models (Shrunk MLE), is able to overcome the high-dimension problem, but the statistical inference for such penalized partial correlations is unsatisfactory. In this project, a novel network inference method, named c-level Partial Correlation Graph (c-level PCG), is applied to a gene expression dataset to model direct gene-gene associations. It overcomes the ill-conditioning that arises when the dimension p is greater than the sample size n, and it performs inference on the estimated partial correlations with the false discovery rate controlled. According to our simulation studies, c-level PCG achieves much higher statistical power than Shrunk MLE while still controlling the false discovery rate.
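
As a minimal illustration of why partial correlations capture direct association, the sketch below uses the standard three-variable formula for the partial correlation of x and y given z. It is not the c-level PCG or Shrunk MLE procedure. The gene names A, B, C and the correlation values are hypothetical, chosen so that A and B are related only through C.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y given z, from pairwise correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical chain A - C - B: A and B are linked only through C,
# so their marginal correlation is the product of the two direct ones.
r_ac = 0.8
r_bc = 0.8
r_ab = r_ac * r_bc  # marginal correlation of A and B, clearly nonzero

# Conditioning on C removes the indirect A-B association.
direct_ab = partial_corr(r_ab, r_ac, r_bc)

# The direct A-C association remains after conditioning on B.
direct_ac = partial_corr(r_ac, r_ab, r_bc)

print(f"marginal corr(A, B)    = {r_ab:.2f}")
print(f"partial corr(A, B | C) = {direct_ab:.2f}")
print(f"partial corr(A, C | B) = {direct_ac:.2f}")
```

In this toy example, a network built on marginal correlations would draw an edge between A and B, whereas one built on partial correlations would not.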

Copyright Owner

Wang, Hao

File Format

PDF

Hao_cc_SI.pdf (663 kB)
Supplemental Materials
