Degree Type

Dissertation

Date of Award

2010

Degree Name

Doctor of Philosophy

Department

Electrical and Computer Engineering

Major

Bioinformatics and Computational Biology

First Advisor

Julie A. Dickerson

Second Advisor

Basil Nikolau

Abstract

A common goal of biological research is to develop models of the biological processes we seek to understand. Such models take the form of biochemical pathway networks, which describe the physical interactions among a living cell's genes, transcripts, proteins, and metabolites ("Omics"), and they accumulate in different repositories for several model organisms as well as non-model organisms. This thesis presents a set of integrated statistical bioinformatics tools that address key problems in integrating large-scale Omics datasets with pathway network models. A hardware-accelerated non-parametric Omics mining method (Monte Carlo on the GPU) allows faster screening of custom test statistics and functions. A software platform for mining pathway databases (PathwayAccess) enables knowledge integration and comparison. Omics mining and pathway mining are combined in a novel method that statistically discriminates functionally meaningful subnetworks by their interaction with lists of entities mined from Omics data, so that software can intelligently mine large and complex pathway databases to answer a wide variety of questions and generate hypotheses (Discriminating Omics Response Groups in Pathways). The method, called PathwayFlow, can discriminate pathways, reactions, metabolite classes, or any other grouping of biological entities (Response Groups), and it automatically accounts for connectivity-caused biases in the pathway network. It also differentiates between regulators (inputs) and regulatees (outputs) for a given Query List of Omics entities. PathwayFlow is applied to three real datasets: a simple E. coli gene expression dataset that validates the method, a more complex Vitis gene expression dataset for which it complements functional enrichment analysis (Grapevine's Response to Short Days), and an ultra-high-throughput re-sequencing dataset used to assess genetic differences between two wine grape varieties (DNA Sequencing Appendix).

Copyright Owner

John Louis Van Hemert

Language

en

Date Available

2012-04-30

File Format

application/pdf

File Size

171 pages
