Degree Type

Thesis

Date of Award

1-1-2006

Degree Name

Master of Science

Major

Genetics

Abstract

Gene expression microarrays have produced a vast pool of data that is still not being used to its full potential. While current methods measure the change in a gene's expression in response to a set of conditions with considerable reliability, relationships between genes are usually left unexamined because of the high dimensionality of this data type. Broadly speaking, two major types of exploratory analysis are conducted on such relationships. The first is the category of exploratory clustering algorithms. Pioneered by Michael Eisen in 1998, this category includes the software Cluster, which performs a hierarchical clustering analysis on the basis of pairwise correlations. While useful for its ease of interpretation and user-friendly software, Cluster does not take higher-order relationships into account and can therefore be misleading. The second category is that of network models. Commonly used models are Bayesian networks and several types of Gaussian models. Network models take higher-order relationships into account and, in general, improve the signal-to-noise ratio. The potential drawback is the complexity of the visual representation, which makes interpretation extremely difficult. Because the results are not forced into a dendrogram structure but are instead represented as points in multivariate space, it can be extremely challenging to draw useful inferences in the absence of explicit a priori information. We build a hybrid model that attempts to combine the key features of both types of approach. We construct a hierarchical dendrogram from a conditional independence network model, retaining the ease of interpretation inherent in clustering algorithms while preserving the benefits of a network model, namely the consideration of higher-order relationships and the improved signal-to-noise ratio.
Presently limited to datasets of about 500 genes, the approach is probably most useful for smaller microarray experiments conducted after a key set of significantly expressed genes has been identified from a genome-wide microarray experiment.
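
The general idea of deriving a dendrogram from conditional rather than pairwise dependence can be illustrated with a small sketch. It is not the thesis's implementation: the toy correlation matrix, the use of full-order partial correlations from an inverted correlation matrix, the distance 1 - |partial correlation|, the average-linkage rule, and all function names are assumptions chosen for illustration. In the toy data, gene A drives B and C, and C drives D, so B and C are correlated (0.64) only through A.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(corr):
    """Partial correlation of each pair given all other variables (hypothetical helper)."""
    prec = invert(corr)  # precision matrix
    n = len(corr)
    return [[1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]

def average_linkage(dist, labels):
    """Agglomerative clustering; returns a list of (members_a, members_b, height)."""
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        left, right = clusters.pop(a), clusters.pop(b)
        merges.append((sorted(labels[i] for i in left),
                       sorted(labels[i] for i in right), d))
        clusters[next_id] = left + right
        next_id += 1
    return merges

# Toy correlation matrix (assumed, not from the thesis): B <- A -> C -> D.
labels = ["A", "B", "C", "D"]
corr = [
    [1.00, 0.80, 0.80, 0.40],
    [0.80, 1.00, 0.64, 0.32],
    [0.80, 0.64, 1.00, 0.50],
    [0.40, 0.32, 0.50, 1.00],
]

pcorr = partial_correlations(corr)
# Strong conditional dependence maps to a short distance.
dist = [[1.0 - abs(p) for p in row] for row in pcorr]
merges = average_linkage(dist, labels)

print("pairwise corr(B, C):", corr[1][2])
print("partial corr(B, C | A, D):", round(pcorr[1][2], 6))
for left, right, height in merges:
    print(left, "+", right, "at height", round(height, 4))
```

In this toy case the partial correlation between B and C is essentially zero even though their pairwise correlation is 0.64, so a dendrogram built on the conditional measure does not pair them directly. A real analysis would estimate the network from expression data rather than start from a known correlation matrix.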

Copyright Owner

Kalyan Chakravarthy Dudala

Language

en

OCLC Number

76945807

File Format

application/pdf

File Size

38 pages
