Degree Type

Dissertation

Date of Award

2011

Degree Name

Doctor of Philosophy

Department

Mathematics

First Advisor

Anastasios Matzavinos

Abstract

The need to interpret high-dimensional data sets and to extract inferences from them has led, over the past decades, to the development of dimensionality reduction and data clustering techniques. Scientific and technological applications of clustering methodologies include, among others, bioinformatics, biomedical image analysis, and biological data mining. Current research in data clustering focuses on identifying and exploiting information on data set geometry and on developing robust algorithms for noisy data sets. Recent approaches based on spectral graph theory have been devised to efficiently handle data set geometries exhibiting a manifold structure, and fuzzy clustering methods have been developed that assign cluster membership probabilities to data that cannot be readily assigned to a specific cluster.

In this thesis, we develop a family of new data clustering algorithms that combine the strengths of existing spectral approaches to clustering with various desirable properties of fuzzy methods. More precisely, we consider a slate of "random-walk" distances arising in the context of several weighted graphs formed from the data set. These distances allow us to assign to data points "fuzzy" variables that respect their geometry in many ways. The developed methodology groups together data that are in a sense "well-connected", as in spectral clustering, but also assigns them membership values, as in other commonly used fuzzy clustering approaches. This approach is very well suited to image analysis applications; in particular, we use it to develop a novel facial recognition system that outperforms other well-established methods.

DOI

https://doi.org/10.31274/etd-180810-285

Copyright Owner

Sijia Liu

Language

en

Date Available

2012-04-06

File Format

application/pdf

File Size

130 pages

Included in

Mathematics Commons
