Degree Type

Creative Component

Semester of Graduation

Spring 2021

Department

Statistics

First Major Professor

Ranjan Maitra

Degree(s)

Master of Science (MS)

Major(s)

Statistics

Abstract

Kernel K-means extends the standard K-means clustering method to identify non-spherical clusters by performing the algorithm in a higher-dimensional feature space. Typically, this extension is implemented using a method based on Lloyd's heuristic. A method based on Hartigan and Wong's heuristic is presented here, which reduces the run time required to reach the final clustering. Additionally, methods for selecting the number of clusters and the tuning parameter for the Gaussian kernel are discussed, and an adaptation of the K-means++ initialization method is presented. Each method is evaluated and compared on fourteen synthetic data sets, demonstrating the advantages of the proposed clustering method as well as the limitations of the adapted parameter selection methods.

Copyright Owner

Berlinski, Joshua

File Format

PDF

Embargo Period (admin only)

4-23-2021

