An efficient algorithm for kernel K-means

Berlinski, Joshua

File

berlinski_cc_lib.pdf (2.57 MB)

Date

2021-01-01

Authors

Berlinski, Joshua

Major Professor

Ranjan Maitra

Department

Statistics

Abstract

Kernel K-means extends the standard K-means clustering method to identify non-spherical clusters by running the algorithm in a higher-dimensional feature space. This extension is typically implemented with a method based on Lloyd's heuristic. This work presents a method based on Hartigan and Wong's heuristic, which reduces the run time required to reach the final clustering. It also discusses methods for selecting the number of clusters and the tuning parameter of the Gaussian kernel, and presents an adaptation of the K-means++ initialization method. Each method is evaluated and compared on fourteen synthetic data sets, demonstrating the advantages of the proposed clustering method as well as the limitations of the adapted parameter selection methods.
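For orientation, the Lloyd-style baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the Hartigan and Wong based method the work proposes. The function names, the bandwidth parameterization exp(-||x - y||^2 / (2 sigma^2)), and the handling of empty clusters are assumptions made for this example. The sketch relies on the standard identity that the squared feature-space distance from a point to a cluster mean can be computed from the Gram matrix alone.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    The bandwidth parameterization is an assumption of this sketch."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def kernel_kmeans_lloyd(K, n_clusters, init_labels, max_iter=100):
    """Lloyd-style kernel K-means on a precomputed Gram matrix K.
    Hypothetical helper: reassigns all points at once until labels stop changing."""
    n = K.shape[0]
    labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for k in range(n_clusters):
            members = labels == k
            size = members.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - m_k||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl
            dist[:, k] = (diag
                          - 2.0 * K[:, members].sum(axis=1) / size
                          + K[np.ix_(members, members)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

A typical call would be `kernel_kmeans_lloyd(gaussian_kernel(X, sigma), n_clusters, init_labels)`, where `init_labels` comes from any initialization scheme. A Hartigan and Wong style variant would instead consider moving one observation at a time, which this sketch does not attempt.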

Copyright

2021

Collections

Creative Components
