Degree Type
Dissertation
Date of Award
2015
Degree Name
Doctor of Philosophy
Department
Statistics
First Advisor
Stephen B. Vardeman
Second Advisor
Max D. Morris
Abstract
A variety of conditional probability models estimate the regression or class probability function for the purpose of prediction or classification. Bayesian mixture models provide flexible prediction and classification methods for modeling local linearities of the regression or class probability function. A hierarchical Bayes Gaussian mixture model is proposed that directly uses data to define a mixture prior for its Gaussian mixture component parameters.
This nonparametric Bayesian mixture model uses the stick-breaking construction of a Dirichlet process model. Prediction and classification come directly from the posterior distribution via Gibbs sampling. Comprehensive simulation studies demonstrate the performance of both the regression and classification methods. On five standard machine learning data sets, prediction and classification results are competitive with local methods.
A generic classification algorithm is outlined for categorical predictors. If too many categories are present or if many interaction levels affect the class probability function, no current methods can reduce bias effectively. A proposed solution is a generic way to characterize the information about the class probability function available in the predictors through likelihood ratio statistics. This proposed classifier relies on random forests to reduce bias by utilizing all information in the generated log likelihood ratio features. A simulation study and an application data set demonstrate potential advantages of this classification method for categorical predictors.
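For readers unfamiliar with the stick-breaking construction mentioned in the abstract, the following is a minimal illustrative sketch and not the dissertation's code. It assumes a truncation to a finite number of components and a concentration parameter `alpha`. Both values, and the function name `stick_breaking_weights`, are hypothetical choices made for this example.

```python
import random


def stick_breaking_weights(alpha, num_components, seed=0):
    """Draw mixture weights from a truncated stick-breaking construction.

    alpha and num_components are illustrative assumptions, not values
    taken from the dissertation.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(num_components):
        if k == num_components - 1:
            # Truncation: the last component takes whatever stick is left,
            # so the weights sum to one.
            v = 1.0
        else:
            # Break off a Beta(1, alpha) fraction of the remaining stick.
            v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


weights = stick_breaking_weights(alpha=2.0, num_components=20)
```

Smaller values of `alpha` tend to concentrate the weight on the first few components, while larger values spread it over more components.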
DOI
https://doi.org/10.31274/etd-180810-3955
Copyright Owner
Cory Lee Lanker
Copyright Date
2015
Language
en
File Format
application/pdf
File Size
96 pages
Recommended Citation
Lanker, Cory Lee, "Local prediction and classification techniques for machine learning and data mining" (2015). Graduate Theses and Dissertations. 14404.
https://lib.dr.iastate.edu/etd/14404