Combining generalized linear models

Date
2005-01-01
Authors
Chen, Lihua
Major Professor
Advisor
Yuhong Yang
Department
Statistics
Abstract

Traditional data analysis techniques that depend on the selection of a model are vulnerable to model uncertainty. This thesis establishes some statistical properties of an alternative to model selection, a model combining method called Adaptive Regression by Mixing (ARM). This work implements and extensively studies ARM in the context of generalized linear models including ANOVA, loglinear and survival models.

We have found applications for the general idea of model combining in each of the three settings, and have derived the theoretical risk bound of the combined estimator in each.

In addition to demonstrating good theoretical properties and the empirical advantage of ARM in applications in all three settings, we have addressed specific issues and challenges posed by each setting. In combining loglinear models, we demonstrate how to apply ARM in a capture-recapture study and propose an approach to selecting a model list for combining given a high-dimensional contingency table. In survival analysis, we empirically study combining different model classes. We also explore several measures to assess the predictive performance of a survival model. In the ANOVA setting, we propose model instability measures as a guide to the appropriateness of model combining in applications. We further systematically investigate the relationship between ARM performance and the underlying model structure. We propose an approach to assessing the importance of factors based on the combined estimates.

Finally, to address general computational issues, we have empirically explored the number of permutations needed to produce stabilized model weights and the relationship between ARM risk and the proportions used in the data-splitting step of the algorithm. The results are largely consistent with our theoretical expectations.
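
The data-splitting and permutation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the simpler Gaussian linear regression setting rather than the loglinear, survival, or ANOVA settings studied in the thesis, and the function names (`arm_weights`, `arm_predict`), the candidate models given as column subsets, and the default parameter values are all hypothetical choices made for illustration. Each candidate model is fitted on one part of the data, scored by its Gaussian predictive likelihood on the other part, and the resulting normalized weights are averaged over random permutations.

```python
import numpy as np

def arm_weights(X, y, model_columns, n_permutations=50, train_fraction=0.5, seed=0):
    """ARM-style weights for Gaussian linear models (illustrative assumption).

    model_columns: list of column-index lists, one per candidate model.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_fraction * n))
    weights = np.zeros(len(model_columns))
    for _ in range(n_permutations):
        # Randomly permute the data, then split into fitting and evaluation parts.
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        log_lik = np.empty(len(model_columns))
        for j, cols in enumerate(model_columns):
            X_train, X_test = X[train][:, cols], X[test][:, cols]
            # Fit the candidate model on the first part only.
            beta, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)
            resid_train = y[train] - X_train @ beta
            dof = max(n_train - len(cols), 1)
            sigma2 = max(resid_train @ resid_train / dof, 1e-12)
            # Score it by its Gaussian log-likelihood on the held-out part.
            resid_test = y[test] - X_test @ beta
            log_lik[j] = (-0.5 * len(test) * np.log(sigma2)
                          - 0.5 * (resid_test @ resid_test) / sigma2)
        # Normalize in a numerically stable way so the weights sum to one.
        w = np.exp(log_lik - log_lik.max())
        weights += w / w.sum()
    # Average over permutations to stabilize the weights.
    return weights / n_permutations

def arm_predict(X, y, X_new, model_columns, weights):
    """Combine full-data fits of the candidate models using the given weights."""
    preds = np.zeros(len(X_new))
    for w, cols in zip(weights, model_columns):
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        preds += w * (X_new[:, cols] @ beta)
    return preds
```

In this sketch, `n_permutations` and `train_fraction` correspond to the two computational choices the abstract says were explored empirically: how many permutations are needed for stable weights, and what proportion of the data to use in the splitting step.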

Copyright
2005