Mixed effects modeling with missing data using quantile regression and joint modeling

Date
2014-01-01
Authors
Fostvedt, Luke
Major Professor
Advisor
Mack C. Shelley, II
Department
Statistics
Abstract

This thesis examines several modeling approaches that can be used to evaluate large education data sets. In education research, it is common to have multiple sources of variation designed into the study. If these are ignored, substantial bias can be introduced into the statistical model. We address this issue for three different classes of models: classical linear mixed effects models, quantile regression, and quantile regression of structural equation models. With the classical mixed effects model, we consider a joint modeling approach to estimation and evaluate the effect of correlation on the estimation of both fixed and random effects. For the structural equation models, we evaluate the performance of a quantile regression imputation model. The other quantile regression model uses the asymmetric Laplace distribution to incorporate random effects, with estimation performed using a Bayesian approach.

Multivariate mixed effects models can be used to simultaneously model several outcomes. We look at the effect of different correlations on the model estimation. In our simulations, all of the off-diagonal correlations were the same; however, the estimation allowed the correlation matrix to be unstructured. We considered both a longitudinal and a hierarchical setting, with predictors and parameters selected to mimic situations seen in education research. The simulation results show that the joint modeling approach does not outperform a univariate modeling approach. The estimates of the covariances and correlations are unbiased when only random intercepts are included in the model. When random slopes are also included, the random effect variances tend to be underestimated using the joint modeling approach. Estimates of the correlations between similar random effects are good, but estimates of the correlations between non-similar random effects exhibit severe bias.

Missing response and covariate values are common issues in large-scale studies. We evaluate an imputation approach for quantile regression with recursive structural equations. In these models, the estimation of a structural effect is the primary concern. We apply an imputation approach that uses quantile regression to impute missing values. We provide simulations evaluating the estimation and 95% coverage from this approach for both single-level and hierarchical data. Using this imputation approach for a recursive structural equation model, we provide an application studying the effect of selected quantiles of economic, social, and cultural status (ESCS) on selected quantiles of student test scores in mathematics, reading, and science from the PISA 2012 survey. Our findings show that when the rate of missingness is low (approximately 10%), the approach produces unbiased results with good coverage. When the rate of missingness is high (approximately 40%), the estimates show large bias and poor coverage. For the PISA 2012 application, the rate of missingness in the selected variables is low, leading us to believe that the estimates are valid.

There is a dearth of quantile regression extensions to a mixed effects setting. In this paper we consider a Bayesian approach using the asymmetric Laplace distribution (ALD). The loss function minimized in simple quantile regression is part of the kernel of the ALD. Further, the ALD can be represented as a mixture of normal and exponential distributions. Using this representation, conjugate prior distributions can be selected, enabling straightforward Gibbs sampling. Using the ALD, we model data from a large education study evaluating the impact of an intervention on critical thinking skills. We present two different models: a two-level model for all students and a two-level model for special education students. For the special education student model, we incorporate a quantile regression model for each level. This allows us to evaluate the impacts of the intervention in the tails of both the student-level and school-level distributions. For all the students, our results show that there is a significant impact from the intervention on the lower-achieving students. For the special education students, the intervention is not significant, but the point estimates are mostly negative.
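
The link between the quantile regression loss and the ALD, and the normal-exponential mixture representation, can be illustrated with a short numerical sketch. It is not taken from the thesis: the function names, the unit scale parameter, and the chosen values of `p` and `mu` are assumptions made for illustration, and the mixture constants `theta` and `tau` follow the parameterization commonly used in the Bayesian quantile regression literature. The sketch draws from the mixture and checks that the location `mu` behaves as the `p`-th quantile, which is the property that makes the ALD a working likelihood for quantile regression.

```python
import numpy as np


def check_loss(u, p):
    """Quantile regression loss: rho_p(u) = u * (p - 1[u < 0])."""
    u = np.asarray(u, dtype=float)
    return u * (p - (u < 0))


def sample_ald_mixture(n, mu, p, rng):
    """Draw n values from an ALD(mu, scale=1, p) via its mixture form.

    Assumed representation (unit scale):
        y = mu + theta * z + tau * sqrt(z) * u,
        z ~ Exponential(mean 1), u ~ Normal(0, 1).
    """
    theta = (1 - 2 * p) / (p * (1 - p))
    tau = np.sqrt(2 / (p * (1 - p)))
    z = rng.exponential(1.0, size=n)   # exponential mixing variable
    u = rng.standard_normal(n)         # normal component
    return mu + theta * z + tau * np.sqrt(z) * u


# Illustrative values only (not from the thesis).
rng = np.random.default_rng(0)
p, mu = 0.25, 1.0
y = sample_ald_mixture(200_000, mu, p, rng)

# mu should sit at the p-th quantile of the draws.
frac_below = np.mean(y <= mu)

# The average check loss should be smaller at mu than at nearby values.
loss_at_mu = check_loss(y - mu, p).mean()
loss_left = check_loss(y - (mu - 0.5), p).mean()
loss_right = check_loss(y - (mu + 0.5), p).mean()

print(f"fraction of draws <= mu: {frac_below:.3f}")
print(f"mean check loss at mu-0.5, mu, mu+0.5: "
      f"{loss_left:.3f}, {loss_at_mu:.3f}, {loss_right:.3f}")
```

Conditional on the exponential variable `z`, the draw `y` is normal. That conditional normality is what allows conjugate priors and Gibbs updates in a model built on this representation.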

Copyright
Wed Jan 01 00:00:00 UTC 2014