Date of Award
Doctor of Philosophy
Mack C. Shelley, II
This thesis examines several modeling approaches that can be used to
evaluate large education data sets. In education research, it is common to have
multiple sources of variation designed into the study. If these are ignored,
substantial bias can be introduced into the statistical model. We address this issue for
three different classes of models: classical linear mixed effects models, quantile regression,
and quantile regression for structural equation models. With the classical mixed effects model, we consider a
joint modeling approach to estimation and evaluate the effect of correlation on the estimation
of both fixed and random effects. For the structural equation models, we evaluate the
performance of a quantile regression imputation model. The other quantile regression model
uses the Asymmetric Laplace Distribution to incorporate random effects with estimation
performed using a Bayesian approach.
Multivariate mixed effects models can be used to simultaneously model several outcomes. We look at the effect of different correlations on the model estimation. In our simulations, all of the off-diagonal correlations were the same. However, the estimation allowed the correlation matrix to be unstructured. We looked at both a longitudinal and a hierarchical setting, where predictors and parameters are selected to mimic situations seen in education research. The simulation results show that the joint modeling approach does not outperform a univariate modeling approach. The estimates of the covariances and correlations are unbiased when only random intercepts are included in the model. When random slopes are also included, the random effect variances tend to be underestimated using the joint modeling approach. Estimates of the correlations between similar random effects are good, but estimates of the correlations between non-similar random effects exhibit severe bias.
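As a rough illustration of the simulation design described above, the following sketch generates correlated effects whose off-diagonal correlations are all equal and then estimates each pairwise correlation separately, as an unstructured estimate would. It is not the thesis's simulation code: the function names, the shared-factor construction, and the values of `k`, `rho`, and `n_units` are all assumptions chosen for illustration.

```python
import math
import random


def exchangeable_draw(k, rho, rng):
    """Draw k standard-normal effects sharing one pairwise correlation rho (0 <= rho < 1)."""
    shared = rng.gauss(0.0, 1.0)  # component common to all k effects
    return [
        math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        for _ in range(k)
    ]


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical settings: 3 outcomes, common correlation 0.5, 5000 units
rng = random.Random(14153)
k, rho, n_units = 3, 0.5, 5000
draws = [exchangeable_draw(k, rho, rng) for _ in range(n_units)]
columns = list(zip(*draws))

# Unstructured estimate: every pair of effects gets its own correlation
estimates = {
    (i, j): pearson(columns[i], columns[j])
    for i in range(k)
    for j in range(i + 1, k)
}
for pair, r in estimates.items():
    print(pair, round(r, 3))
```

With only random-intercept-like effects and this many units, each pairwise estimate should land near the common value `rho`, which is the easy case the abstract reports as unbiased.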
Missing response and covariate values are common issues in large-scale studies. We evaluate an imputation approach for quantile regression with recursive structural equations. In these models, the estimation of a structural effect is the primary concern. We apply an imputation approach that uses quantile regression to impute missing values. We provide simulations evaluating the estimation and 95\% coverage from this approach for both single-level and hierarchical data. Using this imputation approach for a recursive structural equation model, we provide an application studying the effect of selected quantiles of economic, social, and cultural status (ESCS) on selected quantiles of student test scores in mathematics, reading, and science from the PISA 2012 survey. Our findings show that when the rate of missingness is low ($\sim$10\%), the approach produces unbiased results with good coverage. When the rate of missingness is high ($\sim$40\%), the estimates show large bias and poor coverage. For the PISA 2012 application, the rate of missingness in the selected variables is low, leading us to believe that the estimates are valid.
There is a dearth of quantile regression extensions to a mixed effects setting. In this paper, we consider a Bayesian approach using the asymmetric Laplace distribution (ALD). The loss function minimized in simple quantile regression is part of the kernel of the ALD. Further, the ALD can be represented as a mixture of normal and exponential distributions. Using this representation, conjugate prior distributions can be selected, enabling straightforward Gibbs sampling. Using the ALD, we model data from a large education study evaluating the impact of an intervention on critical thinking skills. We present two different models: a two-level model for all students and a two-level model for special education students. For the special education student model, we incorporate a quantile regression model for each level. This allows us to evaluate the impacts of the intervention in the tails of the student level and school level. For all the students, our results show that there is a significant impact from the intervention on the lower achieving students. For the special education students, the intervention effect is not significant, but the point estimates are mostly negative.
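The link between the quantile regression loss and the ALD, and the normal-exponential mixture, can be sketched numerically. The code below is an illustration under stated assumptions, not the thesis's implementation: it uses the standard check loss and one commonly used parameterization of the mixture, in which a rate-1 exponential weight scales a normal term, and the function names and parameter values are hypothetical.

```python
import math
import random


def check_loss(u, p):
    """Quantile regression check loss rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (1.0 if u < 0 else 0.0))


def ald_log_kernel(y, mu, sigma, p):
    """Log kernel of the ALD: minus the check loss of the scaled residual."""
    return -check_loss((y - mu) / sigma, p)


def sample_ald(mu, sigma, p, rng):
    """Draw one ALD(mu, sigma, p) value via the normal-exponential mixture."""
    theta = (1.0 - 2.0 * p) / (p * (1.0 - p))
    tau = math.sqrt(2.0 / (p * (1.0 - p)))
    z = rng.expovariate(1.0)  # latent exponential weight, rate 1
    u = rng.gauss(0.0, 1.0)   # standard normal draw
    return mu + sigma * (theta * z + tau * math.sqrt(z) * u)


# Hypothetical settings: target the 25th percentile at location mu = 1
rng = random.Random(2014)
p, mu, sigma = 0.25, 1.0, 2.0
draws = [sample_ald(mu, sigma, p, rng) for _ in range(20000)]
share_below = sum(d <= mu for d in draws) / len(draws)
print(round(share_below, 3))  # should be close to p
```

Conditional on the latent weight `z`, each draw is normal, which is what makes conjugate priors and Gibbs updates convenient. The share of draws at or below `mu` should be close to `p`, confirming that the location parameter is the `p`-th quantile.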
Luke Karsten Fostvedt
Fostvedt, Luke Karsten, "Mixed effects modeling with missing data using quantile regression and joint modeling" (2014). Graduate Theses and Dissertations. 14153.