Doctor of Philosophy
This dissertation is a collection of three papers on the development of statistical methods for variable selection in ultra-high dimensional functional linear models. The papers are motivated by functional genome-wide association studies (fGWAS). In the first paper, we consider a class of concurrent functional linear models in which the responses are phenotypes measured repeatedly over time and hence modeled as functional data. The model includes the effects of functional environmental covariates, ultra-high dimensional genetic covariates, and their interactions. We approximate the coefficient functions using B-splines and propose two forward selection procedures based on weighted least squares to select significant main and interaction effects. We treat sparse and dense functional data in a unified framework and propose a functional Bayesian Information Criterion (fBIC) as the stopping rule. We show that the proposed methods achieve model selection consistency in the fixed dimensional case and the sure screening property in the ultra-high dimensional case. Simulation studies illustrate that the proposed forward selection procedures perform well under both sparse and dense functional settings. An analysis of a dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI) demonstrates the application of the proposed methodologies to fGWAS.
The second paper extends the proposed forward selection procedures to the ultra-high dimensional partially functional linear model. In the third paper, we develop the ifFS package in R. This package implements the two forward selection procedures for the concurrent functional linear model and the partially functional linear model.
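The general shape of such a forward selection procedure can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm and not the interface of the ifFS package. It assumes a simplified concurrent model with a functional intercept and genetic main effects only (no environmental covariates or interactions), uses a generic BIC of the form N log(RSS/N) + (number of parameters) log(N) as a stand-in for the fBIC, and defaults to unit weights. The names `bspline_basis`, `fit_rss`, and `forward_select`, and the simulated data, are all hypothetical.

```python
import numpy as np


def bspline_basis(t, n_basis=6, degree=3):
    """Evaluate a clamped B-spline basis on [0, 1] via the Cox-de Boor recursion."""
    t = np.asarray(t, dtype=float)
    inner = np.linspace(0.0, 1.0, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), inner, np.ones(degree + 1)])
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.zeros((t.size, len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Assign t == 1 to the last non-empty interval so the basis is defined there.
    B[t == 1.0, n_basis - 1] = 1.0
    for d in range(1, degree + 1):
        B_next = np.zeros((t.size, len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left_den = knots[j + d] - knots[j]
            right_den = knots[j + d + 1] - knots[j + 1]
            if left_den > 0:
                B_next[:, j] += (t - knots[j]) / left_den * B[:, j]
            if right_den > 0:
                B_next[:, j] += (knots[j + d + 1] - t) / right_den * B[:, j + 1]
        B = B_next
    return B


def fit_rss(y, design, w):
    """Weighted least squares fit; returns the weighted residual sum of squares."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    resid = (y - design @ coef) * sw
    return float(resid @ resid)


def forward_select(y, t, X, weights=None, n_basis=6, max_steps=10):
    """Greedy forward selection of varying-coefficient terms x_k * beta_k(t).

    y, t : length-N arrays of stacked responses and observation times in [0, 1]
    X    : (N, p) covariate matrix, each subject's row repeated once per visit
    """
    N, p = X.shape
    w = np.ones(N) if weights is None else np.asarray(weights, dtype=float)
    B = bspline_basis(t, n_basis)
    blocks = [X[:, [k]] * B for k in range(p)]  # spline-expanded covariates
    design = B.copy()  # the functional intercept is always in the model
    selected = []
    best_bic = N * np.log(fit_rss(y, design, w) / N) + design.shape[1] * np.log(N)
    for _ in range(min(max_steps, p)):
        candidates = [k for k in range(p) if k not in selected]
        rss = {k: fit_rss(y, np.hstack([design, blocks[k]]), w) for k in candidates}
        k_best = min(rss, key=rss.get)  # candidate giving the smallest RSS
        n_par = design.shape[1] + n_basis
        bic = N * np.log(rss[k_best] / N) + n_par * np.log(N)
        if bic >= best_bic:  # stop once the criterion no longer improves
            break
        selected.append(k_best)
        design = np.hstack([design, blocks[k_best]])
        best_bic = bic
    return selected


# Simulated example (hypothetical data): 100 subjects, 5 visits each, 150 SNP-like
# covariates, of which only columns 3 and 17 carry a time-varying effect.
rng = np.random.default_rng(0)
n_subj, n_visits, p = 100, 5, 150
t = rng.uniform(0.0, 1.0, n_subj * n_visits)
X = np.repeat(rng.integers(0, 3, size=(n_subj, p)).astype(float), n_visits, axis=0)
y = (1.0 + X[:, 3] * (2.0 + np.sin(2 * np.pi * t)) + X[:, 17] * (-3.0 + 2.0 * t)
     + rng.normal(0.0, 0.5, t.size))
selected = forward_select(y, t, X)
print(selected)
```

In this simulated setting the two signal-carrying columns should be among the first selected, although the exact stopping point depends on the criterion used. Passing per-observation `weights` turns the ordinary fit into a weighted one; how the dissertation chooses its weights for sparse versus dense designs is not specified in this abstract, so that choice is left to the caller here.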
Rodrigo Plazola Ortiz
Plazola Ortiz, Rodrigo, "Interaction forward selection in ultra-high-dimensional functional linear models" (2020). Graduate Theses and Dissertations. 18057.
Available for download on Thursday, June 16, 2022