Topics on high dimensional statistical inference and ANOVA for longitudinal data

Date
2011-01-01
Authors
Zhong, Pingshou
Major Professor
Advisor
Song X. Chen
Organizational Unit
Statistics
Department
Statistics
Abstract

The first part of this thesis proposes new tests for high dimensional data. Chapter 2 proposes a simultaneous test for high dimensional regression coefficients in linear models. The test assesses the significance of a large number of covariates simultaneously in the so-called "large p, small n" setting, where the conventional F-test is no longer applicable. We derive the asymptotic distribution of the proposed test statistic under the high dimensional null hypothesis and under various alternatives, which allows the power of the test to be evaluated. We further extend the results to linear models with factorial designs. We also evaluate the power of the F-test when the dimensionality is only mild. Chapter 3 considers a test for high dimensional means under sparsity and dependence. We propose a threshold test statistic designed to detect sparse and faint signals. Its asymptotic distribution is obtained for non-normal and dependent data in the "large p, small n" setting, where the data dimension can grow exponentially fast with the sample size. A maximum test, which maximizes the standardized threshold test statistic over a range of thresholds, is also proposed. The maximum test is shown to attain the optimal detection boundary, in the sense that, asymptotically, all tests are powerless below the boundary.
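
As a rough illustration of the thresholding idea described for Chapter 3, the following Python sketch sums only those squared marginal t-type statistics that exceed a level 2·s·log(p), and then maximizes a standardized version over a grid of s values. It is not the thesis's statistic: the function names, the threshold form, the grid of s values, and in particular the Monte Carlo standardization under independent standard normal data are all assumptions made here for illustration, whereas the thesis derives the standardization analytically for non-normal and dependent data.

```python
import numpy as np


def marginal_stats(x):
    """Squared one-sample t-type statistics, n * mean_j^2 / var_j, per coordinate."""
    n = x.shape[0]
    return n * x.mean(axis=0) ** 2 / x.var(axis=0, ddof=1)


def threshold_stat(t, s):
    """Sum of the marginal statistics that exceed the level 2 * s * log(p)."""
    p = t.shape[0]
    lam = 2.0 * s * np.log(p)
    return float(np.sum(t[t > lam]))


def max_threshold_stat(x, s_grid, n_null=200, seed=0):
    """Maximum over s_grid of the standardized thresholded statistic.

    The null mean and standard deviation are estimated by simulation from
    independent standard normal data. This is an illustrative assumption,
    not the analytic standardization used in the thesis.
    """
    rng = np.random.default_rng(seed)
    n, p = x.shape
    null = np.empty((n_null, len(s_grid)))
    for b in range(n_null):
        t_null = marginal_stats(rng.standard_normal((n, p)))
        null[b] = [threshold_stat(t_null, s) for s in s_grid]
    mu = null.mean(axis=0)
    sd = np.maximum(null.std(axis=0, ddof=1), 1e-12)  # guard against zero spread
    t_obs = marginal_stats(x)
    obs = np.array([threshold_stat(t_obs, s) for s in s_grid])
    z = (obs - mu) / sd
    return float(z.max()), z
```

Larger values of the returned maximum point away from the null hypothesis of zero means; a reference distribution for the maximum itself would still be needed to turn it into a formal test.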

The second part of this thesis is on analysis of variance (ANOVA) tests for treatment effects in longitudinal data with missing values. The treatment effects are modelled semiparametrically via a partially linear regression, which is flexible in quantifying the time effects of treatments. Empirical likelihood is employed to formulate model-robust nonparametric ANOVA tests for treatment effects with respect to covariates, the nonparametric time-effect functions, and interactions between covariates and time. The proposed tests can be readily modified for a variety of data and model combinations, encompassing parametric, semiparametric and nonparametric regression models, cross-sectional and longitudinal data, and data with or without missing values.
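
For readers unfamiliar with empirical likelihood, the following Python sketch computes the standard empirical likelihood ratio statistic for a scalar mean. It only illustrates the basic building block: the ANOVA tests in the thesis involve partially linear models, longitudinal data and missing values, none of which are handled here. The function name and the bisection solver are choices made for this illustration.

```python
import numpy as np


def el_log_ratio(x, mu):
    """-2 log empirical likelihood ratio for the hypothesis that the mean equals mu.

    Solves sum_i z_i / (1 + lam * z_i) = 0 with z_i = x_i - mu by bisection,
    then returns 2 * sum_i log(1 + lam * z_i).
    """
    z = np.asarray(x, dtype=float) - mu
    # mu must lie strictly inside the range of the data; otherwise the
    # empirical likelihood ratio is zero and the statistic is infinite.
    if z.min() >= 0 or z.max() <= 0:
        return float("inf")
    # lam must keep every weight positive: 1 + lam * z_i > 0 for all i.
    lo = -1.0 / z.max()
    hi = -1.0 / z.min()
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    # The estimating function is strictly decreasing in lam on (lo, hi).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.sum(z / (1.0 + mid * z)) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return float(2.0 * np.sum(np.log1p(lam * z)))
```

The statistic is zero at the sample mean and grows as mu moves away from it; for independent observations it is compared with a chi-square distribution with one degree of freedom.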

Copyright
2011