Degree Type

Dissertation

Date of Award

2011

Degree Name

Doctor of Philosophy

Department

Statistics

First Advisor

Song X. Chen

Abstract

The first part of this thesis proposes new tests for high dimensional data. Chapter 2 proposes a high dimensional simultaneous test for regression coefficients in a linear model. This test assesses the significance of a large number of covariates simultaneously in so-called "large p, small n" situations, where the conventional F-test is no longer applicable. We derive the asymptotic distribution of the proposed test statistic under the high dimensional null hypothesis and under various alternatives, which allows power evaluation. We further extend the result to linear models with factorial designs. We also evaluate the power of the F-test under very mild dimensionality. Chapter 3 considers a test for high dimensional means under sparsity and dependency. We propose a threshold test statistic, which is designed to detect sparse and faint signals. The asymptotic distribution is obtained for non-normal and dependent data under the "large p, small n" setting, where the data dimension can grow exponentially fast as the sample size grows. A maximum test, which maximizes the standardized threshold test statistic over a range of thresholds, is also proposed. It is shown that the maximum test can attain the optimal detection boundary, in the sense that, asymptotically, all tests are powerless below the boundary.
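
As a rough illustration of the thresholding idea described for Chapter 3, the sketch below sums per-coordinate standardized squared sample means that exceed a threshold. It is not the thesis's test statistic: the function name `threshold_statistic`, the threshold form 2 s log p, and the use of sample variances for standardization are assumptions made here for illustration only, and the centering and scaling needed for an actual test are omitted.

```python
import numpy as np


def threshold_statistic(x, s):
    """Sum the coordinate-wise statistics that exceed the threshold 2*s*log(p).

    x : (n, p) array, n observations of a p-dimensional vector.
    s : positive number controlling the threshold level (assumed form).
    Hypothetical helper for illustration; not the estimator from the thesis.
    """
    n, p = x.shape
    mean = x.mean(axis=0)               # sample mean of each coordinate
    var = x.var(axis=0, ddof=1)         # unbiased sample variance of each coordinate
    t = n * mean ** 2 / var             # standardized squared mean per coordinate
    lam = 2.0 * s * np.log(p)           # assumed threshold level
    return float(np.sum(t[t > lam]))    # keep only coordinates above the threshold


# Small deterministic example: 4 observations, 3 coordinates.
x = np.array([[1.0, 3.0, 0.0],
              [-1.0, 5.0, 1.0],
              [1.0, 3.0, 0.0],
              [-1.0, 5.0, 1.0]])
low = threshold_statistic(x, 0.5)    # lower threshold keeps more coordinates
high = threshold_statistic(x, 2.0)   # higher threshold keeps fewer coordinates
```

Raising `s` discards coordinates with weak evidence, which is the sense in which thresholding targets sparse signals. Maximizing over thresholds, as the abstract describes, would additionally require standardizing each thresholded sum by its null mean and standard deviation, which this sketch does not attempt.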

The second part of this thesis concerns analysis of variance (ANOVA) tests for treatment effects in longitudinal data with missing values. The treatment effects are modelled semiparametrically via a partially linear regression, which is flexible in quantifying the time effects of treatments. The empirical likelihood is employed to formulate model-robust nonparametric ANOVA tests for treatment effects with respect to covariates, the nonparametric time-effect functions, and interactions between covariates and time. The proposed tests can be readily modified for a variety of data and model combinations, encompassing parametric, semiparametric and nonparametric regression models, cross-sectional and longitudinal data, and data with or without missing values.

DOI

https://doi.org/10.31274/etd-180810-224

Copyright Owner

Pingshou Zhong

Language

en

Date Available

2012-04-30

File Format

application/pdf

File Size

188 pages
