Publication Date


Series Number

Preprint # -1997-31


Risk bounds are derived for regression estimation based on model selection over an unrestricted number of models. While a large list of models provides more flexibility, significant selection bias may occur with model selection criteria like AIC. We incorporate a model complexity penalty term in AIC to handle selection bias. The resulting estimators are shown to achieve a trade-off among approximation error, estimation error, and model complexity without prior knowledge about the true regression function. We demonstrate the adaptability of these estimators over full and sparse approximation function classes with different smoothness. For high-dimensional function estimation by tensor product splines, we show that, with the number of knots and the spline order adaptively selected, the least squares estimator converges at anticipated rates simultaneously for Sobolev classes with different interaction orders and smoothness parameters.
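As a rough illustration of adding a complexity term to an AIC-type criterion, the sketch below selects a subset of regressors by least squares. It is not the paper's criterion: the score `rss / sigma2 + 2 * size + lam * complexity`, the choice of `complexity` as the log of the number of subsets of a given size plus `log(p + 1)`, and the names `penalized_subset_selection`, `sigma2`, and `lam` are all assumptions made for this example, with the noise variance `sigma2` taken as known.

```python
import itertools
import math

import numpy as np


def penalized_subset_selection(design, y, sigma2, lam=1.0):
    """Exhaustive subset selection with an AIC-type score plus a complexity term.

    Illustrative score (an assumption, not the paper's exact criterion):
        rss / sigma2 + 2 * size + lam * complexity
    where complexity grows with the number of candidate models of the same size.
    """
    n_cols = design.shape[1]
    best_score, best_subset = math.inf, ()
    for size in range(n_cols + 1):
        # Assumed complexity: log(#subsets of this size) + log(#possible sizes).
        complexity = math.log(math.comb(n_cols, size)) + math.log(n_cols + 1)
        for subset in itertools.combinations(range(n_cols), size):
            if size == 0:
                # Empty model: the fitted function is identically zero.
                rss = float(np.sum(y ** 2))
            else:
                cols = design[:, list(subset)]
                coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
                rss = float(np.sum((y - cols @ coef) ** 2))
            score = rss / sigma2 + 2 * size + lam * complexity
            if score < best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score
```

With `lam=0` the score reduces to a plain AIC-type criterion with known variance; a positive `lam` charges more for model sizes that have many competing candidates, which is where selection bias would be expected to be largest.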


This preprint was published as Yuhong Yang, "Model Selection for Nonparametric Regression", Statistica Sinica (1999): 475-499.