Preprint # - 1998-13
We study the problem of adaptive estimation of a conditional probability function in a pattern recognition setting. In many applications, for greater flexibility, one may want to consider several estimation procedures targeted at different scenarios and/or based on different assumptions. For example, when the feature dimension is high, one may try to overcome the familiar curse of dimensionality by seeking a good parsimonious model among a number of candidates such as CART, neural nets, and additive models. In such a situation, one wishes to have an automated final procedure that performs as well as the best candidate. In this work, we propose a method to combine a countable collection of procedures for estimating the conditional probability. We show that the combined procedure has the property that its statistical risk is bounded above by that of any of the procedures being considered plus a small penalty. Thus, asymptotically, the strengths of the different estimation procedures are shared by the combined procedure. A simulation study shows the potential advantage of combining models over model selection.
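To make the idea of combining candidate estimates concrete, here is a minimal sketch of one generic mixing scheme: each candidate's estimate of P(Y = 1 | X) is weighted by the likelihood that candidate assigned to the labels seen so far. This is an illustration under stated assumptions and is not claimed to be the procedure analyzed in this work. The function names `combine_sequentially` and `_normalize`, the uniform prior, and the clipping constant `eps` are all hypothetical choices made for the example.

```python
import math


def _normalize(log_w):
    """Turn log-weights into weights that sum to 1 (stable log-sum-exp)."""
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def combine_sequentially(candidate_probs, labels, prior=None, eps=1e-12):
    """Mix candidate probability estimates with likelihood-based weights.

    candidate_probs[j][i] is candidate j's estimate of P(Y_i = 1 | X_i),
    assumed to have been produced without looking at the label Y_i.
    labels[i] is the observed label (0 or 1).
    Returns (combined estimates, final weights).
    """
    m = len(candidate_probs)
    # Assumption: uniform prior weights unless the caller supplies others.
    prior = prior if prior is not None else [1.0 / m] * m
    log_w = [math.log(p) for p in prior]

    combined = []
    for i, y in enumerate(labels):
        # Predict with the current weights before the label is used.
        w = _normalize(log_w)
        combined.append(sum(w[j] * candidate_probs[j][i] for j in range(m)))
        # Reward each candidate by the likelihood it gave to the label.
        for j in range(m):
            p = min(max(candidate_probs[j][i], eps), 1.0 - eps)
            log_w[j] += math.log(p if y == 1 else 1.0 - p)

    return combined, _normalize(log_w)


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0]
    sharp = [0.9 if y == 1 else 0.1 for y in labels]  # informative candidate
    flat = [0.5] * len(labels)                        # uninformative candidate
    estimates, weights = combine_sequentially([sharp, flat], labels)
    print(estimates)
    print(weights)
```

In this toy run the weight shifts toward the informative candidate as labels accumulate, which is the sense in which a combined estimate can track the better candidate without selecting a single model outright.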