Campus Units

Statistics

Document Type

Conference Proceeding

Conference

2017 Joint Statistical Meetings

Publication Version

Published Version

Publication Date

2017

Journal or Book Title

JSM Proceedings

First Page

636

Last Page

647

Conference Title

Statistics: It's Essential

Conference Date

July 29-August 3, 2017

City

Baltimore, Maryland

Abstract

Random forest (RF) methodology is one of the most popular machine learning techniques for prediction problems. In this article, we discuss some cases where random forests may perform poorly and propose a novel generalized RF method, namely regression-enhanced random forests (RERFs), that can improve on RFs by borrowing the strength of penalized parametric regression. The algorithm for constructing RERFs and selecting their tuning parameters is described. Both a simulation study and real data examples show that RERFs have better predictive performance than RFs in important situations often encountered in practice. Moreover, RERFs may incorporate known relationships between the response and the predictors, and may give reliable predictions in extrapolation problems where predictions are required at points outside the domain of the training dataset. Strategies analogous to those described here can be used to improve other machine learning methods via combination with penalized parametric regression techniques.
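
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "borrowing the strength of penalized parametric regression": fit a Lasso-penalized linear model first, then fit a tree ensemble to its residuals, and add the two predictions. Everything here is an assumption made for illustration, not the authors' procedure: the function names, the single predictor, the small bagged regression trees standing in for a full random forest, the toy data, and the fixed penalty `lam` (the paper's tuning-parameter selection is not reproduced).

```python
import math
import random

def fit_lasso_1d(x, y, lam):
    """Lasso with one predictor and an unpenalized intercept.
    Minimizes (1/2n) * sum (y - a - b*x)^2 + lam * |b| by soft-thresholding."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    shrunk = max(abs(sxy) - lam, 0.0)
    slope = math.copysign(shrunk, sxy) / sxx
    return my - slope * mx, slope

def fit_tree(x, r, depth):
    """Small regression tree: a leaf is a float mean, a node is (threshold, left, right)."""
    n = len(x)
    mean = sum(r) / n
    if depth == 0 or n < 4:
        return mean
    pairs = sorted(zip(x, r))
    total = sum(r)
    best_gain, best_i, left = None, None, 0.0
    for i in range(1, n):
        left += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        # Maximizing this quantity is equivalent to minimizing squared error.
        gain = left * left / i + (total - left) ** 2 / (n - i)
        if best_gain is None or gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None:
        return mean
    thr = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    lo, hi = pairs[:best_i], pairs[best_i:]
    return (thr,
            fit_tree([p[0] for p in lo], [p[1] for p in lo], depth - 1),
            fit_tree([p[0] for p in hi], [p[1] for p in hi], depth - 1))

def predict_tree(tree, x0):
    while isinstance(tree, tuple):
        thr, lo, hi = tree
        tree = lo if x0 <= thr else hi
    return tree

def fit_forest(x, r, n_trees, depth, rng):
    """Bagging: each tree is grown on a bootstrap resample of the data."""
    n = len(x)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_tree([x[i] for i in idx], [r[i] for i in idx], depth))
    return trees

def predict_forest(trees, x0):
    return sum(predict_tree(t, x0) for t in trees) / len(trees)

# Toy data (hypothetical): linear trend plus a small wiggle, observed on [0, 1].
rng = random.Random(0)
f = lambda t: 2 + 3 * t + 0.3 * math.sin(20 * t)
x = [rng.random() for _ in range(200)]
y = [f(t) + rng.gauss(0, 0.1) for t in x]

# Plain tree ensemble fitted directly to the response.
rf = fit_forest(x, y, 50, 3, rng)

# Two-step fit: penalized linear model, then a tree ensemble on its residuals.
intercept, slope = fit_lasso_1d(x, y, lam=0.01)
resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
rf_resid = fit_forest(x, resid, 50, 3, rng)

x_new = 2.0  # a point outside the training domain [0, 1]
truth = f(x_new)
rf_pred = predict_forest(rf, x_new)
rerf_pred = intercept + slope * x_new + predict_forest(rf_resid, x_new)
print(f"truth={truth:.2f}  trees only={rf_pred:.2f}  two-step={rerf_pred:.2f}")
```

In this sketch the tree-only prediction at `x_new` cannot exceed the largest training response, because every leaf is an average of observed values, while the two-step prediction can follow the fitted linear trend beyond the training range. This is the kind of extrapolation behaviour the abstract refers to.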

Comments

This proceeding is published as Zhang, H., Nettleton, D., Zhu, Z. (2017). Regression-enhanced random forests. In JSM Proceedings, Section on Statistical Learning and Data Science. Alexandria, VA: American Statistical Association. 636–647. Posted with permission.

Copyright Owner

American Statistical Association

Language

en

File Format

application/pdf
