Evaluation of Ensemble Methods in Imbalanced Regression Tasks
Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 74:129-140, 2017.
Ensemble methods are well known for providing an advantage over single models in a large range of data mining and machine learning tasks. Their benefits are commonly associated with the ability to reduce the bias and/or variance in learning tasks. Ensembles have been studied for both classification and regression tasks with uniform domain preferences. However, these methods have been thoroughly studied only for imbalanced classification. In this paper we present an empirical study of the predictive ability of the ensemble methods bagging and boosting in regression tasks, using 20 data sets with imbalanced distributions and assuming non-uniform domain preferences. Results show that ensemble methods are capable of improving predictive ability towards under-represented values, and that this improvement affects the models' predictive ability on the average behaviour of the data. Results also show that smaller data sets tend to show larger improvements in predictive accuracy, and that no conclusion can be drawn from the percentage of rare cases alone.
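As an illustration of the kind of comparison described above, the following is a minimal sketch of bagging for regression, with error measured separately on all cases and on rare cases. It is not the experimental setup of the paper: the 1-nearest-neighbour base learner, the synthetic data, the threshold-based definition of "rare" (`rare_subset`), and all function names are assumptions made for this example only.

```python
import random
from statistics import mean


def fit_1nn(xs, ys):
    """Base learner (assumed for illustration): 1-nearest-neighbour on 1-D inputs."""
    pairs = list(zip(xs, ys))

    def predict(x):
        # Return the target of the closest training point.
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict


def bagging(xs, ys, n_models=25, seed=0):
    """Fit n_models base learners on bootstrap samples and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        return mean(m(x) for m in models)

    return predict, models


def mse(model, xs, ys):
    """Mean squared error of a model (a callable) on the given cases."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def rare_subset(xs, ys, threshold):
    """Hypothetical notion of rarity: cases whose target exceeds a threshold."""
    kept = [(x, y) for x, y in zip(xs, ys) if y > threshold]
    return [x for x, _ in kept], [y for _, y in kept]


def make_data(n, rng):
    """Synthetic imbalanced target: mostly near 0, with a few large values."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = (20 * (x - 9) if x > 9 else 0.0) + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(42)
    train_x, train_y = make_data(200, rng)
    test_x, test_y = make_data(200, rng)

    single = fit_1nn(train_x, train_y)
    bagged, _ = bagging(train_x, train_y)

    print("MSE, all cases:  single", mse(single, test_x, test_y),
          "bagged", mse(bagged, test_x, test_y))

    rare_x, rare_y = rare_subset(test_x, test_y, threshold=5.0)
    if rare_x:  # the rare subset may be empty for some samples
        print("MSE, rare cases: single", mse(single, rare_x, rare_y),
              "bagged", mse(bagged, rare_x, rare_y))
```

Reporting the two error figures side by side makes it possible to see whether a gain on rare cases comes with a change in error on the bulk of the data. A fixed threshold is only a crude stand-in for non-uniform domain preferences.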