Evaluation of Ensemble Methods in Imbalanced Regression Tasks

Nuno Moniz, Paula Branco, Luís Torgo
Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 74:129-140, 2017.

Abstract

Ensemble methods are well known for providing an advantage over single models in a large range of data mining and machine learning tasks. Their benefits are commonly associated with the ability to reduce the bias and/or variance in learning tasks. Ensembles have been studied for both classification and regression tasks with uniform domain preferences. However, for imbalanced domains, these methods have been thoroughly studied only in classification. In this paper we present an empirical study of the predictive ability of the ensemble methods bagging and boosting in regression tasks, using 20 data sets with imbalanced distributions and assuming non-uniform domain preferences. Results show that ensemble methods are capable of providing improvements in predictive ability towards under-represented values, and that this improvement influences the predictive ability of models concerning the average behaviour of the data. Results also show that the smaller data sets are prone to larger improvements in predictive accuracy and that no conclusion could be drawn when considering the percentage of rare cases alone.

Cite this Paper


BibTeX
@InProceedings{pmlr-v74-moniz17a,
  title = {Evaluation of Ensemble Methods in Imbalanced Regression Tasks},
  author = {Moniz, Nuno and Branco, Paula and Torgo, Luís},
  booktitle = {Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications},
  pages = {129--140},
  year = {2017},
  editor = {Torgo, Luís and Branco, Paula and Moniz, Nuno},
  volume = {74},
  series = {Proceedings of Machine Learning Research},
  month = {22 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v74/moniz17a/moniz17a.pdf},
  url = {https://proceedings.mlr.press/v74/moniz17a.html},
  abstract = {Ensemble methods are well known for providing an advantage over single models in a large range of data mining and machine learning tasks. Their benefits are commonly associated with the ability to reduce the bias and/or variance in learning tasks. Ensembles have been studied for both classification and regression tasks with uniform domain preferences. However, for imbalanced domains, these methods have been thoroughly studied only in classification. In this paper we present an empirical study of the predictive ability of the ensemble methods bagging and boosting in regression tasks, using 20 data sets with imbalanced distributions and assuming non-uniform domain preferences. Results show that ensemble methods are capable of providing improvements in predictive ability towards under-represented values, and that this improvement influences the predictive ability of models concerning the average behaviour of the data. Results also show that the smaller data sets are prone to larger improvements in predictive accuracy and that no conclusion could be drawn when considering the percentage of rare cases alone.}
}
Endnote
%0 Conference Paper
%T Evaluation of Ensemble Methods in Imbalanced Regression Tasks
%A Nuno Moniz
%A Paula Branco
%A Luís Torgo
%B Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications
%C Proceedings of Machine Learning Research
%D 2017
%E Luís Torgo
%E Paula Branco
%E Nuno Moniz
%F pmlr-v74-moniz17a
%I PMLR
%P 129--140
%U https://proceedings.mlr.press/v74/moniz17a.html
%V 74
%X Ensemble methods are well known for providing an advantage over single models in a large range of data mining and machine learning tasks. Their benefits are commonly associated with the ability to reduce the bias and/or variance in learning tasks. Ensembles have been studied for both classification and regression tasks with uniform domain preferences. However, for imbalanced domains, these methods have been thoroughly studied only in classification. In this paper we present an empirical study of the predictive ability of the ensemble methods bagging and boosting in regression tasks, using 20 data sets with imbalanced distributions and assuming non-uniform domain preferences. Results show that ensemble methods are capable of providing improvements in predictive ability towards under-represented values, and that this improvement influences the predictive ability of models concerning the average behaviour of the data. Results also show that the smaller data sets are prone to larger improvements in predictive accuracy and that no conclusion could be drawn when considering the percentage of rare cases alone.
APA
Moniz, N., Branco, P. & Torgo, L. (2017). Evaluation of Ensemble Methods in Imbalanced Regression Tasks. Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, in Proceedings of Machine Learning Research 74:129-140. Available from https://proceedings.mlr.press/v74/moniz17a.html.
