Accelerating Gradient Boosting Machines

Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:516-526, 2020.

Abstract

Gradient Boosting Machine (GBM), introduced by Friedman (2001), is a widely popular ensembling technique and is routinely used in competitions such as those on Kaggle and the KDDCup (Chen and Guestrin, 2016). In this work, we propose an Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov’s acceleration techniques into the design of GBM. The difficulty in accelerating GBM lies in the fact that weak (inexact) learners are commonly used, so with a naive application the errors can accumulate in the momentum term. To overcome this, we design a “corrected pseudo residual” that serves as a new target for fitting a weak learner in order to perform the z-update. We are thus able to derive novel computational guarantees for AGBM. This is the first GBM-type algorithm with a theoretically justified accelerated convergence rate.
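The abstract compresses the full algorithm, but the moving parts it names (a momentum sequence, a z-update, and a corrected pseudo residual) are concrete enough to sketch. Below is a minimal Python sketch of that recursion for squared-error loss, using shallow scikit-learn regression trees as the weak learners. The theta_m = 2/(m+2) schedule, the two weak learners per iteration, and the corrected-residual recursion follow our reading of the paper's Algorithm 2; the names (agbm_fit, eta, gamma) are illustrative, not the authors' code.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def agbm_fit(X, y, n_rounds=100, eta=0.1, gamma=0.1, max_depth=3):
    """Sketch of AGBM for squared loss; returns per-round learners and train predictions."""
    n = len(y)
    f_pred = np.zeros(n)   # main ensemble predictions on the training set
    h_pred = np.zeros(n)   # momentum-sequence ("z") predictions
    c = None               # corrected pseudo residual from the previous round
    tau2_pred = None       # previous round's second weak learner, evaluated on the train set
    learners = []
    for m in range(n_rounds):
        theta = 2.0 / (m + 2)
        g_pred = (1 - theta) * f_pred + theta * h_pred   # momentum combination
        r = y - g_pred   # pseudo residual = negative gradient of squared-error loss
        tau1 = DecisionTreeRegressor(max_depth=max_depth).fit(X, r)
        f_pred = g_pred + eta * tau1.predict(X)
        # Corrected pseudo residual: the plain residual plus a decaying carry-over
        # of whatever the previous second learner failed to fit, so weak-learner
        # errors do not pile up in the momentum sequence.
        c = r if m == 0 else r + ((m + 1) / (m + 2)) * (c - tau2_pred)
        tau2 = DecisionTreeRegressor(max_depth=max_depth).fit(X, c)
        tau2_pred = tau2.predict(X)
        h_pred = h_pred + (gamma * eta / theta) * tau2_pred   # z-update
        learners.append((theta, tau1, tau2))
    return learners, f_pred

# Toy usage on synthetic regression data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=500)
learners, train_pred = agbm_fit(X, y, n_rounds=50)
print(np.mean((y - train_pred) ** 2))

Predicting on new points means replaying both learner sequences with the same theta_m schedule. The corrected residual in the middle is what keeps each weak learner's fitting error from compounding inside the momentum term, which is exactly the obstacle the abstract describes.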

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-lu20a,
  title     = {Accelerating Gradient Boosting Machines},
  author    = {Lu, Haihao and Karimireddy, Sai Praneeth and Ponomareva, Natalia and Mirrokni, Vahab},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {516--526},
  year      = {2020},
  editor    = {Chiappa, Silvia and Calandra, Roberto},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/lu20a/lu20a.pdf},
  url       = {https://proceedings.mlr.press/v108/lu20a.html},
  abstract  = {Gradient Boosting Machine (GBM) introduced by \cite{friedman2001greedy} is a widely popular ensembling technique and is routinely used in competitions such as Kaggle and the KDDCup \citep{chen2016xgboost}. In this work, we propose an Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov’s acceleration techniques into the design of GBM. The difficulty in accelerating GBM lies in the fact that weak (inexact) learners are commonly used, and therefore, with naive application, the errors can accumulate in the momentum term. To overcome it, we design a “corrected pseudo residual” that serves as a new target for fitting a weak learner, in order to perform the z-update. Thus, we are able to derive novel computational guarantees for AGBM. This is the first GBM type of algorithm with a theoretically-justified accelerated convergence rate.}
}
Endnote
%0 Conference Paper
%T Accelerating Gradient Boosting Machines
%A Haihao Lu
%A Sai Praneeth Karimireddy
%A Natalia Ponomareva
%A Vahab Mirrokni
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-lu20a
%I PMLR
%P 516--526
%U https://proceedings.mlr.press/v108/lu20a.html
%V 108
%X Gradient Boosting Machine (GBM) introduced by \cite{friedman2001greedy} is a widely popular ensembling technique and is routinely used in competitions such as Kaggle and the KDDCup \citep{chen2016xgboost}. In this work, we propose an Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov’s acceleration techniques into the design of GBM. The difficulty in accelerating GBM lies in the fact that weak (inexact) learners are commonly used, and therefore, with naive application, the errors can accumulate in the momentum term. To overcome it, we design a “corrected pseudo residual” that serves as a new target for fitting a weak learner, in order to perform the z-update. Thus, we are able to derive novel computational guarantees for AGBM. This is the first GBM type of algorithm with a theoretically-justified accelerated convergence rate.
APA
Lu, H., Karimireddy, S.P., Ponomareva, N. & Mirrokni, V. (2020). Accelerating Gradient Boosting Machines. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:516-526. Available from https://proceedings.mlr.press/v108/lu20a.html.