Gradient Boosting on Stochastic Data Streams

Hanzhang Hu, Wen Sun, Arun Venkatraman, Martial Hebert, Andrew Bagnell
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:595-603, 2017.

Abstract

Boosting is a popular ensemble algorithm that generates more powerful learners by linearly combining base models from a simpler hypothesis class. In this work, we investigate the problem of adapting batch gradient boosting for minimizing convex loss functions to the online setting, where the loss at each iteration is sampled i.i.d. from an unknown distribution. To generalize from batch to online, we first introduce the notion of an online weak learning edge, with which, for strongly convex and smooth loss functions, we present an algorithm, Streaming Gradient Boosting (SGB), with exponential shrinkage guarantees in the number of weak learners. We further present an adaptation of SGB that optimizes non-smooth loss functions, for which we derive an $O(\ln(N)/N)$ convergence rate. We also show that our analysis extends to the adversarial online learning setting under the stronger assumption that the online weak learning edge holds in the adversarial setting. We finally present experimental results showing that, in practice, our algorithms achieve results competitive with classic gradient boosting while using less computation.
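
To illustrate the general idea behind boosting on a data stream (an ensemble of online weak learners, each updated toward the negative functional gradient of the loss at the current partial prediction), here is a minimal Python sketch for squared loss. This is not the paper's exact SGB algorithm or analysis; all names (OnlineLinearLearner, StreamingBooster), the choice of linear weak learners, and the step sizes are hypothetical illustrations.

"""
Minimal sketch of streaming gradient boosting for squared loss.
Illustrative only: each weak learner is an online linear model fit to the
negative functional gradient of the loss at the current partial prediction.
This is NOT the paper's exact SGB algorithm; names and constants are made up.
"""
import numpy as np


class OnlineLinearLearner:
    """Weak learner: a linear model updated by one SGD step per example."""

    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        return float(self.w @ x)

    def update(self, x, target):
        # One SGD step on squared error toward the target supplied by the
        # booster (the negative functional gradient at this stage).
        residual = self.predict(x) - target
        self.w -= self.lr * residual * x


class StreamingBooster:
    """Boosted ensemble trained one example at a time (squared loss)."""

    def __init__(self, dim, n_learners=20, shrinkage=0.2):
        self.learners = [OnlineLinearLearner(dim) for _ in range(n_learners)]
        self.shrinkage = shrinkage

    def predict(self, x):
        return self.shrinkage * sum(h.predict(x) for h in self.learners)

    def update(self, x, y):
        # Each weak learner fits the negative gradient of the squared loss
        # evaluated at the partial ensemble built from the learners before it.
        partial = 0.0
        for h in self.learners:
            neg_grad = y - partial          # -dL/df for L = 0.5 * (f - y)^2
            h.update(x, neg_grad)
            partial += self.shrinkage * h.predict(x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = rng.normal(size=5)
    booster = StreamingBooster(dim=5)
    for t in range(5000):
        x = rng.normal(size=5)
        y = true_w @ x + 0.1 * rng.normal()   # i.i.d. data stream
        booster.update(x, y)
    print("example prediction:", booster.predict(np.ones(5)))

In this sketch each incoming example is used once to update every weak learner in sequence, so the per-example cost is linear in the number of weak learners and no batch of data is ever stored.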

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-hu17a,
  title     = {{Gradient Boosting on Stochastic Data Streams}},
  author    = {Hu, Hanzhang and Sun, Wen and Venkatraman, Arun and Hebert, Martial and Bagnell, Andrew},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {595--603},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/hu17a/hu17a.pdf},
  url       = {https://proceedings.mlr.press/v54/hu17a.html},
  abstract  = {Boosting is a popular ensemble algorithm that generates more powerful learners by linearly combining base models from a simpler hypothesis class. In this work, we investigate the problem of adapting batch gradient boosting for minimizing convex loss functions to the online setting, where the loss at each iteration is sampled i.i.d. from an unknown distribution. To generalize from batch to online, we first introduce the notion of an online weak learning edge, with which, for strongly convex and smooth loss functions, we present an algorithm, Streaming Gradient Boosting (SGB), with exponential shrinkage guarantees in the number of weak learners. We further present an adaptation of SGB that optimizes non-smooth loss functions, for which we derive an $O(\ln(N)/N)$ convergence rate. We also show that our analysis extends to the adversarial online learning setting under the stronger assumption that the online weak learning edge holds in the adversarial setting. We finally present experimental results showing that, in practice, our algorithms achieve results competitive with classic gradient boosting while using less computation.}
}
Endnote
%0 Conference Paper
%T Gradient Boosting on Stochastic Data Streams
%A Hanzhang Hu
%A Wen Sun
%A Arun Venkatraman
%A Martial Hebert
%A Andrew Bagnell
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-hu17a
%I PMLR
%P 595--603
%U https://proceedings.mlr.press/v54/hu17a.html
%V 54
%X Boosting is a popular ensemble algorithm that generates more powerful learners by linearly combining base models from a simpler hypothesis class. In this work, we investigate the problem of adapting batch gradient boosting for minimizing convex loss functions to the online setting, where the loss at each iteration is sampled i.i.d. from an unknown distribution. To generalize from batch to online, we first introduce the notion of an online weak learning edge, with which, for strongly convex and smooth loss functions, we present an algorithm, Streaming Gradient Boosting (SGB), with exponential shrinkage guarantees in the number of weak learners. We further present an adaptation of SGB that optimizes non-smooth loss functions, for which we derive an $O(\ln(N)/N)$ convergence rate. We also show that our analysis extends to the adversarial online learning setting under the stronger assumption that the online weak learning edge holds in the adversarial setting. We finally present experimental results showing that, in practice, our algorithms achieve results competitive with classic gradient boosting while using less computation.
APA
Hu, H., Sun, W., Venkatraman, A., Hebert, M. & Bagnell, A. (2017). Gradient Boosting on Stochastic Data Streams. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:595-603. Available from https://proceedings.mlr.press/v54/hu17a.html.
