Variants of RMSProp and Adagrad with Logarithmic Regret Bounds

Mahesh Chandra Mukkamala, Matthias Hein
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2545-2553, 2017.

Abstract

Adaptive gradient methods have recently become very popular, in particular because they have been shown to be useful in the training of deep neural networks. In this paper we analyze RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for which we show logarithmic regret bounds for strongly convex functions. Finally, we demonstrate experimentally that these new variants outperform other adaptive gradient techniques and stochastic gradient descent in the optimization of strongly convex functions as well as in the training of deep neural networks.
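Since the abstract only names the baseline method, the following minimal Python sketch shows the standard RMSProp update for reference. The hyperparameter values and the quadratic test function are illustrative assumptions, not the paper's experimental setup; the exact SC-Adagrad and SC-RMSProp update rules, which modify this scheme to obtain logarithmic regret for strongly convex losses, are given in the paper itself.

import numpy as np

def rmsprop_step(theta, grad, v, alpha=0.01, beta=0.9, eps=1e-8):
    """One step of the standard RMSProp update.

    theta: current parameter vector
    grad:  gradient at theta
    v:     running average of squared gradients
    """
    v = beta * v + (1.0 - beta) * grad ** 2
    theta = theta - alpha * grad / (np.sqrt(v) + eps)
    return theta, v

# Usage sketch on a simple strongly convex quadratic f(x) = 0.5 * ||x||^2,
# whose gradient at x is x itself (chosen here only for illustration).
theta = np.array([1.0, -2.0])
v = np.zeros_like(theta)
for t in range(1, 101):
    grad = theta
    theta, v = rmsprop_step(theta, grad, v)
print(theta)  # iterates move toward the minimizer at the origin

The SC variants proposed in the paper alter, among other things, the denominator and the averaging weights of this update; consult the paper for their precise form and the accompanying regret analysis.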

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-mukkamala17a,
  title     = {Variants of {RMSP}rop and {A}dagrad with Logarithmic Regret Bounds},
  author    = {Mahesh Chandra Mukkamala and Matthias Hein},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {2545--2553},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/mukkamala17a/mukkamala17a.pdf},
  url       = {https://proceedings.mlr.press/v70/mukkamala17a.html},
  abstract  = {Adaptive gradient methods have recently become very popular, in particular because they have been shown to be useful in the training of deep neural networks. In this paper we analyze RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for which we show logarithmic regret bounds for strongly convex functions. Finally, we demonstrate experimentally that these new variants outperform other adaptive gradient techniques and stochastic gradient descent in the optimization of strongly convex functions as well as in the training of deep neural networks.}
}
Endnote
%0 Conference Paper
%T Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
%A Mahesh Chandra Mukkamala
%A Matthias Hein
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-mukkamala17a
%I PMLR
%P 2545--2553
%U https://proceedings.mlr.press/v70/mukkamala17a.html
%V 70
%X Adaptive gradient methods have recently become very popular, in particular because they have been shown to be useful in the training of deep neural networks. In this paper we analyze RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for which we show logarithmic regret bounds for strongly convex functions. Finally, we demonstrate experimentally that these new variants outperform other adaptive gradient techniques and stochastic gradient descent in the optimization of strongly convex functions as well as in the training of deep neural networks.
APA
Mukkamala, M. C., & Hein, M. (2017). Variants of RMSProp and Adagrad with Logarithmic Regret Bounds. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2545-2553. Available from https://proceedings.mlr.press/v70/mukkamala17a.html.
