Exponentiated Gradient Meets Gradient Descent

Udaya Ghai, Elad Hazan, Yoram Singer
Proceedings of the 31st International Conference on Algorithmic Learning Theory, PMLR 117:386-407, 2020.

Abstract

The (stochastic) gradient descent and the multiplicative update method are probably the most popular algorithms in machine learning. We introduce and study a new regularization which provides a unification of the additive and multiplicative updates. This regularization is derived from a hyperbolic analogue of the entropy function, which we call hypentropy. It is motivated by a natural extension of the multiplicative update to negative numbers. The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices. While the latter is only applicable to positive semi-definite matrices, the spectral hypentropy method can naturally be used with general rectangular matrices. We analyze the new family of updates by deriving tight regret bounds. We empirically study the applicability of the new update in settings such as multiclass learning, in which the parameters constitute a general rectangular matrix.
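For concreteness, below is a minimal sketch of a vector-valued hypentropy-style mirror-descent step. It assumes the regularizer takes the form φ_β(x) = Σ_i [x_i·arcsinh(x_i/β) − √(x_i² + β²)], which is one natural reading of the "hyperbolic analogue of the entropy" described in the abstract; the paper's exact definition and constants may differ. Under that assumption the mirror map is ∇φ_β(x) = arcsinh(x/β), giving the update x ← β·sinh(arcsinh(x/β) − η·g), which behaves like an additive (gradient) step for large β and like a signed multiplicative (EG±-style) step for small β.

```python
import numpy as np

def hypentropy(x, beta):
    """Assumed hyperbolic-entropy regularizer:
    sum_i x_i * arcsinh(x_i / beta) - sqrt(x_i**2 + beta**2)."""
    return np.sum(x * np.arcsinh(x / beta) - np.sqrt(x**2 + beta**2))

def hypentropy_md_step(x, grad, eta, beta):
    """One unconstrained mirror-descent step with the assumed hypentropy mirror map.

    grad(phi_beta)(x) = arcsinh(x / beta), whose inverse is beta * sinh(.),
    so the step interpolates between gradient descent (large beta) and a
    signed multiplicative update (small beta).
    """
    return beta * np.sinh(np.arcsinh(x / beta) - eta * grad)

# Toy usage: minimize f(x) = 0.5 * ||x - target||^2 (illustrative only).
target = np.array([1.0, -2.0, 0.5])
x = np.zeros(3)
for _ in range(200):
    grad = x - target
    x = hypentropy_md_step(x, grad, eta=0.1, beta=1.0)
print(x)  # approaches target
```

The spectral version discussed in the abstract would apply the analogous map to the singular values of a rectangular parameter matrix; the sketch above covers only the diagonal (vector) case.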

Cite this Paper


BibTeX
@InProceedings{pmlr-v117-ghai20a,
  title     = {Exponentiated Gradient Meets Gradient Descent},
  author    = {Ghai, Udaya and Hazan, Elad and Singer, Yoram},
  booktitle = {Proceedings of the 31st International Conference on Algorithmic Learning Theory},
  pages     = {386--407},
  year      = {2020},
  editor    = {Kontorovich, Aryeh and Neu, Gergely},
  volume    = {117},
  series    = {Proceedings of Machine Learning Research},
  month     = {08 Feb--11 Feb},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v117/ghai20a/ghai20a.pdf},
  url       = {https://proceedings.mlr.press/v117/ghai20a.html},
  abstract  = {The (stochastic) gradient descent and the multiplicative update method are probably the most popular algorithms in machine learning. We introduce and study a new regularization which provides a unification of the additive and multiplicative updates. This regularization is derived from an hyperbolic analogue of the entropy function, which we call hypentropy. It is motivated by a natural extension of the multiplicative update to negative numbers. The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices. While the latter is only applicable to positive semi-definite matrices, the spectral hypentropy method can naturally be used with general rectangular matrices. We analyze the new family of updates by deriving tight regret bounds. We study empirically the applicability of the new update for settings such as multiclass learning, in which the parameters constitute a general rectangular matrix.}
}
Endnote
%0 Conference Paper
%T Exponentiated Gradient Meets Gradient Descent
%A Udaya Ghai
%A Elad Hazan
%A Yoram Singer
%B Proceedings of the 31st International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Aryeh Kontorovich
%E Gergely Neu
%F pmlr-v117-ghai20a
%I PMLR
%P 386--407
%U https://proceedings.mlr.press/v117/ghai20a.html
%V 117
%X The (stochastic) gradient descent and the multiplicative update method are probably the most popular algorithms in machine learning. We introduce and study a new regularization which provides a unification of the additive and multiplicative updates. This regularization is derived from an hyperbolic analogue of the entropy function, which we call hypentropy. It is motivated by a natural extension of the multiplicative update to negative numbers. The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices. While the latter is only applicable to positive semi-definite matrices, the spectral hypentropy method can naturally be used with general rectangular matrices. We analyze the new family of updates by deriving tight regret bounds. We study empirically the applicability of the new update for settings such as multiclass learning, in which the parameters constitute a general rectangular matrix.
APA
Ghai, U., Hazan, E., & Singer, Y. (2020). Exponentiated Gradient Meets Gradient Descent. Proceedings of the 31st International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 117:386-407. Available from https://proceedings.mlr.press/v117/ghai20a.html.