Decoupling Gradient-Like Learning Rules from Representations

Philip Thomas, Christoph Dann, Emma Brunskill
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4917-4925, 2018.

Abstract

In machine learning, learning often corresponds to changing the parameters of a parameterized function. A learning rule is an algorithm or mathematical expression that specifies precisely how the parameters should be changed. When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions. In this paper we focus on gradient-like learning rules, wherein these two decisions are coupled in a subtle (and often unintentional) way. That is, using the same learning rule with two different representations that can represent the same sets of functions can result in two different outcomes. After arguing that this coupling is undesirable, particularly when using neural networks, we present a method for partially decoupling these two decisions for a broad class of gradient-like learning rules that span unsupervised learning, reinforcement learning, and supervised learning.
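
The coupling described in the abstract can be illustrated with a toy case. The sketch below is not taken from the paper; it assumes a one-parameter linear model with squared loss, and the names `sgd_step_direct`, `sgd_step_scaled`, and the scale factor `c` are hypothetical. Both parameterizations, f(x) = w * x and f(x) = (c * v) * x, can represent the same set of functions, yet one plain gradient step with the same learning rate moves them to different functions.

```python
def sgd_step_direct(w, x, y, lr):
    """One gradient step for f(x) = w * x with loss 0.5 * (f(x) - y)**2."""
    grad_w = (w * x - y) * x
    return w - lr * grad_w


def sgd_step_scaled(v, x, y, lr, c=2.0):
    """One gradient step for f(x) = (c * v) * x with the same loss.

    The represented slope is c * v, so this model covers the same
    functions as the direct one; only the parameterization differs.
    """
    grad_v = (c * v * x - y) * c * x  # chain rule adds a factor of c
    return v - lr * grad_v


# Illustrative values (assumptions, not from the paper).
x, y, lr, c = 1.0, 1.0, 0.1, 2.0

# Both models start at the same function: slope 0.
w = sgd_step_direct(0.0, x, y, lr)
v = sgd_step_scaled(0.0, x, y, lr, c)

slope_direct = w      # slope represented by the direct model
slope_scaled = c * v  # slope represented by the rescaled model
print(slope_direct, slope_scaled)
```

Starting from the same function, the direct parameterization reaches a slope of about 0.1 while the rescaled one reaches about 0.4: the chain rule multiplies the effective step on the represented slope by c squared, so the same learning rule yields different outcomes.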

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-thomas18a,
  title = {Decoupling Gradient-Like Learning Rules from Representations},
  author = {Thomas, Philip and Dann, Christoph and Brunskill, Emma},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {4917--4925},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/thomas18a/thomas18a.pdf},
  url = {http://proceedings.mlr.press/v80/thomas18a.html},
  abstract = {In machine learning, learning often corresponds to changing the parameters of a parameterized function. A learning rule is an algorithm or mathematical expression that specifies precisely how the parameters should be changed. When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions. In this paper we focus on gradient-like learning rules, wherein these two decisions are coupled in a subtle (and often unintentional) way. That is, using the same learning rule with two different representations that can represent the same sets of functions can result in two different outcomes. After arguing that this coupling is undesirable, particularly when using neural networks, we present a method for partially decoupling these two decisions for a broad class of gradient-like learning rules that span unsupervised learning, reinforcement learning, and supervised learning.}
}
Endnote
%0 Conference Paper
%T Decoupling Gradient-Like Learning Rules from Representations
%A Philip Thomas
%A Christoph Dann
%A Emma Brunskill
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-thomas18a
%I PMLR
%P 4917--4925
%U http://proceedings.mlr.press/v80/thomas18a.html
%V 80
%X In machine learning, learning often corresponds to changing the parameters of a parameterized function. A learning rule is an algorithm or mathematical expression that specifies precisely how the parameters should be changed. When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions. In this paper we focus on gradient-like learning rules, wherein these two decisions are coupled in a subtle (and often unintentional) way. That is, using the same learning rule with two different representations that can represent the same sets of functions can result in two different outcomes. After arguing that this coupling is undesirable, particularly when using neural networks, we present a method for partially decoupling these two decisions for a broad class of gradient-like learning rules that span unsupervised learning, reinforcement learning, and supervised learning.
APA
Thomas, P., Dann, C. & Brunskill, E. (2018). Decoupling Gradient-Like Learning Rules from Representations. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4917-4925. Available from http://proceedings.mlr.press/v80/thomas18a.html.
