Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective

Anirudh Vemula, Wen Sun, J. Bagnell
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2926-2935, 2019.

Abstract

Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that the complexity of exploration in parameter space depends on the dimensionality of the parameter space, while the complexity of exploration in action space depends on both the dimensionality of the action space and the horizon length. This is also demonstrated empirically by comparing simple exploration methods on several model problems, including Contextual Bandit, Linear Regression, and Reinforcement Learning in continuous control.
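As a rough illustration of the contrast the abstract describes, the sketch below compares a zeroth-order gradient estimator that perturbs the policy parameters once per episode against a likelihood-ratio estimator that injects noise into the actions at every timestep. This is not the authors' code or experimental setup; the toy linear-quadratic environment, the names `toy_rollout`, `parameter_space_gradient`, and `action_space_gradient`, and all constants are illustrative assumptions.

```python
import numpy as np

H = 20           # horizon length of the toy episodic task
STATE_DIM = 3    # toy state dimension
ACTION_DIM = 2   # toy action dimension
rng = np.random.default_rng(0)
A = 0.9 * np.eye(STATE_DIM)                                # stable linear dynamics
B = rng.normal(size=(STATE_DIM, ACTION_DIM)) / STATE_DIM   # control matrix

def toy_rollout(theta, action_noise=None):
    """Run one episode with the linear policy a_t = theta @ s_t (optionally with
    per-step action noise) and return the total reward (negative quadratic cost)."""
    s = np.ones(STATE_DIM)
    total = 0.0
    for t in range(H):
        a = theta @ s
        if action_noise is not None:
            a = a + action_noise[t]
        s = A @ s + B @ a
        total += -(s @ s + 0.1 * a @ a)
    return total

def parameter_space_gradient(theta, sigma=0.05, n=50):
    """Zeroth-order (random-search / ES-style) estimate: perturb the policy
    parameters once per episode, so the randomness lives in a space whose size
    is the parameter dimension, independent of the horizon."""
    d = theta.size
    g = np.zeros(d)
    for _ in range(n):
        u = rng.normal(size=d)
        plus = toy_rollout((theta.ravel() + sigma * u).reshape(theta.shape))
        minus = toy_rollout((theta.ravel() - sigma * u).reshape(theta.shape))
        g += (plus - minus) / (2.0 * sigma) * u
    return (g / n).reshape(theta.shape)

def action_space_gradient(theta, sigma=0.05, n=50):
    """Likelihood-ratio (REINFORCE-style) estimate: inject fresh Gaussian noise
    into the action at every timestep, so the randomness grows with both the
    action dimension and the horizon H."""
    g = np.zeros_like(theta)
    for _ in range(n):
        s = np.ones(STATE_DIM)
        ret = 0.0
        score = np.zeros_like(theta)   # score function of the Gaussian policy
        for t in range(H):
            noise = sigma * rng.normal(size=ACTION_DIM)
            a = theta @ s + noise
            score += np.outer(noise, s) / sigma**2
            s = A @ s + B @ a
            ret += -(s @ s + 0.1 * a @ a)
        g += ret * score
    return g / n

theta0 = np.zeros((ACTION_DIM, STATE_DIM))
print("parameter-space estimate:\n", parameter_space_gradient(theta0))
print("action-space estimate:\n", action_space_gradient(theta0))
```

Both estimators target the same policy-gradient direction; the qualitative point is only that the action-space estimator draws fresh noise at each of the H timesteps, which is the horizon dependence the abstract refers to, whereas the parameter-space estimator's noise dimension is fixed by the number of policy parameters.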

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-vemula19a,
  title     = {Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective},
  author    = {Vemula, Anirudh and Sun, Wen and Bagnell, J.},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {2926--2935},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/vemula19a/vemula19a.pdf},
  url       = {https://proceedings.mlr.press/v89/vemula19a.html},
  abstract  = {Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that complexity of exploration in parameter space depends on the dimensionality of parameter space, while complexity of exploration in action space depends on both the dimensionality of action space and horizon length. This is also demonstrated empirically by comparing simple exploration methods on several model problems, including Contextual Bandit, Linear Regression and Reinforcement Learning in continuous control.}
}
Endnote
%0 Conference Paper
%T Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective
%A Anirudh Vemula
%A Wen Sun
%A J. Bagnell
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-vemula19a
%I PMLR
%P 2926--2935
%U https://proceedings.mlr.press/v89/vemula19a.html
%V 89
%X Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem. We examine these black-box methods closely to identify situations in which they are worse than action space exploration methods and those in which they are superior. Through simple theoretical analyses, we prove that complexity of exploration in parameter space depends on the dimensionality of parameter space, while complexity of exploration in action space depends on both the dimensionality of action space and horizon length. This is also demonstrated empirically by comparing simple exploration methods on several model problems, including Contextual Bandit, Linear Regression and Reinforcement Learning in continuous control.
APA
Vemula, A., Sun, W. & Bagnell, J. (2019). Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:2926-2935. Available from https://proceedings.mlr.press/v89/vemula19a.html.
