How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation

Augustine Mavor-Parker, Kimberly Young, Caswell Barry, Lewis Griffin
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15220-15240, 2022.

Abstract

When extrinsic rewards are sparse, artificial agents struggle to explore an environment. Curiosity, implemented as an intrinsic reward for prediction errors, can improve exploration, but it is known to fail when faced with action-dependent noise sources (‘noisy TVs’). In an attempt to make exploring agents robust to noisy TVs, we present a simple solution: aleatoric mapping agents (AMAs). AMAs are a novel form of curiosity that explicitly ascertains which state transitions of the environment are unpredictable, even if those dynamics are induced by the actions of the agent. This is achieved by generating separate forward predictions for the mean and aleatoric uncertainty of future states, with the aim of reducing intrinsic rewards for those transitions that are unpredictable. We demonstrate that in a range of environments AMAs are able to circumvent action-dependent stochastic traps that immobilise conventional curiosity-driven agents. Furthermore, we demonstrate empirically that other common exploration approaches, previously thought to be immune to agent-induced randomness, can be trapped by stochastic dynamics.
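
The mechanism the abstract describes, separate predictions for the mean and the aleatoric uncertainty of the next state, with the uncertainty used to discount the curiosity reward, can be illustrated with a small sketch. This is not the paper's implementation. It assumes a diagonal Gaussian over next states, a hypothetical weighting parameter `eta`, and a floor at zero on the reward; the function names `gaussian_nll` and `intrinsic_reward` are invented for illustration.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of `target` under a diagonal Gaussian.

    The additive constant is dropped. A forward model trained on this
    loss learns both a mean and a per-dimension (aleatoric) variance.
    """
    return 0.5 * sum(
        (t - m) ** 2 / math.exp(lv) + lv
        for t, m, lv in zip(target, mean, log_var)
    )


def intrinsic_reward(target, mean, log_var, eta=1.0):
    """Squared prediction error minus `eta` times the predicted variance.

    Assumption: the reward is floored at zero so that transitions the
    model already expects to be noisy earn no curiosity bonus.
    """
    sq_err = sum((t - m) ** 2 for t, m in zip(target, mean))
    predicted_var = sum(math.exp(lv) for lv in log_var)
    return max(0.0, sq_err - eta * predicted_var)


# Same prediction error, two different uncertainty estimates.
next_state = [1.0, -1.0]
predicted_mean = [0.0, 0.0]
confident = intrinsic_reward(next_state, predicted_mean, [math.log(0.01)] * 2)
uncertain = intrinsic_reward(next_state, predicted_mean, [0.0, 0.0])
```

With identical prediction errors, the transition whose predicted variance is small keeps most of its reward, while the transition the model has flagged as inherently noisy is discounted to nothing. That is the behaviour that would let an agent walk past a noisy TV.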

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-mavor-parker22a,
  title = {How to Stay Curious while avoiding Noisy {TV}s using Aleatoric Uncertainty Estimation},
  author = {Mavor-Parker, Augustine and Young, Kimberly and Barry, Caswell and Griffin, Lewis},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {15220--15240},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/mavor-parker22a/mavor-parker22a.pdf},
  url = {https://proceedings.mlr.press/v162/mavor-parker22a.html},
  abstract = {When extrinsic rewards are sparse, artificial agents struggle to explore an environment. Curiosity, implemented as an intrinsic reward for prediction errors, can improve exploration but it is known to fail when faced with action-dependent noise sources (‘noisy TVs’). In an attempt to make exploring agents robust to Noisy TVs, we present a simple solution: aleatoric mapping agents (AMAs). AMAs are a novel form of curiosity that explicitly ascertain which state transitions of the environment are unpredictable, even if those dynamics are induced by the actions of the agent. This is achieved by generating separate forward predictions for the mean and aleatoric uncertainty of future states, with the aim of reducing intrinsic rewards for those transitions that are unpredictable. We demonstrate that in a range of environments AMAs are able to circumvent action-dependent stochastic traps that immobilise conventional curiosity driven agents. Furthermore, we demonstrate empirically that other common exploration approaches—previously thought to be immune to agent-induced randomness—can be trapped by stochastic dynamics.}
}
Endnote
%0 Conference Paper
%T How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation
%A Augustine Mavor-Parker
%A Kimberly Young
%A Caswell Barry
%A Lewis Griffin
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-mavor-parker22a
%I PMLR
%P 15220--15240
%U https://proceedings.mlr.press/v162/mavor-parker22a.html
%V 162
%X When extrinsic rewards are sparse, artificial agents struggle to explore an environment. Curiosity, implemented as an intrinsic reward for prediction errors, can improve exploration but it is known to fail when faced with action-dependent noise sources (‘noisy TVs’). In an attempt to make exploring agents robust to Noisy TVs, we present a simple solution: aleatoric mapping agents (AMAs). AMAs are a novel form of curiosity that explicitly ascertain which state transitions of the environment are unpredictable, even if those dynamics are induced by the actions of the agent. This is achieved by generating separate forward predictions for the mean and aleatoric uncertainty of future states, with the aim of reducing intrinsic rewards for those transitions that are unpredictable. We demonstrate that in a range of environments AMAs are able to circumvent action-dependent stochastic traps that immobilise conventional curiosity driven agents. Furthermore, we demonstrate empirically that other common exploration approaches—previously thought to be immune to agent-induced randomness—can be trapped by stochastic dynamics.
APA
Mavor-Parker, A., Young, K., Barry, C. & Griffin, L. (2022). How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:15220-15240. Available from https://proceedings.mlr.press/v162/mavor-parker22a.html.

Related Material