Selective Dyna-Style Planning Under Limited Model Capacity

Zaheer Abbas, Samuel Sokota, Erin Talvitie, Martha White
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1-10, 2020.

Abstract

In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.
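As a rough illustration of the idea in the abstract, heteroscedastic regression is commonly trained with a Gaussian negative log-likelihood in which the model predicts both a mean and an input-dependent variance. The sketch below is not the authors' implementation. The function names `gaussian_nll` and `should_plan` and the fixed variance `threshold` are assumptions made for illustration only.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of `target` under N(mean, exp(log_var)).

    Predicting a larger variance lowers the penalty on a large squared
    error, so a capacity-limited model can learn to report high variance
    on inputs it cannot fit well.
    """
    var = math.exp(log_var)
    return 0.5 * (log_var + (target - mean) ** 2 / var + math.log(2 * math.pi))


def should_plan(log_var, threshold):
    """Hypothetical gate (an assumption, not from the paper): use the
    model for planning only where its predicted variance is at most
    `threshold`."""
    return math.exp(log_var) <= threshold


# With a large prediction error, reporting a larger variance gives a lower loss.
loss_confident = gaussian_nll(target=3.0, mean=0.0, log_var=0.0)
loss_uncertain = gaussian_nll(target=3.0, mean=0.0, log_var=math.log(9.0))
```

Under this sketch, an agent would skip planning updates from model predictions for which `should_plan` returns `False`. How the threshold is chosen, and how this signal would be combined with a parameter-uncertainty estimate, is left open here.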

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-abbas20a,
  title = {Selective Dyna-Style Planning Under Limited Model Capacity},
  author = {Abbas, Zaheer and Sokota, Samuel and Talvitie, Erin and White, Martha},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {1--10},
  year = {2020},
  editor = {Hal Daumé III and Aarti Singh},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/abbas20a/abbas20a.pdf},
  url = {http://proceedings.mlr.press/v119/abbas20a.html},
  abstract = {In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.}
}
Endnote
%0 Conference Paper
%T Selective Dyna-Style Planning Under Limited Model Capacity
%A Zaheer Abbas
%A Samuel Sokota
%A Erin Talvitie
%A Martha White
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-abbas20a
%I PMLR
%P 1--10
%U http://proceedings.mlr.press/v119/abbas20a.html
%V 119
%X In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.
APA
Abbas, Z., Sokota, S., Talvitie, E. & White, M. (2020). Selective Dyna-Style Planning Under Limited Model Capacity. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1-10. Available from http://proceedings.mlr.press/v119/abbas20a.html.

Related Material