VIBR: Learning View-Invariant Value Functions for Robust Visual Control

Tom Dupuis, Jaonary Rabarisoa, Quoc-Cuong Pham, David Filliat
Proceedings of The 2nd Conference on Lifelong Learning Agents, PMLR 232:658-682, 2023.

Abstract

End-to-end reinforcement learning on images has shown significant progress in recent years. Data-based approaches leverage data augmentation and domain randomization, while representation learning methods use auxiliary losses to learn task-relevant features. Yet reinforcement learning still struggles in visually diverse environments full of distractions and spurious noise. In this work, we tackle the problem of robust visual control at its core and present VIBR (View-Invariant Bellman Residuals), a method that combines multi-view training and invariant prediction to reduce the out-of-distribution (OOD) generalization gap for RL-based visuomotor control. Our model-free approach improves on baseline performance without the need for additional representation learning objectives and with limited additional computational cost. We show that VIBR outperforms existing methods on complex visuomotor control environments with strong visual perturbations. Our approach achieves state-of-the-art results on the Distracting Control Suite, a challenging benchmark still not solved by current methods, on which we evaluate robustness to a range of visual perturbations as well as OOD generalization and extrapolation capabilities.
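To make the idea concrete, the following is a minimal sketch of one plausible reading of a view-invariant Bellman residual objective: compute the TD residual on several visually perturbed views of the same state and penalize disagreement between them. This is not the authors' released implementation; it assumes a discrete-action Q-network for brevity, and all names (q_net, target_q_net, make_view, n_views, lambda_inv) are illustrative assumptions.

import torch

def vibr_critic_loss(q_net, target_q_net, obs, action, reward, next_obs,
                     done, make_view, n_views=2, gamma=0.99, lambda_inv=1.0):
    """Critic loss whose Bellman residual is encouraged to be identical
    across several visually perturbed views of the same observation.
    Hypothetical sketch, not the paper's exact objective."""
    with torch.no_grad():
        # Bootstrap target from a single view of the next observation.
        next_q = target_q_net(next_obs).max(dim=-1, keepdim=True).values
        target = reward + gamma * (1.0 - done) * next_q

    residuals = []
    for _ in range(n_views):
        view = make_view(obs)                      # random augmentation / camera
        q = q_net(view).gather(-1, action)         # Q(view of s, a)
        residuals.append(q - target)               # per-view Bellman residual
    residuals = torch.stack(residuals, dim=0)      # (n_views, batch, 1)

    # Standard TD error, averaged over views ...
    td_loss = residuals.pow(2).mean()
    # ... plus an invariance term penalizing disagreement between views.
    inv_loss = residuals.var(dim=0, unbiased=False).mean()
    return td_loss + lambda_inv * inv_loss

In this sketch the invariance term needs no auxiliary representation head: it acts directly on the value estimates, which is consistent with the abstract's claim of avoiding additional representation learning objectives.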

Cite this Paper


BibTeX
@InProceedings{pmlr-v232-dupuis23a,
  title     = {VIBR: Learning View-Invariant Value Functions for Robust Visual Control},
  author    = {Dupuis, Tom and Rabarisoa, Jaonary and Pham, Quoc-Cuong and Filliat, David},
  booktitle = {Proceedings of The 2nd Conference on Lifelong Learning Agents},
  pages     = {658--682},
  year      = {2023},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Sedghi, Hanie and Precup, Doina},
  volume    = {232},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--25 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v232/dupuis23a/dupuis23a.pdf},
  url       = {https://proceedings.mlr.press/v232/dupuis23a.html},
  abstract  = {End-to-end reinforcement learning on images has shown significant progress in recent years. Data-based approaches leverage data augmentation and domain randomization, while representation learning methods use auxiliary losses to learn task-relevant features. Yet reinforcement learning still struggles in visually diverse environments full of distractions and spurious noise. In this work, we tackle the problem of robust visual control at its core and present VIBR (View-Invariant Bellman Residuals), a method that combines multi-view training and invariant prediction to reduce the out-of-distribution (OOD) generalization gap for RL-based visuomotor control. Our model-free approach improves on baseline performance without the need for additional representation learning objectives and with limited additional computational cost. We show that VIBR outperforms existing methods on complex visuomotor control environments with strong visual perturbations. Our approach achieves state-of-the-art results on the Distracting Control Suite, a challenging benchmark still not solved by current methods, on which we evaluate robustness to a range of visual perturbations as well as OOD generalization and extrapolation capabilities.}
}
Endnote
%0 Conference Paper
%T VIBR: Learning View-Invariant Value Functions for Robust Visual Control
%A Tom Dupuis
%A Jaonary Rabarisoa
%A Quoc-Cuong Pham
%A David Filliat
%B Proceedings of The 2nd Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2023
%E Sarath Chandar
%E Razvan Pascanu
%E Hanie Sedghi
%E Doina Precup
%F pmlr-v232-dupuis23a
%I PMLR
%P 658--682
%U https://proceedings.mlr.press/v232/dupuis23a.html
%V 232
%X End-to-end reinforcement learning on images has shown significant progress in recent years. Data-based approaches leverage data augmentation and domain randomization, while representation learning methods use auxiliary losses to learn task-relevant features. Yet reinforcement learning still struggles in visually diverse environments full of distractions and spurious noise. In this work, we tackle the problem of robust visual control at its core and present VIBR (View-Invariant Bellman Residuals), a method that combines multi-view training and invariant prediction to reduce the out-of-distribution (OOD) generalization gap for RL-based visuomotor control. Our model-free approach improves on baseline performance without the need for additional representation learning objectives and with limited additional computational cost. We show that VIBR outperforms existing methods on complex visuomotor control environments with strong visual perturbations. Our approach achieves state-of-the-art results on the Distracting Control Suite, a challenging benchmark still not solved by current methods, on which we evaluate robustness to a range of visual perturbations as well as OOD generalization and extrapolation capabilities.
APA
Dupuis, T., Rabarisoa, J., Pham, Q.-C., & Filliat, D. (2023). VIBR: Learning View-Invariant Value Functions for Robust Visual Control. Proceedings of The 2nd Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 232:658-682. Available from https://proceedings.mlr.press/v232/dupuis23a.html.
