Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective

Haoran He, Peilin Wu, Chenjia Bai, Hang Lai, Lingxiao Wang, Ling Pan, Xiaolin Hu, Weinan Zhang
Proceedings of The 8th Conference on Robot Learning, PMLR 270:473-495, 2025.

Abstract

Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most works in RL operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the available privileged knowledge, resulting in suboptimal performance. In this paper, we formulate the sim-to-real gap as an information bottleneck problem and therefore propose a novel privileged knowledge distillation method called the Historical Information Bottleneck (HIB). In particular, HIB learns a privileged knowledge representation from historical trajectories by capturing the underlying changeable dynamic information. Theoretical analysis shows that the learned privileged knowledge representation helps reduce the value discrepancy between the oracle and learned policies. Empirical experiments on both simulated and real-world tasks demonstrate that HIB yields improved generalizability compared to previous methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-he25a, title = {Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective}, author = {He, Haoran and Wu, Peilin and Bai, Chenjia and Lai, Hang and Wang, Lingxiao and Pan, Ling and Hu, Xiaolin and Zhang, Weinan}, booktitle = {Proceedings of The 8th Conference on Robot Learning}, pages = {473--495}, year = {2025}, editor = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram}, volume = {270}, series = {Proceedings of Machine Learning Research}, month = {06--09 Nov}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/he25a/he25a.pdf}, url = {https://proceedings.mlr.press/v270/he25a.html}, abstract = {Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most works in RL operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the available privileged knowledge, resulting in suboptimal performance. In this paper, we formulate the sim-to-real gap as an information bottleneck problem and therefore propose a novel privileged knowledge distillation method called the Historical Information Bottleneck (HIB). In particular, HIB learns a privileged knowledge representation from historical trajectories by capturing the underlying changeable dynamic information. Theoretical analysis shows that the learned privileged knowledge representation helps reduce the value discrepancy between the oracle and learned policies. Empirical experiments on both simulated and real-world tasks demonstrate that HIB yields improved generalizability compared to previous methods.} }
Endnote
%0 Conference Paper %T Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective %A Haoran He %A Peilin Wu %A Chenjia Bai %A Hang Lai %A Lingxiao Wang %A Ling Pan %A Xiaolin Hu %A Weinan Zhang %B Proceedings of The 8th Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2025 %E Pulkit Agrawal %E Oliver Kroemer %E Wolfram Burgard %F pmlr-v270-he25a %I PMLR %P 473--495 %U https://proceedings.mlr.press/v270/he25a.html %V 270 %X Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most works in RL operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the available privileged knowledge, resulting in suboptimal performance. In this paper, we formulate the sim-to-real gap as an information bottleneck problem and therefore propose a novel privileged knowledge distillation method called the Historical Information Bottleneck (HIB). In particular, HIB learns a privileged knowledge representation from historical trajectories by capturing the underlying changeable dynamic information. Theoretical analysis shows that the learned privileged knowledge representation helps reduce the value discrepancy between the oracle and learned policies. Empirical experiments on both simulated and real-world tasks demonstrate that HIB yields improved generalizability compared to previous methods.
APA
He, H., Wu, P., Bai, C., Lai, H., Wang, L., Pan, L., Hu, X. & Zhang, W.. (2025). Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:473-495 Available from https://proceedings.mlr.press/v270/he25a.html.

Related Material