Cross-Domain Policy Transfer by Representation Alignment via Multi-Domain Behavioral Cloning

Hayato Watahiki, Ryo Iwase, Ryosuke Unno, Yoshimasa Tsuruoka
Proceedings of The 3rd Conference on Lifelong Learning Agents, PMLR 274:301-323, 2025.

Abstract

Transferring learned skills across diverse situations remains a fundamental challenge for autonomous agents, particularly when agents are not allowed to interact with the exact target setup. While prior approaches have predominantly focused on learning domain translation, they often struggle to handle significant domain gaps or out-of-distribution tasks. In this paper, we present a simple approach for cross-domain policy transfer that learns a shared latent representation across domains and a common abstract policy on top of it. Our approach leverages multi-domain behavioral cloning on unaligned trajectories of proxy tasks and employs maximum mean discrepancy (MMD) as a regularization term to encourage cross-domain alignment. The MMD regularization better preserves the structure of latent state distributions than the commonly used domain-discriminative distribution matching, leading to higher transfer performance. Moreover, our approach trains only one multi-domain policy, which makes it easier to extend than existing methods. Empirical evaluations demonstrate the efficacy of our method across various domain shifts, especially in scenarios where exact domain translation is challenging, such as cross-morphology or cross-viewpoint settings. Our ablation studies further reveal that multi-domain behavioral cloning implicitly contributes to representation alignment alongside domain-adversarial regularization.
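
As a rough illustration of the approach described in the abstract, the sketch below combines per-domain behavioral-cloning losses with a Gaussian-kernel MMD penalty between latent state encodings from two domains. This is a minimal sketch only: it assumes domain-specific encoders and action decoders around a shared abstract policy, and all names (gaussian_mmd, transfer_loss, enc_x, dec_x, etc.) are illustrative, not the authors' implementation.

import torch

def gaussian_mmd(x, y, sigma=1.0):
    """Biased MMD^2 estimate between batches x (n, d) and y (m, d) with a Gaussian kernel."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances -> Gaussian kernel matrix.
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2.0 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def transfer_loss(enc_x, enc_y, dec_x, dec_y, policy, batch_x, batch_y, lam=1.0):
    """Multi-domain behavioral cloning with an MMD alignment penalty (illustrative).

    batch_x / batch_y: (states, expert_actions) drawn from unaligned proxy-task
    trajectories in each domain.
    """
    sx, ax = batch_x
    sy, ay = batch_y
    zx, zy = enc_x(sx), enc_y(sy)  # map both domains into the shared latent space
    bc = torch.nn.functional.mse_loss(dec_x(policy(zx)), ax) \
       + torch.nn.functional.mse_loss(dec_y(policy(zy)), ay)  # per-domain cloning losses
    return bc + lam * gaussian_mmd(zx, zy)  # encourage aligned latent state distributions

In practice, the kernel bandwidth sigma and the weight lam on the MMD term would need to be tuned for each domain pair; the point of the sketch is only how the alignment penalty sits alongside the shared-policy cloning objective.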

Cite this Paper


BibTeX
@InProceedings{pmlr-v274-watahiki25a,
  title     = {Cross-Domain Policy Transfer by Representation Alignment via Multi-Domain Behavioral Cloning},
  author    = {Watahiki, Hayato and Iwase, Ryo and Unno, Ryosuke and Tsuruoka, Yoshimasa},
  booktitle = {Proceedings of The 3rd Conference on Lifelong Learning Agents},
  pages     = {301--323},
  year      = {2025},
  editor    = {Lomonaco, Vincenzo and Melacci, Stefano and Tuytelaars, Tinne and Chandar, Sarath and Pascanu, Razvan},
  volume    = {274},
  series    = {Proceedings of Machine Learning Research},
  month     = {29 Jul--01 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v274/main/assets/watahiki25a/watahiki25a.pdf},
  url       = {https://proceedings.mlr.press/v274/watahiki25a.html}
}
Endnote
%0 Conference Paper
%T Cross-Domain Policy Transfer by Representation Alignment via Multi-Domain Behavioral Cloning
%A Hayato Watahiki
%A Ryo Iwase
%A Ryosuke Unno
%A Yoshimasa Tsuruoka
%B Proceedings of The 3rd Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2025
%E Vincenzo Lomonaco
%E Stefano Melacci
%E Tinne Tuytelaars
%E Sarath Chandar
%E Razvan Pascanu
%F pmlr-v274-watahiki25a
%I PMLR
%P 301--323
%U https://proceedings.mlr.press/v274/watahiki25a.html
%V 274
APA
Watahiki, H., Iwase, R., Unno, R. & Tsuruoka, Y. (2025). Cross-Domain Policy Transfer by Representation Alignment via Multi-Domain Behavioral Cloning. Proceedings of The 3rd Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 274:301-323. Available from https://proceedings.mlr.press/v274/watahiki25a.html.

Related Material