Active Domain Randomization

Bhairav Mehta, Manfred Diaz, Florian Golemo, Christopher J. Pal, Liam Paull
Proceedings of the Conference on Robot Learning, PMLR 100:1162-1176, 2020.

Abstract

Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. Our experiments across various physics-based simulated and real-robot tasks show that this enhancement leads to more robust, consistent policies.
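To make the idea concrete, below is a minimal, self-contained sketch (not the authors' implementation) contrasting uniform domain randomization with an "active" sampler that concentrates on informative environment parameters. All names here (rollout_return, ActiveSampler, REFERENCE_PARAM) and the simple score-based update are illustrative assumptions; the paper instead learns the sampling strategy itself, rewarding environment instances whose rollouts differ most from rollouts in a reference instance.

# Illustrative sketch only -- not the authors' implementation.
# A toy "active" sampler prefers environment parameters whose rollouts
# differ most from a reference environment. All identifiers are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def rollout_return(param: float) -> float:
    """Stand-in for running the current policy in an environment instance
    parameterized by `param` (e.g. a friction coefficient) and returning
    the episode return. Here it is a synthetic function plus noise."""
    return np.exp(-((param - 0.8) ** 2) / 0.02) + 0.05 * rng.standard_normal()

REFERENCE_PARAM = 0.5                     # the non-randomized "reference" instance
PARAM_GRID = np.linspace(0.0, 1.0, 21)    # discretized randomization range

class ActiveSampler:
    """Toy sampler: keeps a score per candidate parameter and samples from a
    softmax over scores, so parameters producing large rollout discrepancies
    are proposed more often than under uniform sampling."""
    def __init__(self, grid, temperature=0.2):
        self.grid = grid
        self.scores = np.zeros_like(grid)
        self.temperature = temperature

    def sample(self):
        probs = np.exp(self.scores / self.temperature)
        probs /= probs.sum()
        idx = rng.choice(len(self.grid), p=probs)
        return idx, self.grid[idx]

    def update(self, idx, discrepancy, lr=0.5):
        # Running average of the observed discrepancy for this parameter.
        self.scores[idx] += lr * (discrepancy - self.scores[idx])

sampler = ActiveSampler(PARAM_GRID)
for step in range(500):
    idx, param = sampler.sample()
    # Discrepancy between a randomized rollout and a reference rollout;
    # this kind of signal identifies the informative environment instances.
    discrepancy = abs(rollout_return(param) - rollout_return(REFERENCE_PARAM))
    sampler.update(idx, discrepancy)

# The sampler ends up concentrating on the most informative parameter region.
top = PARAM_GRID[np.argsort(sampler.scores)[-3:]]
print("most-proposed parameters:", np.round(top, 2))

In the paper, the discrepancy signal comes from comparing full policy rollouts in randomized and reference environment instances; in this sketch a scalar return difference merely stands in for that comparison.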

Cite this Paper


BibTeX
@InProceedings{pmlr-v100-mehta20a,
  title     = {Active Domain Randomization},
  author    = {Mehta, Bhairav and Diaz, Manfred and Golemo, Florian and Pal, Christopher J. and Paull, Liam},
  pages     = {1162--1176},
  year      = {2020},
  editor    = {Leslie Pack Kaelbling and Danica Kragic and Komei Sugiura},
  volume    = {100},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Oct--01 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v100/mehta20a/mehta20a.pdf},
  url       = {http://proceedings.mlr.press/v100/mehta20a.html},
  abstract  = {Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. Our experiments across various physics-based simulated and real-robot tasks show that this enhancement leads to more robust, consistent policies.}
}
Endnote
%0 Conference Paper
%T Active Domain Randomization
%A Bhairav Mehta
%A Manfred Diaz
%A Florian Golemo
%A Christopher J. Pal
%A Liam Paull
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-mehta20a
%I PMLR
%J Proceedings of Machine Learning Research
%P 1162--1176
%U http://proceedings.mlr.press
%V 100
%W PMLR
%X Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. Our experiments across various physics-based simulated and real-robot tasks show that this enhancement leads to more robust, consistent policies.
APA
Mehta, B., Diaz, M., Golemo, F., Pal, C.J. & Paull, L. (2020). Active Domain Randomization. Proceedings of the Conference on Robot Learning, in PMLR 100:1162-1176.
