ReMix: Optimizing Data Mixtures for Large Scale Imitation Learning

Joey Hejna, Chethan Anand Bhateja, Yichen Jiang, Karl Pertsch, Dorsa Sadigh
Proceedings of The 8th Conference on Robot Learning, PMLR 270:145-164, 2025.

Abstract

Increasingly large robotics datasets are being collected to train larger foundation models in robotics. However, despite the fact that data selection has been of utmost importance to scaling in vision and natural language processing (NLP), little work in robotics has questioned what data such models should actually be trained on. In this work we investigate how to weigh different subsets or “domains” of robotics datasets during pre-training to maximize worst-case performance across all possible downstream domains using distributionally robust optimization (DRO). Unlike in NLP, we find that these methods are hard to apply out of the box due to varying action spaces and dynamics across robots. Our method, ReMix, employs early stopping and action normalization and discretization to counteract these issues. Through extensive experimentation on both the Bridge and OpenX datasets, we demonstrate that data curation can have an outsized impact on downstream performance. Specifically, domain weights learned by ReMix outperform uniform weights by over 40% on average and human-selected weights by over 20% on datasets used to train the RT-X models.
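
The core mechanic behind DRO-based mixture optimization of this kind is a reweighting loop: train a small proxy model, measure its per-domain excess loss against a reference model, and shift the mixture toward domains where the proxy lags. The sketch below, in Python with NumPy, illustrates a DoReMi-style exponentiated-gradient update on the probability simplex plus a simple action normalization-and-discretization step; the function names, the step size, and the exact binning scheme are illustrative assumptions, not the paper's algorithm.

    import numpy as np

    def update_domain_weights(weights, proxy_losses, reference_losses, step_size=1.0):
        # Excess loss per domain: how much worse the proxy model does than the
        # reference. High-excess domains are "still learnable" and get upweighted.
        excess = np.maximum(proxy_losses - reference_losses, 0.0)
        # Exponentiated-gradient (mirror ascent) step on the probability simplex:
        # scale each weight by exp(step_size * excess), then renormalize.
        logits = np.log(weights) + step_size * excess
        new_weights = np.exp(logits - logits.max())  # subtract max for stability
        return new_weights / new_weights.sum()

    def normalize_and_discretize(actions, low, high, num_bins=256):
        # Map continuous actions into [-1, 1] per dimension, then bin them, so a
        # single discrete cross-entropy loss is comparable across robots with
        # different action scales and dynamics.
        scaled = 2.0 * (actions - low) / (high - low) - 1.0
        scaled = np.clip(scaled, -1.0, 1.0)
        return np.floor((scaled + 1.0) / 2.0 * (num_bins - 1)).astype(int)

Starting from uniform weights, repeating the update across proxy-training rounds concentrates mass on hard-but-learnable domains; the resulting weights are then used to sample pre-training data for the full-scale model.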

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-hejna25a,
  title     = {ReMix: Optimizing Data Mixtures for Large Scale Imitation Learning},
  author    = {Hejna, Joey and Bhateja, Chethan Anand and Jiang, Yichen and Pertsch, Karl and Sadigh, Dorsa},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {145--164},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/hejna25a/hejna25a.pdf},
  url       = {https://proceedings.mlr.press/v270/hejna25a.html}
}
Endnote
%0 Conference Paper
%T ReMix: Optimizing Data Mixtures for Large Scale Imitation Learning
%A Joey Hejna
%A Chethan Anand Bhateja
%A Yichen Jiang
%A Karl Pertsch
%A Dorsa Sadigh
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-hejna25a
%I PMLR
%P 145--164
%U https://proceedings.mlr.press/v270/hejna25a.html
%V 270
APA
Hejna, J., Bhateja, C. A., Jiang, Y., Pertsch, K., & Sadigh, D. (2025). ReMix: Optimizing Data Mixtures for Large Scale Imitation Learning. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:145-164. Available from https://proceedings.mlr.press/v270/hejna25a.html.