Distributed Upload and Active Labeling for Resource-Constrained Fleet Learning

Oguzhan Akcin; Harsh Goel; Ruihan Zhao; Sandeep P. Chinchali

Proceedings of The 9th Conference on Robot Learning, PMLR 305:3463-3482, 2025.

Abstract

In multi-robot systems, fleets are often deployed to collect data that improves the performance of machine learning models for downstream perception and planning. However, real-world robotic deployments generate vast amounts of data across diverse conditions, while only a small portion can be transmitted or labeled due to limited bandwidth, constrained onboard storage, and high annotation costs. To address these challenges, we propose Distributed Upload and Active Labeling (DUAL), a decentralized, two-stage data collection framework for resource-constrained robotic fleets. In the first stage, each robot independently selects a subset of its local observations to upload under storage and communication constraints. In the second stage, the cloud selects a subset of uploaded data to label, subject to a global annotation budget. We evaluate DUAL on classification tasks spanning multiple sensing modalities, as well as on RoadNet—a real-world dataset we collected from vehicle-mounted cameras for time and weather classification. We further validate our approach in a physical experiment using a Franka Emika Panda robot arm, where it learns to move a red cube to a green bowl. Finally, we test DUAL on trajectory prediction using the nuScenes autonomous driving dataset to assess generalization to complex prediction tasks. Across all settings, DUAL consistently outperforms state-of-the-art baselines, achieving up to 31.1% gain in classification accuracy and a 13% improvement in real-world robotics task completion rates.

Cite this Paper

BibTeX

@InProceedings{pmlr-v305-akcin25a,
  title = 	 {Distributed Upload and Active Labeling for Resource-Constrained Fleet Learning},
  author =       {Akcin, Oguzhan and Goel, Harsh and Zhao, Ruihan and Chinchali, Sandeep P.},
  booktitle = 	 {Proceedings of The 9th Conference on Robot Learning},
  pages = 	 {3463--3482},
  year = 	 {2025},
  editor = 	 {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume = 	 {305},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {27--30 Sep},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v305/main/assets/akcin25a/akcin25a.pdf},
  url = 	 {https://proceedings.mlr.press/v305/akcin25a.html},
  abstract = 	 {In multi-robot systems, fleets are often deployed to collect data that improves the performance of machine learning models for downstream perception and planning. However, real-world robotic deployments generate vast amounts of data across diverse conditions, while only a small portion can be transmitted or labeled due to limited bandwidth, constrained onboard storage, and high annotation costs. To address these challenges, we propose Distributed Upload and Active Labeling (DUAL), a decentralized, two-stage data collection framework for resource-constrained robotic fleets. In the first stage, each robot independently selects a subset of its local observations to upload under storage and communication constraints. In the second stage, the cloud selects a subset of uploaded data to label, subject to a global annotation budget. We evaluate DUAL on classification tasks spanning multiple sensing modalities, as well as on RoadNet—a real-world dataset we collected from vehicle-mounted cameras for time and weather classification. We further validate our approach in a physical experiment using a Franka Emika Panda robot arm, where it learns to move a red cube to a green bowl. Finally, we test DUAL on trajectory prediction using the nuScenes autonomous driving dataset to assess generalization to complex prediction tasks. Across all settings, DUAL consistently outperforms state-of-the-art baselines, achieving up to 31.1% gain in classification accuracy and a 13% improvement in real-world robotics task completion rates.}
}

Endnote

%0 Conference Paper
%T Distributed Upload and Active Labeling for Resource-Constrained Fleet Learning
%A Oguzhan Akcin
%A Harsh Goel
%A Ruihan Zhao
%A Sandeep P. Chinchali
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-akcin25a
%I PMLR
%P 3463--3482
%U https://proceedings.mlr.press/v305/akcin25a.html
%V 305
%X In multi-robot systems, fleets are often deployed to collect data that improves the performance of machine learning models for downstream perception and planning. However, real-world robotic deployments generate vast amounts of data across diverse conditions, while only a small portion can be transmitted or labeled due to limited bandwidth, constrained onboard storage, and high annotation costs. To address these challenges, we propose Distributed Upload and Active Labeling (DUAL), a decentralized, two-stage data collection framework for resource-constrained robotic fleets. In the first stage, each robot independently selects a subset of its local observations to upload under storage and communication constraints. In the second stage, the cloud selects a subset of uploaded data to label, subject to a global annotation budget. We evaluate DUAL on classification tasks spanning multiple sensing modalities, as well as on RoadNet—a real-world dataset we collected from vehicle-mounted cameras for time and weather classification. We further validate our approach in a physical experiment using a Franka Emika Panda robot arm, where it learns to move a red cube to a green bowl. Finally, we test DUAL on trajectory prediction using the nuScenes autonomous driving dataset to assess generalization to complex prediction tasks. Across all settings, DUAL consistently outperforms state-of-the-art baselines, achieving up to 31.1% gain in classification accuracy and a 13% improvement in real-world robotics task completion rates.

APA

Akcin, O., Goel, H., Zhao, R. & Chinchali, S.P. (2025). Distributed Upload and Active Labeling for Resource-Constrained Fleet Learning. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:3463-3482. Available from https://proceedings.mlr.press/v305/akcin25a.html.
