Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems

Tobias Enders, James Harrison, Marco Pavone, Maximilian Schiffer
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1284-1296, 2023.

Abstract

We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator’s otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
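The coordination step named in the abstract, weighted bipartite matching over per-agent outputs, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical `weights` matrix in which `weights[v][r]` is the score that vehicle agent `v` gives to serving request `r`, and `None` marks a request that the agent rejected. How the paper derives these scores from Soft Actor-Critic is not reproduced here. The function name `best_matching` is likewise an illustrative choice.

```python
def best_matching(weights, vehicle=0, used=frozenset()):
    """Exhaustive maximum-weight bipartite matching (sketch for tiny instances).

    weights[v][r]: hypothetical score for vehicle v serving request r,
                   or None if agent v rejected request r (assumption).
    Returns (total_weight, {vehicle_index: request_index}).
    Each vehicle serves at most one request and each request is served
    by at most one vehicle; vehicles may stay idle.
    """
    if vehicle == len(weights):
        return 0.0, {}

    # Option 1: this vehicle stays idle.
    best_total, best_assignment = best_matching(weights, vehicle + 1, used)

    # Option 2: this vehicle takes one still-available, non-rejected request.
    for request, weight in enumerate(weights[vehicle]):
        if weight is None or request in used:
            continue
        total, assignment = best_matching(weights, vehicle + 1, used | {request})
        total += weight
        if total > best_total:
            best_total = total
            best_assignment = {**assignment, vehicle: request}

    return best_total, best_assignment


# Two vehicles, three requests; both vehicles prefer request 0 individually.
weights = [
    [5.0, 3.0, None],  # vehicle 0 rejected request 2
    [4.5, None, 1.0],  # vehicle 1 rejected request 1
]
total, assignment = best_matching(weights)
print(total, assignment)
```

In this toy instance both vehicles would individually pick request 0, while the matching sends vehicle 0 to request 1 and vehicle 1 to request 0, and request 2 remains unserved. The exhaustive search is only meant to make the idea concrete; for realistic fleet sizes a polynomial-time assignment solver such as SciPy's `linear_sum_assignment` would be the usual choice.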

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-enders23a,
  title = {Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems},
  author = {Enders, Tobias and Harrison, James and Pavone, Marco and Schiffer, Maximilian},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = {1284--1296},
  year = {2023},
  editor = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = {211},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Jun},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v211/enders23a/enders23a.pdf},
  url = {https://proceedings.mlr.press/v211/enders23a.html},
  abstract = {We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator’s otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state of the art benchmarks with respect to performance, stability, and computational tractability.}
}
Endnote
%0 Conference Paper
%T Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
%A Tobias Enders
%A James Harrison
%A Marco Pavone
%A Maximilian Schiffer
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-enders23a
%I PMLR
%P 1284--1296
%U https://proceedings.mlr.press/v211/enders23a.html
%V 211
%X We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility on demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator’s otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state of the art benchmarks with respect to performance, stability, and computational tractability.
APA
Enders, T., Harrison, J., Pavone, M., & Schiffer, M. (2023). Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:1284-1296. Available from https://proceedings.mlr.press/v211/enders23a.html.
