The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies

Laura Ferrarotti, Valentina Breschi, Alberto Bemporad
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:87-98, 2021.

Abstract

Mass-produced self-regulating systems are constructed and calibrated to be nominally the same and have similar goals. When several of them can share information with the cloud, their similarities can be exploited to improve the design of the individual control policies. In this multi-agent framework, we aim at exploiting these similarities and the connection to the cloud to solve a sharing-based control policy optimization, so as to leverage information provided by “trustworthy” agents. In this paper, we propose to combine the optimal policy search method introduced in (Ferrarotti and Bemporad, 2019) with the Alternating Direction Method of Multipliers, relying on weighted surrogates of the experiences of each device, shared with the cloud. A preliminary example shows the effectiveness of the proposed sharing-based method, which yields improved performance relative to that attained when neglecting the similarities among devices and when enforcing consensus among their policies.
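To illustrate the kind of cloud-coordinated consensus update the abstract alludes to, here is a minimal sketch of generic consensus ADMM on scalar quadratic local objectives, standing in for each device's local policy-fitting cost. This is an assumption-laden toy (the curvatures `a`, targets `b`, and penalty `rho` are made up), not the paper's actual algorithm with weighted experience surrogates and trust weighting:

```python
import numpy as np

# Illustrative sketch only: consensus ADMM on local objectives
# f_i(x) = 0.5 * a_i * (x - b_i)**2, one per "device". The cloud
# holds the shared variable z; devices hold local copies x_i.
a = np.array([1.0, 2.0, 4.0])   # hypothetical local curvatures
b = np.array([0.0, 1.0, 3.0])   # hypothetical local minimizers
rho = 1.0                        # ADMM penalty parameter

x = np.zeros(3)   # local copies (one per agent)
z = 0.0           # cloud (global) variable
u = np.zeros(3)   # scaled dual variables

for _ in range(200):
    # Local step: each agent minimizes f_i(x_i) + (rho/2)(x_i - z + u_i)^2
    x = (a * b + rho * (z - u)) / (a + rho)
    # Cloud step: averaging (x_i + u_i) drives the copies toward consensus
    z = np.mean(x + u)
    # Dual update
    u = u + x - z

# Minimizer of sum_i f_i is the curvature-weighted mean of the b_i
z_star = np.sum(a * b) / np.sum(a)
print(z, z_star)
```

In this toy, the cloud-side averaging step plays the role of aggregating information shared by the agents; the paper instead replaces plain averaging with a weighting that favors "trustworthy" agents.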

Cite this Paper


BibTeX
@InProceedings{pmlr-v144-ferrarotti21a,
  title     = {The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies},
  author    = {Ferrarotti, Laura and Breschi, Valentina and Bemporad, Alberto},
  booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
  pages     = {87--98},
  year      = {2021},
  editor    = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
  volume    = {144},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--08 June},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v144/ferrarotti21a/ferrarotti21a.pdf},
  url       = {https://proceedings.mlr.press/v144/ferrarotti21a.html},
  abstract  = {Mass-produced self-regulating systems are constructed and calibrated to be nominally the same and have similar goals. When several of them can share information with the cloud, their similarities can be exploited to improve the design of individual control policies. In this multi-agent framework, we aim at exploiting these similarities and the connection to the cloud to solve a sharing-based control policy optimization, so as to leverage on information provided by “trustworthy” agents. In this paper, we propose to combine the optimal policy search method introduced in (Ferrarotti and Bemporad, 2019) with the Alternating Direction Method of Multipliers, by relying on weighted surrogate of the experiences of each device, shared with the cloud. A preliminary example shows the effectiveness of the proposed sharing-based method, that results in improved performance with respect to the ones attained when neglecting the similarities among devices and when enforcing consensus among their policies.}
}
Endnote
%0 Conference Paper
%T The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies
%A Laura Ferrarotti
%A Valentina Breschi
%A Alberto Bemporad
%B Proceedings of the 3rd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2021
%E Ali Jadbabaie
%E John Lygeros
%E George J. Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire J. Tomlin
%E Melanie N. Zeilinger
%F pmlr-v144-ferrarotti21a
%I PMLR
%P 87--98
%U https://proceedings.mlr.press/v144/ferrarotti21a.html
%V 144
%X Mass-produced self-regulating systems are constructed and calibrated to be nominally the same and have similar goals. When several of them can share information with the cloud, their similarities can be exploited to improve the design of individual control policies. In this multi-agent framework, we aim at exploiting these similarities and the connection to the cloud to solve a sharing-based control policy optimization, so as to leverage on information provided by “trustworthy” agents. In this paper, we propose to combine the optimal policy search method introduced in (Ferrarotti and Bemporad, 2019) with the Alternating Direction Method of Multipliers, by relying on weighted surrogate of the experiences of each device, shared with the cloud. A preliminary example shows the effectiveness of the proposed sharing-based method, that results in improved performance with respect to the ones attained when neglecting the similarities among devices and when enforcing consensus among their policies.
APA
Ferrarotti, L., Breschi, V. & Bemporad, A. (2021). The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies. Proceedings of the 3rd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 144:87-98. Available from https://proceedings.mlr.press/v144/ferrarotti21a.html.