The benefits of sharing: a cloud-aided performance-driven framework to learn optimal feedback policies
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:87-98, 2021.
Mass-produced self-regulating systems are constructed and calibrated to be nominally identical and have similar goals. When several of them can share information with the cloud, their similarities can be exploited to improve the design of individual control policies. In this multi-agent framework, we aim to exploit these similarities and the connection to the cloud to solve a sharing-based control policy optimization problem, so as to leverage the information provided by “trustworthy” agents. In this paper, we propose to combine the optimal policy search method introduced in (Ferrarotti and Bemporad, 2019) with the Alternating Direction Method of Multipliers (ADMM), relying on weighted surrogates of the experiences of each device, shared with the cloud. A preliminary example shows the effectiveness of the proposed sharing-based method, which yields better performance than that attained either when neglecting the similarities among devices or when enforcing consensus among their policies.
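To make the role of the Alternating Direction Method of Multipliers more concrete, the following is a minimal sketch of generic global-variable consensus ADMM on a toy problem. It is not the sharing-based formulation proposed in the paper. It corresponds to the hard-consensus setting that the abstract uses as a point of comparison. Everything here is an illustrative assumption: each device is reduced to a scalar policy parameter, its experience is replaced by a quadratic surrogate cost w_i/2 (x - a_i)^2, and the names `weighted_consensus_admm`, `local_optima`, `weights`, and `rho` are hypothetical.

```python
def weighted_consensus_admm(local_optima, weights, rho=1.0, iters=500):
    """Toy consensus ADMM (scaled form), NOT the paper's algorithm.

    Assumed problem: minimize sum_i w_i/2 * (x_i - a_i)^2  s.t.  x_i = z,
    where a_i is the parameter device i would pick on its own and
    w_i is a hypothetical trust weight assigned to that device.
    """
    n = len(local_optima)
    x = list(local_optima)      # local copies of the policy parameter
    u = [0.0] * n               # scaled dual variables, one per device
    z = sum(x) / n              # shared (cloud-side) parameter

    for _ in range(iters):
        # Device step: closed-form minimizer of the local quadratic cost
        # plus the augmented-Lagrangian penalty rho/2 * (x - z + u_i)^2.
        x = [(w * a + rho * (z - ui)) / (w + rho)
             for a, w, ui in zip(local_optima, weights, u)]
        # Cloud step: average the local copies shifted by their duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each device's disagreement with the cloud.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x


if __name__ == "__main__":
    # Third device is given a low trust weight (an assumption for illustration).
    z, x = weighted_consensus_admm([1.0, 2.0, 4.0], [1.0, 1.0, 0.2])
    print(z, x)
```

For this toy cost the shared parameter converges to the weighted average of the local optima, so a device with a small weight pulls the agreed value only slightly. In the sketch, the device step stands in for a local policy update and the cloud step for the aggregation performed in the cloud.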