Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control

Alireza Farahmandi, Brian C Reitz, Mark Debord, Douglas Philbrick, Katia Estabridis, Gary Hewer
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1455-1466, 2023.

Abstract

In this work, we present the hyperparameter optimization of an online, off-policy reinforcement learning algorithm based on a parallel search. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This addition enables automated optimization of the involved hyperparameters. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, sampling interval, and other hyperparameters involved. In addition, we demonstrate a correlation between this condition number and properties of the sum of sinusoids persistent excitation.
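The abstract's central idea — scoring hyperparameter candidates by the condition number of the least-squares data matrix — can be illustrated with a small sketch. This is not the paper's actual algorithm: the regressor basis, signal parameters, and the candidate scan below are illustrative assumptions, standing in for the quadratic value-function basis and the sum-of-sinusoids excitation the paper uses.

```python
import numpy as np

def sum_of_sinusoids(t, freqs, amps):
    """Sum-of-sinusoids probing signal (a common persistent-excitation choice)."""
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))

def data_matrix_condition(n_samples, dt, freqs, amps, n_features=6):
    """Build a toy least-squares data matrix from sampled excitation
    and return its condition number (ratio of largest to smallest
    singular value)."""
    t = np.arange(n_samples) * dt
    u = sum_of_sinusoids(t, freqs, amps)
    # Toy regressor: polynomial features of the excitation signal, a
    # stand-in for the basis functions of the iterative LS regression.
    Phi = np.column_stack([u ** k for k in range(1, n_features + 1)])
    return np.linalg.cond(Phi)

# Scan one hyperparameter (number of collected samples) and keep the
# candidate whose data matrix is best conditioned.
freqs, amps = [0.7, 1.3, 2.9], [1.0, 0.8, 0.5]
candidates = [20, 50, 100, 200]
conds = {n: data_matrix_condition(n, dt=0.05, freqs=freqs, amps=amps)
         for n in candidates}
best = min(conds, key=conds.get)
```

Because the condition number needs only the collected data matrix, this score is model-free, which is what makes it usable inside an automated (e.g. parallel) hyperparameter search over sample count, sampling interval, and excitation parameters.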

Cite this Paper

BibTeX
@InProceedings{pmlr-v211-farahmandi23a,
  title     = {Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control},
  author    = {Farahmandi, Alireza and Reitz, Brian C and Debord, Mark and Philbrick, Douglas and Estabridis, Katia and Hewer, Gary},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages     = {1455--1466},
  year      = {2023},
  editor    = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume    = {211},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--16 Jun},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v211/farahmandi23a/farahmandi23a.pdf},
  url       = {https://proceedings.mlr.press/v211/farahmandi23a.html},
  abstract  = {In this work, we present the hyperparameter optimization of an online, off-policy reinforcement learning algorithm based on a parallel search. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This addition enables automated optimization of the involved hyperparameters. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, sampling interval, and other hyperparameters involved. In addition, we demonstrate a correlation between this condition number and properties of the sum of sinusoids persistent excitation.}
}
Endnote
%0 Conference Paper
%T Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
%A Alireza Farahmandi
%A Brian C Reitz
%A Mark Debord
%A Douglas Philbrick
%A Katia Estabridis
%A Gary Hewer
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-farahmandi23a
%I PMLR
%P 1455--1466
%U https://proceedings.mlr.press/v211/farahmandi23a.html
%V 211
%X In this work, we present the hyperparameter optimization of an online, off-policy reinforcement learning algorithm based on a parallel search. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This addition enables automated optimization of the involved hyperparameters. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, sampling interval, and other hyperparameters involved. In addition, we demonstrate a correlation between this condition number and properties of the sum of sinusoids persistent excitation.
APA
Farahmandi, A., Reitz, B.C., Debord, M., Philbrick, D., Estabridis, K. & Hewer, G. (2023). Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:1455-1466. Available from https://proceedings.mlr.press/v211/farahmandi23a.html.