Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1455-1466, 2023.
Abstract
In this work, we present a parallel-search-based hyperparameter optimization of an online, off-policy reinforcement learning algorithm. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This measure enables automated optimization of the hyperparameters involved. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, the sampling interval, and other hyperparameters. In addition, we demonstrate a correlation between this condition number and properties of the sum-of-sinusoids persistent excitation signal.
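The idea of scoring hyperparameters by the condition number of a least-squares data matrix can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the toy two-channel signal, the quadratic regressors, the frequencies, and the helper names `excitation`, `data_matrix`, and `search` are all assumptions made for illustration. Only the general recipe comes from the abstract: collect samples under a sum-of-sinusoids excitation, form the regression matrix, and prefer the number of samples and sampling interval that give the smallest condition number.

```python
import itertools
import numpy as np

def excitation(t, freqs):
    """Sum-of-sinusoids probing signal evaluated at the times in t."""
    return np.sin(np.outer(t, freqs)).sum(axis=1)

def data_matrix(n_samples, dt, freqs):
    """Hypothetical regressor matrix built from a toy two-channel signal.

    The quadratic features stand in for the regressors of a least-squares
    step; they are an assumption, not the paper's actual data matrix.
    """
    t = np.arange(n_samples) * dt
    x1 = excitation(t, freqs)
    x2 = excitation(t + 0.5, freqs)  # time-shifted copy as a second channel
    return np.column_stack([x1**2, x1 * x2, x2**2])

def search(sample_counts, intervals, freqs):
    """Score every (n_samples, dt) pair by the 2-norm condition number."""
    results = {}
    for n, dt in itertools.product(sample_counts, intervals):
        results[(n, dt)] = np.linalg.cond(data_matrix(n, dt, freqs))
    best = min(results, key=results.get)  # smallest condition number wins
    return best, results

freqs = np.array([1.0, 2.3, 3.7])  # assumed excitation frequencies (rad/s)
best, results = search([20, 100, 500], [0.01, 0.05, 0.2], freqs)
print("best (n_samples, dt):", best)
print("condition number:", results[best])
```

Each candidate is scored independently of the others, so the loop in `search` could be distributed across workers, which is one plausible reading of the parallel search mentioned in the abstract. The score needs only collected data and no system model, which is what makes it usable in a model-free setting.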