Pointwise-in-time diagnostics for reinforcement learning during training and runtime

Noel Brindise, Andres Posada Moreno, Cedric Langbort, Sebastian Trimpe
Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:694-706, 2024.

Abstract

Explainable AI Planning (XAIP), a subfield of xAI, offers a variety of methods to interpret the behavior of autonomous systems. A recent “pointwise-in-time” explanation method, called Rule Status Assessment (RSA), characterizes an agent’s behavior at individual time steps in a trajectory using linear temporal logic (LTL) rules. In this work, RSA is applied for the first time in a reinforcement learning (RL) context. We first demonstrate RSA diagnostics as a substantial supplement to the basic RL reward curve, tracking whether and when specified subtasks are accomplished. We then introduce a novel “Interactive RSA” which provides the user with detailed diagnostic information automatically at any desired point in a trajectory. We apply RSA to an advanced agent at runtime and show that RSA and its novel interactive variant constitute a promising step towards explainable RL.

Cite this Paper


BibTeX
@InProceedings{pmlr-v242-brindise24a,
  title     = {Pointwise-in-time diagnostics for reinforcement learning during training and runtime},
  author    = {Brindise, Noel and Moreno, Andres Posada and Langbort, Cedric and Trimpe, Sebastian},
  booktitle = {Proceedings of the 6th Annual Learning for Dynamics \& Control Conference},
  pages     = {694--706},
  year      = {2024},
  editor    = {Abate, Alessandro and Cannon, Mark and Margellos, Kostas and Papachristodoulou, Antonis},
  volume    = {242},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--17 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v242/brindise24a/brindise24a.pdf},
  url       = {https://proceedings.mlr.press/v242/brindise24a.html},
  abstract  = {Explainable AI Planning (XAIP), a subfield of xAI, offers a variety of methods to interpret the behavior of autonomous systems. A recent “pointwise-in-time” explanation method, called Rule Status Assessment (RSA), characterizes an agent’s behavior at individual time steps in a trajectory using linear temporal logic (LTL) rules. In this work, RSA is applied for the first time in a reinforcement learning (RL) context. We first demonstrate RSA diagnostics as a substantial supplement to the basic RL reward curve, tracking whether and when specified subtasks are accomplished. We then introduce a novel “Interactive RSA” which provides the user with detailed diagnostic information automatically at any desired point in a trajectory. We apply RSA to an advanced agent at runtime and show that RSA and its novel interactive variant constitute a promising step towards explainable RL.}
}
Endnote
%0 Conference Paper
%T Pointwise-in-time diagnostics for reinforcement learning during training and runtime
%A Noel Brindise
%A Andres Posada Moreno
%A Cedric Langbort
%A Sebastian Trimpe
%B Proceedings of the 6th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Alessandro Abate
%E Mark Cannon
%E Kostas Margellos
%E Antonis Papachristodoulou
%F pmlr-v242-brindise24a
%I PMLR
%P 694--706
%U https://proceedings.mlr.press/v242/brindise24a.html
%V 242
%X Explainable AI Planning (XAIP), a subfield of xAI, offers a variety of methods to interpret the behavior of autonomous systems. A recent “pointwise-in-time” explanation method, called Rule Status Assessment (RSA), characterizes an agent’s behavior at individual time steps in a trajectory using linear temporal logic (LTL) rules. In this work, RSA is applied for the first time in a reinforcement learning (RL) context. We first demonstrate RSA diagnostics as a substantial supplement to the basic RL reward curve, tracking whether and when specified subtasks are accomplished. We then introduce a novel “Interactive RSA” which provides the user with detailed diagnostic information automatically at any desired point in a trajectory. We apply RSA to an advanced agent at runtime and show that RSA and its novel interactive variant constitute a promising step towards explainable RL.
APA
Brindise, N., Moreno, A. P., Langbort, C. & Trimpe, S. (2024). Pointwise-in-time diagnostics for reinforcement learning during training and runtime. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 242:694-706. Available from https://proceedings.mlr.press/v242/brindise24a.html.
