A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning

Abi Komanduru, Jean Honorio
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5676-5685, 2021.

Abstract

Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano’s inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.
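As a minimal sketch of how Fano's inequality is typically combined with such an ensemble (the notation $M$, $N$, and $P_i$ is introduced here for illustration and is not taken from the paper): suppose an estimator must identify which of $M$ $\beta$-strict separable IRL problems generated $N$ observed trajectories, where $P_i$ denotes the trajectory distribution of problem $i$. For independent trajectories, Fano's inequality gives
$$ P_{\mathrm{error}} \;\geq\; 1 - \frac{N \max_{i \neq j} \mathrm{KL}(P_i \,\|\, P_j) + \log 2}{\log M}, $$
so keeping the error probability bounded away from zero forces $N \gtrsim \log M / \max_{i \neq j} \mathrm{KL}(P_i \,\|\, P_j)$. The spherical-code construction in the paper controls both the ensemble size $M$ and the pairwise KL divergences so that this ratio scales as $n \log n$ in the number of states $n$.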

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-komanduru21a,
  title     = {A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning},
  author    = {Komanduru, Abi and Honorio, Jean},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {5676--5685},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/komanduru21a/komanduru21a.pdf},
  url       = {https://proceedings.mlr.press/v139/komanduru21a.html},
  abstract  = {Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano's inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.}
}
Endnote
%0 Conference Paper
%T A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning
%A Abi Komanduru
%A Jean Honorio
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-komanduru21a
%I PMLR
%P 5676--5685
%U https://proceedings.mlr.press/v139/komanduru21a.html
%V 139
%X Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano's inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.
APA
Komanduru, A. & Honorio, J. (2021). A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:5676-5685. Available from https://proceedings.mlr.press/v139/komanduru21a.html.
