An Information Theoretic Approach to Interaction-Grounded Learning

Xiaoyan Hu; Farzan Farnia; Ho-Fung Leung

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:19198-19215, 2024.

Abstract

Reinforcement learning (RL) problems where the learner attempts to infer an unobserved reward from some feedback variables have been studied in several recent papers. The setting of Interaction-Grounded Learning (IGL) is an example of such feedback-based reinforcement learning tasks where the learner optimizes the return by inferring latent binary rewards from the interaction with the environment. In the IGL setting, a relevant assumption used in the RL literature is that the feedback variable $Y$ is conditionally independent of the context-action $(X,A)$ given the latent reward $R$. In this work, we propose Variational Information-based IGL (VI-IGL) as an information-theoretic method to enforce the conditional independence assumption in the IGL-based RL problem. The VI-IGL framework learns a reward decoder using an information-based objective based on the conditional mutual information (MI) between the context-action $(X,A)$ and the feedback variable $Y$ observed from the environment. To estimate and optimize the information-based terms for the continuous random variables in the RL problem, VI-IGL leverages the variational representation of mutual information and results in a min-max optimization problem. Theoretical analysis shows that the optimization problem can be sample-efficiently solved. Furthermore, we extend the VI-IGL framework to general $f$-Information measures in the information theory literature, leading to the generalized $f$-VI-IGL framework to address the RL problem under the IGL condition. Finally, the empirical results on several reinforcement learning settings indicate an improved performance in comparison to the previous IGL-based RL algorithm.
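
As background for the terms used in the abstract, the following standard identities may help; they are textbook facts, not the paper's exact objective, which the abstract does not state. The generic variables $U$, $V$ and the critic function $T$ are notation assumed here for illustration. The conditional independence assumption is equivalent to a vanishing conditional mutual information:

$$
Y \perp (X,A) \mid R \iff I\big(Y;(X,A)\mid R\big) = 0 .
$$

One common variational representation of mutual information is the Donsker-Varadhan form, a supremum over critic functions $T$ for which the expectations are finite:

$$
I(U;V) = \sup_{T}\; \mathbb{E}_{P_{UV}}\big[T(U,V)\big] - \log \mathbb{E}_{P_U \otimes P_V}\big[e^{T(U,V)}\big] .
$$

For a convex $f$ with $f(1)=0$ and convex conjugate $f^*$, the $f$-information is the $f$-divergence between the joint distribution and the product of marginals, and it admits the analogous representation:

$$
I_f(U;V) = D_f\big(P_{UV}\,\|\,P_U \otimes P_V\big) = \sup_{T}\; \mathbb{E}_{P_{UV}}\big[T(U,V)\big] - \mathbb{E}_{P_U \otimes P_V}\big[f^*\big(T(U,V)\big)\big] .
$$

Under these representations, minimizing an information term over a reward decoder while the term itself is a supremum over $T$ plausibly gives rise to the kind of min-max problem the abstract mentions; which representation VI-IGL actually uses should be checked against the paper.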

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-hu24e,
  title = 	 {An Information Theoretic Approach to Interaction-Grounded Learning},
  author =       {Hu, Xiaoyan and Farnia, Farzan and Leung, Ho-Fung},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {19198--19215},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/hu24e/hu24e.pdf},
  url = 	 {https://proceedings.mlr.press/v235/hu24e.html},
  abstract = 	 {Reinforcement learning (RL) problems where the learner attempts to infer an unobserved reward from some feedback variables have been studied in several recent papers. The setting of Interaction-Grounded Learning (IGL) is an example of such feedback-based reinforcement learning tasks where the learner optimizes the return by inferring latent binary rewards from the interaction with the environment. In the IGL setting, a relevant assumption used in the RL literature is that the feedback variable $Y$ is conditionally independent of the context-action $(X,A)$ given the latent reward $R$. In this work, we propose Variational Information-based IGL (VI-IGL) as an information-theoretic method to enforce the conditional independence assumption in the IGL-based RL problem. The VI-IGL framework learns a reward decoder using an information-based objective based on the conditional mutual information (MI) between the context-action $(X,A)$ and the feedback variable $Y$ observed from the environment. To estimate and optimize the information-based terms for the continuous random variables in the RL problem, VI-IGL leverages the variational representation of mutual information and results in a min-max optimization problem. Theoretical analysis shows that the optimization problem can be sample-efficiently solved. Furthermore, we extend the VI-IGL framework to general $f$-Information measures in the information theory literature, leading to the generalized $f$-VI-IGL framework to address the RL problem under the IGL condition. Finally, the empirical results on several reinforcement learning settings indicate an improved performance in comparison to the previous IGL-based RL algorithm.}
}

Endnote

%0 Conference Paper
%T An Information Theoretic Approach to Interaction-Grounded Learning
%A Xiaoyan Hu
%A Farzan Farnia
%A Ho-Fung Leung
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-hu24e
%I PMLR
%P 19198--19215
%U https://proceedings.mlr.press/v235/hu24e.html
%V 235
%X Reinforcement learning (RL) problems where the learner attempts to infer an unobserved reward from some feedback variables have been studied in several recent papers. The setting of Interaction-Grounded Learning (IGL) is an example of such feedback-based reinforcement learning tasks where the learner optimizes the return by inferring latent binary rewards from the interaction with the environment. In the IGL setting, a relevant assumption used in the RL literature is that the feedback variable $Y$ is conditionally independent of the context-action $(X,A)$ given the latent reward $R$. In this work, we propose Variational Information-based IGL (VI-IGL) as an information-theoretic method to enforce the conditional independence assumption in the IGL-based RL problem. The VI-IGL framework learns a reward decoder using an information-based objective based on the conditional mutual information (MI) between the context-action $(X,A)$ and the feedback variable $Y$ observed from the environment. To estimate and optimize the information-based terms for the continuous random variables in the RL problem, VI-IGL leverages the variational representation of mutual information and results in a min-max optimization problem. Theoretical analysis shows that the optimization problem can be sample-efficiently solved. Furthermore, we extend the VI-IGL framework to general $f$-Information measures in the information theory literature, leading to the generalized $f$-VI-IGL framework to address the RL problem under the IGL condition. Finally, the empirical results on several reinforcement learning settings indicate an improved performance in comparison to the previous IGL-based RL algorithm.

APA


Hu, X., Farnia, F. & Leung, H. (2024). An Information Theoretic Approach to Interaction-Grounded Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:19198-19215. Available from https://proceedings.mlr.press/v235/hu24e.html.
