Q-function Decomposition with Intervention Semantics for Factored Action Spaces

Junkyu Lee, Tian Gao, Elliot Nelson, Miao Liu, Debarun Bhattacharjya, Songtao Lu
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1027-1035, 2025.

Abstract

Many practical reinforcement learning environments have a discrete factored action space that induces a large combinatorial set of actions, thereby posing significant challenges. Existing approaches leverage the regular structure of the action space and resort to a linear decomposition of Q-functions, which avoids enumerating all combinations of factored actions. In this paper, we consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions using causal effect estimation from the no unobserved confounder setting in causal statistics. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms. The proposed approach is shown to improve sample complexity in a model-based reinforcement learning setting. We demonstrate improvements in sample efficiency compared to state-of-the-art baselines in online continuous control environments and a real-world offline sepsis treatment environment.
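To make the setup concrete, below is a minimal PyTorch sketch (not the authors' implementation; all class, function, and parameter names are illustrative) of the linear Q-function decomposition over a factored action space that the abstract refers to as the existing approach. Each action factor gets its own small Q-head, and the joint Q-value is approximated by the sum of per-factor values, so neither learning nor greedy action selection enumerates the combinatorial joint action set. The paper's action decomposed reinforcement learning scheme builds on this structure by defining the per-factor components as projected Q-functions with intervention semantics, which is not shown here.

    # Minimal sketch: Q(s, a) ~= sum_i Q_i(s, a_i) for a = (a_1, ..., a_n).
    # Illustrative only; not the authors' code.
    import torch
    import torch.nn as nn

    class FactoredQNetwork(nn.Module):
        """One small Q-head per action factor instead of one head with
        prod_i |A_i| outputs, so output size grows linearly in the number
        of factors rather than combinatorially."""

        def __init__(self, state_dim: int, factor_sizes: list[int], hidden: int = 128):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
            # One Q_i head per action factor, over that factor's choices only.
            self.heads = nn.ModuleList(nn.Linear(hidden, k) for k in factor_sizes)

        def forward(self, state: torch.Tensor) -> list[torch.Tensor]:
            h = self.encoder(state)
            return [head(h) for head in self.heads]  # per-factor Q_i(s, .)

        def q_value(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
            # action: LongTensor of shape (batch, n_factors), one index per factor.
            qs = self.forward(state)
            return sum(q.gather(1, action[:, i:i + 1]).squeeze(1) for i, q in enumerate(qs))

        def greedy_action(self, state: torch.Tensor) -> torch.Tensor:
            # Under the additive form, the joint argmax decomposes factor-wise,
            # so no enumeration of the combinatorial action set is needed.
            qs = self.forward(state)
            return torch.stack([q.argmax(dim=1) for q in qs], dim=1)

    # Example: 5 binary treatment decisions -> 2^5 = 32 joint actions,
    # but only 5 * 2 = 10 Q-head outputs are learned.
    net = FactoredQNetwork(state_dim=8, factor_sizes=[2, 2, 2, 2, 2])
    s = torch.randn(4, 8)
    a = net.greedy_action(s)
    print(net.q_value(s, a))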

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-lee25c,
  title     = {Q-function Decomposition with Intervention Semantics for Factored Action Spaces},
  author    = {Lee, Junkyu and Gao, Tian and Nelson, Elliot and Liu, Miao and Bhattacharjya, Debarun and Lu, Songtao},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {1027--1035},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/lee25c/lee25c.pdf},
  url       = {https://proceedings.mlr.press/v258/lee25c.html},
  abstract  = {Many practical reinforcement learning environments have a discrete factored action space that induces a large combinatorial set of actions, thereby posing significant challenges. Existing approaches leverage the regular structure of the action space and resort to a linear decomposition of Q-functions, which avoids enumerating all combinations of factored actions. In this paper, we consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions using causal effect estimation from the no unobserved confounder setting in causal statistics. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms. The proposed approach is shown to improve sample complexity in a model-based reinforcement learning setting. We demonstrate improvements in sample efficiency compared to state-of-the-art baselines in online continuous control environments and a real-world offline sepsis treatment environment.}
}
Endnote
%0 Conference Paper
%T Q-function Decomposition with Intervention Semantics for Factored Action Spaces
%A Junkyu Lee
%A Tian Gao
%A Elliot Nelson
%A Miao Liu
%A Debarun Bhattacharjya
%A Songtao Lu
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-lee25c
%I PMLR
%P 1027--1035
%U https://proceedings.mlr.press/v258/lee25c.html
%V 258
%X Many practical reinforcement learning environments have a discrete factored action space that induces a large combinatorial set of actions, thereby posing significant challenges. Existing approaches leverage the regular structure of the action space and resort to a linear decomposition of Q-functions, which avoids enumerating all combinations of factored actions. In this paper, we consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions using causal effect estimation from the no unobserved confounder setting in causal statistics. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms. The proposed approach is shown to improve sample complexity in a model-based reinforcement learning setting. We demonstrate improvements in sample efficiency compared to state-of-the-art baselines in online continuous control environments and a real-world offline sepsis treatment environment.
APA
Lee, J., Gao, T., Nelson, E., Liu, M., Bhattacharjya, D. & Lu, S. (2025). Q-function Decomposition with Intervention Semantics for Factored Action Spaces. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1027-1035. Available from https://proceedings.mlr.press/v258/lee25c.html.