Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization

Chenbei Lu, Laixi Shi, Zaiwei Chen, Chenye Wu, Adam Wierman
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:40614-40664, 2025.

Abstract

Factored Markov Decision Processes (FMDPs) offer a promising framework for overcoming the curse of dimensionality in reinforcement learning (RL) by decomposing high-dimensional MDPs into smaller and independently evolving components. Despite their potential, existing studies on FMDPs face three key limitations: reliance on perfectly factorizable models, suboptimal sample complexity guarantees for model-based algorithms, and the absence of model-free algorithms. To address these challenges, we introduce approximate factorization, which extends FMDPs to handle imperfectly factored models. Moreover, we develop a model-based algorithm and a model-free algorithm (in the form of variance-reduced Q-learning), both achieving the first near-minimax sample complexity guarantees for FMDPs. A key novelty in the design of these two algorithms is the development of a graph-coloring-based optimal synchronous sampling strategy. Numerical simulations based on the wind farm storage control problem corroborate our theoretical findings.
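To give a rough sense of the graph-coloring idea mentioned above, below is a minimal illustrative sketch (not the paper's algorithm, and all names are hypothetical): components of a factored MDP are nodes of a conflict graph, two components conflict when their scopes (the state variables they depend on) overlap, and a greedy coloring groups non-conflicting components so each color class can be covered by one round of synchronous samples.

```python
# Illustrative sketch only: greedy coloring of a conflict graph over factored
# components. Components sharing a color have pairwise-disjoint scopes, so
# their local transitions could be queried from the same synchronous sample.
from itertools import combinations

def color_components(scopes):
    """scopes: list of sets; scopes[i] = state variables component i depends on."""
    n = len(scopes)
    # Build the conflict graph: edge (i, j) if the two scopes overlap.
    neighbors = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if scopes[i] & scopes[j]:
            neighbors[i].add(j)
            neighbors[j].add(i)
    # Greedy coloring, highest-degree components first: each component gets the
    # smallest color not used by any of its neighbors.
    colors = {}
    for i in sorted(range(n), key=lambda v: -len(neighbors[v])):
        used = {colors[j] for j in neighbors[i] if j in colors}
        colors[i] = next(c for c in range(n) if c not in used)
    # Group components by color: one sampling round per color class.
    groups = {}
    for comp, c in colors.items():
        groups.setdefault(c, []).append(comp)
    return groups

# Example: 4 components with overlapping scopes -> 2 sampling rounds instead of 4.
print(color_components([{0, 1}, {1, 2}, {3}, {4, 5}]))
```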

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-lu25j,
  title     = {Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization},
  author    = {Lu, Chenbei and Shi, Laixi and Chen, Zaiwei and Wu, Chenye and Wierman, Adam},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {40614--40664},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/lu25j/lu25j.pdf},
  url       = {https://proceedings.mlr.press/v267/lu25j.html},
  abstract  = {Factored Markov Decision Processes (FMDPs) offer a promising framework for overcoming the curse of dimensionality in reinforcement learning (RL) by decomposing high-dimensional MDPs into smaller and independently evolving components. Despite their potential, existing studies on FMDPs face three key limitations: reliance on perfectly factorizable models, suboptimal sample complexity guarantees for model-based algorithms, and the absence of model-free algorithms. To address these challenges, we introduce approximate factorization, which extends FMDPs to handle imperfectly factored models. Moreover, we develop a model-based algorithm and a model-free algorithm (in the form of variance-reduced Q-learning), both achieving the first near-minimax sample complexity guarantees for FMDPs. A key novelty in the design of these two algorithms is the development of a graph-coloring-based optimal synchronous sampling strategy. Numerical simulations based on the wind farm storage control problem corroborate our theoretical findings.}
}
Endnote
%0 Conference Paper
%T Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization
%A Chenbei Lu
%A Laixi Shi
%A Zaiwei Chen
%A Chenye Wu
%A Adam Wierman
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-lu25j
%I PMLR
%P 40614--40664
%U https://proceedings.mlr.press/v267/lu25j.html
%V 267
%X Factored Markov Decision Processes (FMDPs) offer a promising framework for overcoming the curse of dimensionality in reinforcement learning (RL) by decomposing high-dimensional MDPs into smaller and independently evolving components. Despite their potential, existing studies on FMDPs face three key limitations: reliance on perfectly factorizable models, suboptimal sample complexity guarantees for model-based algorithms, and the absence of model-free algorithms. To address these challenges, we introduce approximate factorization, which extends FMDPs to handle imperfectly factored models. Moreover, we develop a model-based algorithm and a model-free algorithm (in the form of variance-reduced Q-learning), both achieving the first near-minimax sample complexity guarantees for FMDPs. A key novelty in the design of these two algorithms is the development of a graph-coloring-based optimal synchronous sampling strategy. Numerical simulations based on the wind farm storage control problem corroborate our theoretical findings.
APA
Lu, C., Shi, L., Chen, Z., Wu, C. & Wierman, A. (2025). Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:40614-40664. Available from https://proceedings.mlr.press/v267/lu25j.html.
