Maximum entropy GFlowNets with soft Q-learning

Sobhan Mohammadpour, Emmanuel Bengio, Emma Frejinger, Pierre-Luc Bacon
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:2593-2601, 2024.

Abstract

Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a scalable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has largely been unclear and seemingly applicable only in specific cases. This paper addresses the connection by constructing an appropriate reward function, thereby establishing an exact relationship between GFNs and maximum entropy RL. This construction allows us to introduce maximum entropy GFNs, which achieve the maximum entropy attainable by GFNs without constraints on the state space, in contrast to GFNs with uniform backward policy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-mohammadpour24a,
  title     = {Maximum entropy {GFlowNets} with soft {Q}-learning},
  author    = {Mohammadpour, Sobhan and Bengio, Emmanuel and Frejinger, Emma and Bacon, Pierre-Luc},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {2593--2601},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/mohammadpour24a/mohammadpour24a.pdf},
  url       = {https://proceedings.mlr.press/v238/mohammadpour24a.html},
  abstract  = {Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a scalable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has largely been unclear and seemingly applicable only in specific cases. This paper addresses the connection by constructing an appropriate reward function, thereby establishing an exact relationship between GFNs and maximum entropy RL. This construction allows us to introduce maximum entropy GFNs, which achieve the maximum entropy attainable by GFNs without constraints on the state space, in contrast to GFNs with uniform backward policy.}
}
Endnote
%0 Conference Paper
%T Maximum entropy GFlowNets with soft Q-learning
%A Sobhan Mohammadpour
%A Emmanuel Bengio
%A Emma Frejinger
%A Pierre-Luc Bacon
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-mohammadpour24a
%I PMLR
%P 2593--2601
%U https://proceedings.mlr.press/v238/mohammadpour24a.html
%V 238
%X Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a scalable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has largely been unclear and seemingly applicable only in specific cases. This paper addresses the connection by constructing an appropriate reward function, thereby establishing an exact relationship between GFNs and maximum entropy RL. This construction allows us to introduce maximum entropy GFNs, which achieve the maximum entropy attainable by GFNs without constraints on the state space, in contrast to GFNs with uniform backward policy.
APA
Mohammadpour, S., Bengio, E., Frejinger, E., & Bacon, P.-L. (2024). Maximum entropy GFlowNets with soft Q-learning. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:2593-2601. Available from https://proceedings.mlr.press/v238/mohammadpour24a.html.