Generative Flow Networks as Entropy-Regularized RL

Daniil Tiapkin, Nikita Morozov, Alexey Naumov, Dmitry P Vetrov
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4213-4221, 2024.

Abstract

The recently proposed generative flow networks (GFlowNets) are a method of training a policy to sample compositional discrete objects with probabilities proportional to a given reward via a sequence of actions. GFlowNets exploit the sequential nature of the problem, drawing parallels with reinforcement learning (RL). Our work extends the connection between RL and GFlowNets to a general case. We demonstrate how the task of learning a generative flow network can be efficiently redefined as an entropy-regularized RL problem with a specific reward and regularizer structure. Furthermore, we illustrate the practical efficiency of this reformulation by applying standard soft RL algorithms to GFlowNet training across several probabilistic modeling tasks. Contrary to previously reported results, we show that entropic RL approaches can be competitive against established GFlowNet training methods. This perspective opens a direct path for integrating RL principles into the realm of generative flow networks.
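The abstract's central claim, that GFlowNet training can be cast as entropy-regularized (soft) RL, can be summarized schematically. The notation below (reward $R$, terminal sampling distribution, temperature $\lambda$) is an illustrative sketch, not the paper's exact construction, which depends on a specific reward and regularizer structure:

```latex
% GFlowNet training target: a sampling policy whose terminal
% distribution over discrete objects x is proportional to the reward
P^{\top}(x) = \frac{R(x)}{\sum_{x' \in \mathcal{X}} R(x')}, \qquad x \in \mathcal{X}.

% Generic entropy-regularized (soft) RL objective with temperature \lambda:
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t}
  \Big( r(s_t, a_t) + \lambda \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right].
```

Under this framing, choosing the per-step reward $r$ and regularizer appropriately makes the soft-optimal policy sample terminal objects with probability proportional to $R(x)$, which is why standard soft RL algorithms become applicable to GFlowNet training.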

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-tiapkin24a,
  title     = {Generative Flow Networks as Entropy-Regularized {RL}},
  author    = {Tiapkin, Daniil and Morozov, Nikita and Naumov, Alexey and P Vetrov, Dmitry},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4213--4221},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/tiapkin24a/tiapkin24a.pdf},
  url       = {https://proceedings.mlr.press/v238/tiapkin24a.html}
}
Endnote
%0 Conference Paper
%T Generative Flow Networks as Entropy-Regularized RL
%A Daniil Tiapkin
%A Nikita Morozov
%A Alexey Naumov
%A Dmitry P Vetrov
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-tiapkin24a
%I PMLR
%P 4213--4221
%U https://proceedings.mlr.press/v238/tiapkin24a.html
%V 238
APA
Tiapkin, D., Morozov, N., Naumov, A. & P Vetrov, D. (2024). Generative Flow Networks as Entropy-Regularized RL. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4213-4221. Available from https://proceedings.mlr.press/v238/tiapkin24a.html.