Towards Understanding and Improving GFlowNet Training

Max W Shen; Emmanuel Bengio; Ehsan Hajiramezanali; Andreas Loukas; Kyunghyun Cho; Tommaso Biancalani

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:30956-30975, 2023.

Abstract

Generative flow networks (GFlowNets) are a family of algorithms that learn a generative policy to sample discrete objects $x$ with non-negative reward $R(x)$. Learning objectives guarantee the GFlowNet samples $x$ from the target distribution $p^*(x) \propto R(x)$ when loss is globally minimized over all states or trajectories, but it is unclear how well they perform with practical limits on training resources. We introduce an efficient evaluation strategy to compare the learned sampling distribution to the target reward distribution. As flows can be underdetermined given training data, we clarify the importance of learned flows to generalization and matching $p^*(x)$ in practice. We investigate how to learn better flows, and propose (i) prioritized replay training of high-reward $x$, (ii) relative edge flow policy parametrization, and (iii) a novel guided trajectory balance objective, and show how it can solve a substructure credit assignment problem. We substantially improve sample efficiency on biochemical design tasks.
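As a small illustration of the target distribution $p^*(x) \propto R(x)$ mentioned above, the following sketch normalizes rewards over a tiny, fully enumerable set of objects and measures how far an empirical sampling distribution is from it. This is not the evaluation strategy proposed in the paper; the toy rewards, the function names, and the choice of total variation distance are all assumptions made for illustration only.

```python
from collections import Counter


def target_distribution(rewards):
    """Normalize non-negative rewards R(x) into p*(x) = R(x) / sum of all R."""
    total = sum(rewards.values())
    if total <= 0:
        raise ValueError("at least one reward must be positive")
    return {x: r / total for x, r in rewards.items()}


def empirical_distribution(samples, support):
    """Relative frequency of each object in `support` among the samples."""
    counts = Counter(samples)
    n = len(samples)
    return {x: counts[x] / n for x in support}


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)


# Hypothetical toy rewards over three objects (not from the paper).
rewards = {"a": 1.0, "b": 3.0, "c": 0.0}
p_star = target_distribution(rewards)

# Hypothetical samples standing in for draws from a learned sampler.
samples = ["a", "b", "b", "b"]
p_hat = empirical_distribution(samples, rewards)

tv = total_variation(p_star, p_hat)
```

A distance of zero would mean the sampler matches $p^*(x)$ exactly on this toy set. In realistic design spaces the set of objects is far too large to enumerate, which is one reason an efficient evaluation strategy is needed.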

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-shen23a,
  title = 	 {Towards Understanding and Improving {GF}low{N}et Training},
  author =       {Shen, Max W and Bengio, Emmanuel and Hajiramezanali, Ehsan and Loukas, Andreas and Cho, Kyunghyun and Biancalani, Tommaso},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {30956--30975},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/shen23a/shen23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/shen23a.html},
  abstract = 	 {Generative flow networks (GFlowNets) are a family of algorithms that learn a generative policy to sample discrete objects $x$ with non-negative reward $R(x)$. Learning objectives guarantee the GFlowNet samples $x$ from the target distribution $p^*(x) \propto R(x)$ when loss is globally minimized over all states or trajectories, but it is unclear how well they perform with practical limits on training resources. We introduce an efficient evaluation strategy to compare the learned sampling distribution to the target reward distribution. As flows can be underdetermined given training data, we clarify the importance of learned flows to generalization and matching $p^*(x)$ in practice. We investigate how to learn better flows, and propose (i) prioritized replay training of high-reward $x$, (ii) relative edge flow policy parametrization, and (iii) a novel guided trajectory balance objective, and show how it can solve a substructure credit assignment problem. We substantially improve sample efficiency on biochemical design tasks.}
}

Endnote

%0 Conference Paper
%T Towards Understanding and Improving GFlowNet Training
%A Max W Shen
%A Emmanuel Bengio
%A Ehsan Hajiramezanali
%A Andreas Loukas
%A Kyunghyun Cho
%A Tommaso Biancalani
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-shen23a
%I PMLR
%P 30956--30975
%U https://proceedings.mlr.press/v202/shen23a.html
%V 202
%X Generative flow networks (GFlowNets) are a family of algorithms that learn a generative policy to sample discrete objects $x$ with non-negative reward $R(x)$. Learning objectives guarantee the GFlowNet samples $x$ from the target distribution $p^*(x) \propto R(x)$ when loss is globally minimized over all states or trajectories, but it is unclear how well they perform with practical limits on training resources. We introduce an efficient evaluation strategy to compare the learned sampling distribution to the target reward distribution. As flows can be underdetermined given training data, we clarify the importance of learned flows to generalization and matching $p^*(x)$ in practice. We investigate how to learn better flows, and propose (i) prioritized replay training of high-reward $x$, (ii) relative edge flow policy parametrization, and (iii) a novel guided trajectory balance objective, and show how it can solve a substructure credit assignment problem. We substantially improve sample efficiency on biochemical design tasks.

APA


Shen, M. W., Bengio, E., Hajiramezanali, E., Loukas, A., Cho, K., & Biancalani, T. (2023). Towards Understanding and Improving GFlowNet Training. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:30956-30975. Available from https://proceedings.mlr.press/v202/shen23a.html.
