Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching

Cheng Lu, Kaiwen Zheng, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:14429-14460, 2022.

Abstract

Score-based generative models have excellent performance in terms of generation quality and likelihood. They model the data distribution by matching a parameterized score network with first-order data score functions. The score network can be used to define an ODE (“score-based diffusion ODE”) for exact likelihood evaluation. However, the relationship between the likelihood of the ODE and the score matching objective is unclear. In this work, we prove that matching the first-order score is not sufficient to maximize the likelihood of the ODE, by showing a gap between the maximum likelihood and score matching objectives. To fill this gap, we show that the negative likelihood of the ODE can be bounded by controlling the first, second, and third-order score matching errors; and we further present a novel high-order denoising score matching method to enable maximum likelihood training of score-based diffusion ODEs. Our algorithm guarantees that the higher-order matching error is bounded by the training error and the lower-order errors. We empirically observe that with high-order score matching, score-based diffusion ODEs achieve better likelihood on both synthetic data and CIFAR-10, while retaining high generation quality.
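
For context, the following is a brief sketch in standard score-based generative modeling notation (background material, not reproduced from this abstract) of the objects the abstract refers to: the score-based diffusion ODE defined by a learned score network s_theta, the exact log-likelihood it admits, and the first-order denoising score matching objective whose higher-order analogues the paper introduces.

Given a forward diffusion $\mathrm{d}x_t = f(x_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w_t$ with marginals $q_t$, the score-based diffusion ODE replaces the true score $\nabla_x \log q_t(x_t)$ with the learned network $s_\theta(x_t, t)$:
\[
\frac{\mathrm{d}x_t}{\mathrm{d}t} = f(x_t, t) - \tfrac{1}{2}\, g(t)^2\, s_\theta(x_t, t).
\]
Its log-likelihood is exact via the instantaneous change-of-variables formula,
\[
\log p_0^{\mathrm{ODE}}(x_0) = \log p_T(x_T) + \int_0^T \nabla_x \cdot \Big( f(x_t, t) - \tfrac{1}{2}\, g(t)^2\, s_\theta(x_t, t) \Big)\, \mathrm{d}t,
\]
whereas the usual (first-order) denoising score matching objective only controls
\[
\mathbb{E}_{t,\, x_0,\, x_t \mid x_0}\Big[ \lambda(t)\, \big\| s_\theta(x_t, t) - \nabla_{x_t} \log q_t(x_t \mid x_0) \big\|_2^2 \Big].
\]
The paper's point is that this first-order error alone does not bound the ODE's negative log-likelihood; the bound additionally requires controlling second- and third-order score matching errors.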

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-lu22f,
  title     = {Maximum Likelihood Training for Score-based Diffusion {ODE}s by High Order Denoising Score Matching},
  author    = {Lu, Cheng and Zheng, Kaiwen and Bao, Fan and Chen, Jianfei and Li, Chongxuan and Zhu, Jun},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {14429--14460},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/lu22f/lu22f.pdf},
  url       = {https://proceedings.mlr.press/v162/lu22f.html}
}
Endnote
%0 Conference Paper
%T Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
%A Cheng Lu
%A Kaiwen Zheng
%A Fan Bao
%A Jianfei Chen
%A Chongxuan Li
%A Jun Zhu
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-lu22f
%I PMLR
%P 14429--14460
%U https://proceedings.mlr.press/v162/lu22f.html
%V 162
APA
Lu, C., Zheng, K., Bao, F., Chen, J., Li, C., & Zhu, J. (2022). Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:14429-14460. Available from https://proceedings.mlr.press/v162/lu22f.html.
