Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation

Dongjun Kim, Seungjae Shin, Kyungwoo Song, Wanmo Kang, Il-Chul Moon
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:11201-11228, 2022.

Abstract

Recent advances in diffusion models have brought state-of-the-art performance on image generation tasks. However, empirical results from previous research on diffusion models suggest an inverse correlation between density estimation and sample generation performance. This paper provides extensive empirical evidence that this inverse correlation arises because density estimation is dominated by contributions from small diffusion times, whereas sample generation mainly depends on large diffusion times. Training a score network well across the entire diffusion time range is demanding, however, because the loss scale is highly imbalanced across diffusion times. For successful training, we therefore introduce Soft Truncation, a universally applicable training technique for diffusion models that softens the fixed, static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on the CIFAR-10, CelebA, CelebA-HQ $256\times 256$, and STL-10 datasets.
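
To make the idea concrete, the following minimal PyTorch-style sketch shows how Soft Truncation can be dropped into a denoising score matching training step. It is an illustration under stated assumptions rather than the authors' released implementation: the VP-SDE (DDPM-style) beta schedule, the inverse power-law prior over the truncation level, and the score_net(x, t) interface are all assumed for the example.

    import torch

    def sample_soft_truncation(eps_min=1e-5, T=1.0, k=1.0):
        """Assumed prior over the truncation level: p(eps) ∝ eps^(-k) on [eps_min, T].
        Redrawn once per mini-batch, so the smallest trained diffusion time varies."""
        u = torch.rand(())
        if k == 1.0:
            # Inverse-CDF sampling for p(eps) ∝ 1/eps (log-uniform on [eps_min, T]).
            return (torch.log(torch.tensor(eps_min)) * (1 - u)
                    + torch.log(torch.tensor(T)) * u).exp()
        a, b = eps_min ** (1 - k), T ** (1 - k)       # general power-law case
        return (a + u * (b - a)) ** (1 / (1 - k))

    def soft_truncated_dsm_loss(score_net, x0, eps_min=1e-5, T=1.0):
        """One denoising score matching step with Soft Truncation.
        Assumes a VP-SDE with marginal x_t = alpha(t) * x_0 + sigma(t) * z."""
        tau = sample_soft_truncation(eps_min, T).item()                    # per-batch truncation
        t = tau + (T - tau) * torch.rand(x0.shape[0], device=x0.device)    # t ~ U[tau, T]

        # VP-SDE marginal coefficients; the linear beta schedule is an assumption.
        beta_min, beta_max = 0.1, 20.0
        log_alpha = -0.25 * t ** 2 * (beta_max - beta_min) - 0.5 * t * beta_min
        alpha = log_alpha.exp()[:, None, None, None]
        sigma = (1.0 - (2.0 * log_alpha).exp()).sqrt()[:, None, None, None]

        z = torch.randn_like(x0)
        xt = alpha * x0 + sigma * z
        score = score_net(xt, t)                       # estimate of grad_x log p_t(x_t)

        # Standard sigma^2-weighted DSM objective: the target score is -z / sigma.
        return ((sigma * score + z) ** 2).mean()

The only change relative to an ordinary truncated objective is that the smallest diffusion time tau is redrawn every mini-batch instead of being fixed, so the small diffusion times that dominate the likelihood are still visited occasionally while most batches emphasize the larger times that matter for sample quality.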

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-kim22i,
  title     = {Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation},
  author    = {Kim, Dongjun and Shin, Seungjae and Song, Kyungwoo and Kang, Wanmo and Moon, Il-Chul},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {11201--11228},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/kim22i/kim22i.pdf},
  url       = {https://proceedings.mlr.press/v162/kim22i.html},
  abstract  = {Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research in diffusion models imply an inverse correlation between density estimation and sample generation performances. This paper investigates with sufficient empirical evidence that such inverse correlation happens because density estimation is significantly contributed by small diffusion time, whereas sample generation mainly depends on large diffusion time. However, training a score network well across the entire diffusion time is demanding because the loss scale is significantly imbalanced at each diffusion time. For successful training, therefore, we introduce Soft Truncation, a universally applicable training technique for diffusion models, that softens the fixed and static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on CIFAR-10, CelebA, CelebA-HQ $256\times 256$, and STL-10 datasets.}
}
Endnote
%0 Conference Paper
%T Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation
%A Dongjun Kim
%A Seungjae Shin
%A Kyungwoo Song
%A Wanmo Kang
%A Il-Chul Moon
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-kim22i
%I PMLR
%P 11201--11228
%U https://proceedings.mlr.press/v162/kim22i.html
%V 162
%X Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research in diffusion models imply an inverse correlation between density estimation and sample generation performances. This paper investigates with sufficient empirical evidence that such inverse correlation happens because density estimation is significantly contributed by small diffusion time, whereas sample generation mainly depends on large diffusion time. However, training a score network well across the entire diffusion time is demanding because the loss scale is significantly imbalanced at each diffusion time. For successful training, therefore, we introduce Soft Truncation, a universally applicable training technique for diffusion models, that softens the fixed and static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on CIFAR-10, CelebA, CelebA-HQ $256\times 256$, and STL-10 datasets.
APA
Kim, D., Shin, S., Song, K., Kang, W. & Moon, I. (2022). Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:11201-11228. Available from https://proceedings.mlr.press/v162/kim22i.html.