On the connection between Noise-Contrastive Estimation and Contrastive Divergence

Amanda Olmin, Jakob Lindqvist, Lennart Svensson, Fredrik Lindsten
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3016-3024, 2024.

Abstract

Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need for evaluating an often intractable normalisation constant. Despite apparent conceptual differences, we show that two NCE criteria, ranking NCE (RNCE) and conditional NCE (CNCE), can be viewed as ML estimation methods. Specifically, RNCE is equivalent to ML estimation combined with conditional importance sampling, and both RNCE and CNCE are special cases of CD. These findings bridge the gap between the two method classes and allow us to apply techniques from the ML-IS and CD literature to NCE, offering several advantageous extensions.
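To make the claimed equivalence more concrete, the sketch below writes out the commonly used form of the ranking NCE criterion and its gradient in illustrative notation; the symbols (unnormalised model \tilde{p}_\theta, noise distribution p_n, data point x_0, noise samples x_1,...,x_K) are introduced here for illustration and are not necessarily the paper's own notation or derivation.

% Sketch of the RNCE criterion (illustrative notation, not taken from the paper).
% Given a data point x_0 ~ p_d and K noise samples x_1, ..., x_K ~ p_n,
% RNCE maximises the log-probability of ranking the data point first:
\[
  J_{\mathrm{RNCE}}(\theta)
  = \mathbb{E}\!\left[
      \log \frac{\tilde{p}_\theta(x_0)/p_n(x_0)}
                {\sum_{k=0}^{K} \tilde{p}_\theta(x_k)/p_n(x_k)}
    \right].
\]
% Its gradient takes the form of a self-normalised importance-sampling
% estimate of the maximum-likelihood gradient, with the data point itself
% included among the proposal samples:
\[
  \nabla_\theta J_{\mathrm{RNCE}}(\theta)
  = \mathbb{E}\!\left[
      \nabla_\theta \log \tilde{p}_\theta(x_0)
      - \sum_{k=0}^{K} w_k \, \nabla_\theta \log \tilde{p}_\theta(x_k)
    \right],
  \qquad
  w_k = \frac{\tilde{p}_\theta(x_k)/p_n(x_k)}
             {\sum_{j=0}^{K} \tilde{p}_\theta(x_j)/p_n(x_j)}.
\]

The inclusion of the data point x_0 in the weighted sum is what, as the abstract describes it, distinguishes conditional importance sampling from the ordinary importance sampling used in ML-IS.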

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-olmin24a,
  title     = {On the connection between Noise-Contrastive Estimation and Contrastive Divergence},
  author    = {Olmin, Amanda and Lindqvist, Jakob and Svensson, Lennart and Lindsten, Fredrik},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3016--3024},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/olmin24a/olmin24a.pdf},
  url       = {https://proceedings.mlr.press/v238/olmin24a.html},
  abstract  = {Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need for evaluating an often intractable normalisation constant. Despite apparent conceptual differences, we show that two NCE criteria, ranking NCE (RNCE) and conditional NCE (CNCE), can be viewed as ML estimation methods. Specifically, RNCE is equivalent to ML estimation combined with conditional importance sampling, and both RNCE and CNCE are special cases of CD. These findings bridge the gap between the two method classes and allow us to apply techniques from the ML-IS and CD literature to NCE, offering several advantageous extensions.}
}
Endnote
%0 Conference Paper
%T On the connection between Noise-Contrastive Estimation and Contrastive Divergence
%A Amanda Olmin
%A Jakob Lindqvist
%A Lennart Svensson
%A Fredrik Lindsten
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-olmin24a
%I PMLR
%P 3016--3024
%U https://proceedings.mlr.press/v238/olmin24a.html
%V 238
%X Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need for evaluating an often intractable normalisation constant. Despite apparent conceptual differences, we show that two NCE criteria, ranking NCE (RNCE) and conditional NCE (CNCE), can be viewed as ML estimation methods. Specifically, RNCE is equivalent to ML estimation combined with conditional importance sampling, and both RNCE and CNCE are special cases of CD. These findings bridge the gap between the two method classes and allow us to apply techniques from the ML-IS and CD literature to NCE, offering several advantageous extensions.
APA
Olmin, A., Lindqvist, J., Svensson, L. & Lindsten, F. (2024). On the connection between Noise-Contrastive Estimation and Contrastive Divergence. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3016-3024. Available from https://proceedings.mlr.press/v238/olmin24a.html.
