Learning Sparse Codes with Entropy-Based ELBOs

Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:2089-2097, 2024.

Abstract

Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) the novel objective is fully analytic; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytic solutions, which leads to the fully analytic objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning; and (2) using the entropy objective, we derive a fully analytic ELBO objective for the standard sparse coding generative model.
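As a rough illustration of the generative model the abstract describes (Laplace prior on latents, linear mapping, Gaussian observables), the following sketch samples from standard probabilistic sparse coding. This is not code from the paper; the function name and parameters are illustrative.

```python
import numpy as np

def sample_sparse_coding(W, prior_scale, sigma, n_samples, rng=None):
    """Sample from the standard sparse coding generative model:
    latents z ~ Laplace(0, prior_scale), observables x ~ N(W z, sigma^2 I).
    All names here are illustrative, not taken from the paper."""
    rng = np.random.default_rng(rng)
    D, H = W.shape                                  # observable and latent dims
    z = rng.laplace(loc=0.0, scale=prior_scale, size=(n_samples, H))  # sparse latents
    x = z @ W.T + sigma * rng.standard_normal((n_samples, D))          # linear map + noise
    return z, x
```

With a heavy-tailed Laplace prior, most latent coordinates stay near zero, which is what makes the resulting codes sparse.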

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-velychko24a,
  title     = {Learning Sparse Codes with Entropy-Based {ELBOs}},
  author    = {Velychko, Dmytro and Damm, Simon and Fischer, Asja and L\"{u}cke, J\"{o}rg},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {2089--2097},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/velychko24a/velychko24a.pdf},
  url       = {https://proceedings.mlr.press/v238/velychko24a.html},
  abstract  = {Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) the novel objective is fully analytic; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytic solutions, which leads to the fully analytic objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning; and (2) using the entropy objective, we derive a fully analytic ELBO objective for the standard sparse coding generative model.}
}
Endnote
%0 Conference Paper
%T Learning Sparse Codes with Entropy-Based ELBOs
%A Dmytro Velychko
%A Simon Damm
%A Asja Fischer
%A Jörg Lücke
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-velychko24a
%I PMLR
%P 2089--2097
%U https://proceedings.mlr.press/v238/velychko24a.html
%V 238
%X Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) the novel objective is fully analytic; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytic solutions, which leads to the fully analytic objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning; and (2) using the entropy objective, we derive a fully analytic ELBO objective for the standard sparse coding generative model.
APA
Velychko, D., Damm, S., Fischer, A. &amp; Lücke, J. (2024). Learning Sparse Codes with Entropy-Based ELBOs. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:2089-2097. Available from https://proceedings.mlr.press/v238/velychko24a.html.