Indeterminacy in Generative Models: Characterization and Strong Identifiability

Quanhan Xi, Benjamin Bloem-Reddy
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:6912-6939, 2023.

Abstract

Most modern probabilistic generative models, such as the variational autoencoder (VAE), have certain indeterminacies that are unresolvable even with an infinite amount of data. Different tasks tolerate different indeterminacies; however, recent applications have indicated the need for strongly identifiable models, in which an observation corresponds to a unique latent code. Progress has been made towards reducing model indeterminacies while maintaining flexibility, and recent work excludes many, but not all, indeterminacies. In this work, we motivate model-identifiability in terms of task-identifiability, then construct a theoretical framework for analyzing the indeterminacies of latent variable models, which enables their precise characterization in terms of the generator function and prior distribution spaces. We reveal that strong identifiability is possible even with highly flexible nonlinear generators, and give two such examples. One is a straightforward modification of iVAE (Khemakhem et al., 2020); the other uses triangular monotonic maps, leading to novel connections between optimal transport and identifiability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-xi23a,
  title     = {Indeterminacy in Generative Models: Characterization and Strong Identifiability},
  author    = {Xi, Quanhan and Bloem-Reddy, Benjamin},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {6912--6939},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/xi23a/xi23a.pdf},
  url       = {https://proceedings.mlr.press/v206/xi23a.html},
  abstract  = {Most modern probabilistic generative models, such as the variational autoencoder (VAE), have certain indeterminacies that are unresolvable even with an infinite amount of data. Different tasks tolerate different indeterminacies; however, recent applications have indicated the need for strongly identifiable models, in which an observation corresponds to a unique latent code. Progress has been made towards reducing model indeterminacies while maintaining flexibility, and recent work excludes many, but not all, indeterminacies. In this work, we motivate model-identifiability in terms of task-identifiability, then construct a theoretical framework for analyzing the indeterminacies of latent variable models, which enables their precise characterization in terms of the generator function and prior distribution spaces. We reveal that strong identifiability is possible even with highly flexible nonlinear generators, and give two such examples. One is a straightforward modification of iVAE (Khemakhem et al., 2020); the other uses triangular monotonic maps, leading to novel connections between optimal transport and identifiability.}
}
Endnote
%0 Conference Paper
%T Indeterminacy in Generative Models: Characterization and Strong Identifiability
%A Quanhan Xi
%A Benjamin Bloem-Reddy
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-xi23a
%I PMLR
%P 6912--6939
%U https://proceedings.mlr.press/v206/xi23a.html
%V 206
%X Most modern probabilistic generative models, such as the variational autoencoder (VAE), have certain indeterminacies that are unresolvable even with an infinite amount of data. Different tasks tolerate different indeterminacies; however, recent applications have indicated the need for strongly identifiable models, in which an observation corresponds to a unique latent code. Progress has been made towards reducing model indeterminacies while maintaining flexibility, and recent work excludes many, but not all, indeterminacies. In this work, we motivate model-identifiability in terms of task-identifiability, then construct a theoretical framework for analyzing the indeterminacies of latent variable models, which enables their precise characterization in terms of the generator function and prior distribution spaces. We reveal that strong identifiability is possible even with highly flexible nonlinear generators, and give two such examples. One is a straightforward modification of iVAE (Khemakhem et al., 2020); the other uses triangular monotonic maps, leading to novel connections between optimal transport and identifiability.
APA
Xi, Q. & Bloem-Reddy, B. (2023). Indeterminacy in Generative Models: Characterization and Strong Identifiability. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:6912-6939. Available from https://proceedings.mlr.press/v206/xi23a.html.

Related Material