Geometrically Regularized Wasserstein Dictionary Learning
Proceedings of 2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML), PMLR 221:384-403, 2023.
Abstract
Wasserstein dictionary learning is an unsupervised approach to learning a collection of probability distributions that generate observed distributions as Wasserstein barycentric combinations. Existing methods solve an optimization problem that only seeks a dictionary and weights minimizing the reconstruction error. However, there is no a priori reason to believe that solutions to this problem are unique in general. Moreover, the learned dictionary is, by design, optimized to represent the observed data set, and may not be useful for classification tasks or generative modeling. Just as regularization plays a key role in linear dictionary learning, we propose a geometric regularizer for Wasserstein space that promotes representations of data distributions using nearby dictionary elements. We show that this regularizer leads to barycentric weights that concentrate on dictionary atoms local to each data distribution. When data are generated as Wasserstein barycenters of fixed distributions, this regularizer facilitates the recovery of the generating distributions in cases that are ill-posed for unregularized Wasserstein dictionary learning. Through experimentation on synthetic and real data, we show that our geometrically regularized approach yields more interpretable dictionaries in Wasserstein space, which perform better in downstream applications.
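To make the setup concrete, a minimal sketch of a regularized objective consistent with the abstract follows; the notation here is illustrative rather than the paper's exact formulation. Write \(\mu_1,\dots,\mu_N\) for the observed distributions, \(d_1,\dots,d_K\) for the dictionary atoms, \(\lambda_i\) for the barycentric weights of \(\mu_i\) on the simplex \(\Delta^{K-1}\), \(\mathrm{Bar}(\lambda;\, d_1,\dots,d_K)\) for the Wasserstein barycenter with weights \(\lambda\), and \(\rho > 0\) for an assumed regularization strength:

\[
\min_{\{d_k\},\,\{\lambda_i\}} \;
\sum_{i=1}^{N} W_2^2\!\Big(\mu_i,\; \mathrm{Bar}(\lambda_i;\, d_1,\dots,d_K)\Big)
\;+\;
\rho \sum_{i=1}^{N} \sum_{k=1}^{K} \lambda_{ik}\, W_2^2(\mu_i, d_k),
\qquad \lambda_i \in \Delta^{K-1}.
\]

The first term is the usual barycentric reconstruction error; the second term, a geometric penalty of this weighted form, charges each unit of weight \(\lambda_{ik}\) by the Wasserstein distance from \(\mu_i\) to atom \(d_k\), so minimizers concentrate \(\lambda_i\) on atoms near \(\mu_i\), matching the locality behavior described above.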