Maximum Entropy Distributions: Bit Complexity and Stability
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2861-2891, 2019.
Abstract
Maximum entropy distributions with discrete support in m dimensions arise in machine learning, statistics, information theory, and theoretical computer science. While structural and computational properties of max-entropy distributions have been extensively studied, basic questions have resisted a rigorous resolution: Do max-entropy distributions over a large support (e.g., of size 2^m) with a specified marginal vector admit succinct descriptions, polynomial-size in the input description? And are entropy-maximizing distributions "stable" under perturbation of the marginal vector? Here we show that these questions are related and resolve both of them. Our main result is a poly(m, log 1/ε) bound on the bit complexity of ε-optimal dual solutions to the maximum entropy convex program, for very general support sets and with no restriction on the marginal vector. Applications of this result include polynomial-time algorithms to compute max-entropy distributions over several new and old polytopes for any marginal vector in a unified manner, and a polynomial-time algorithm to compute the Brascamp-Lieb constant in the rank-1 case. The proof of this result also allows us to show that changing the marginal vector by δ changes the max-entropy distribution in total variation distance by roughly a factor of √δ · poly(m, log 1/δ), even when the size of the support set is exponential. Together, our results put max-entropy distributions on a mathematically sound footing: these distributions are robust and computationally feasible models for data.
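For context, the maximum entropy convex program referenced above and its dual take the following standard form (a sketch in notation of our choosing, not taken from the paper: S denotes the finite support set, θ the marginal vector, and y the dual variables):

% Primal: maximize entropy over distributions p on a finite support S,
% subject to a prescribed marginal vector \theta.
\max_{p \ge 0} \; \sum_{s \in S} p_s \log\frac{1}{p_s}
  \quad \text{s.t.} \quad \sum_{s \in S} p_s\, s = \theta, \qquad \sum_{s \in S} p_s = 1.

% Dual: an unconstrained convex program over y \in \mathbb{R}^m.
% The abstract's poly(m, log 1/\varepsilon) bound concerns the bit
% complexity of \varepsilon-optimal solutions y to this program.
\min_{y \in \mathbb{R}^m} \; \log\Big(\sum_{s \in S} e^{\langle y, s\rangle}\Big) - \langle y, \theta\rangle.

% The entropy maximizer is the exponential-family distribution
% p_s \propto e^{\langle y^\star, s\rangle}; hence a dual solution of
% polynomial bit complexity is a succinct description of p, even when
% |S| is exponential in m.

In this form, the succinct-description question reduces to bounding the bit complexity of a near-optimal dual vector y, which is exactly what the main result provides.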