Maximum Entropy Distributions: Bit Complexity and Stability
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2861-2891, 2019.
Abstract
Maximum entropy distributions with discrete support in $m$ dimensions arise in machine learning, statistics, information theory, and theoretical computer science. While structural and computational properties of max-entropy distributions have been extensively studied, basic questions such as: Do max-entropy distributions over a large support (e.g., $2^m$) with a specified marginal vector have succinct descriptions (polynomial-size in the input description)? and: Are entropy maximizing distributions “stable” under perturbation of the marginal vector? have resisted a rigorous resolution. Here we show that these questions are related and resolve both of them. Our main result shows a ${\rm poly}(m, \log 1/\varepsilon)$ bound on the bit complexity of $\varepsilon$-optimal dual solutions to the maximum entropy convex program – for very general support sets and with no restriction on the marginal vector. Applications of this result include polynomial time algorithms to compute max-entropy distributions over several new and old polytopes for any marginal vector in a unified manner, and a polynomial time algorithm to compute the Brascamp-Lieb constant in the rank-1 case. The proof of this result allows us to show that changing the marginal vector by $\delta$ changes the max-entropy distribution in the total variation distance roughly by a factor of ${\rm poly}(m, \log 1/\delta)\sqrt{\delta}$ – even when the size of the support set is exponential. Together, our results put max-entropy distributions on a mathematically sound footing – these distributions are robust and computationally feasible models for data.
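The maximum entropy convex program mentioned above can be illustrated with a minimal sketch (not the paper's algorithm; function names and step sizes here are illustrative assumptions). The max-entropy distribution over a support set with a given marginal vector $\theta$ has the exponential-family form $p_\lambda(v) \propto \exp(\langle \lambda, v \rangle)$, and the dual parameter $\lambda$ can be found, for a small explicitly enumerated support, by plain gradient descent on the convex dual objective $f(\lambda) = \log \sum_v e^{\langle \lambda, v \rangle} - \langle \lambda, \theta \rangle$:

```python
import numpy as np
from itertools import product

def max_entropy_dual(support, theta, steps=5000, lr=0.5):
    """Minimize f(lam) = log sum_v exp(<lam, v>) - <lam, theta> by gradient descent.

    Returns the dual variable lam and the induced max-entropy distribution p.
    (Illustrative sketch: the support is enumerated explicitly, so this only
    works for small instances, unlike the polynomial-time results in the paper.)
    """
    support = np.asarray(support, dtype=float)
    lam = np.zeros(support.shape[1])
    for _ in range(steps):
        scores = support @ lam
        scores -= scores.max()        # shift for numerical stability
        w = np.exp(scores)
        p = w / w.sum()               # current candidate distribution p_lam
        grad = support.T @ p - theta  # gradient = marginal of p_lam minus target
        lam -= lr * grad
    return lam, p

# Example: support is the vertex set of the 3-cube (size 2^m for m = 3).
m = 3
support = list(product([0, 1], repeat=m))
theta = np.array([0.5, 0.3, 0.7])     # target marginal vector
lam, p = max_entropy_dual(support, theta)
```

At the optimum, the marginal of `p` matches `theta`; the paper's contribution is that an $\varepsilon$-optimal `lam` needs only ${\rm poly}(m, \log 1/\varepsilon)$ bits even when the support (here enumerated explicitly) is exponentially large and accessed only implicitly.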