Maximum Entropy Distributions: Bit Complexity and Stability

Damian Straszak, Nisheeth K. Vishnoi
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2861-2891, 2019.

Abstract

Maximum entropy distributions with discrete support in $m$ dimensions arise in machine learning, statistics, information theory, and theoretical computer science. While structural and computational properties of max-entropy distributions have been extensively studied, two basic questions have resisted rigorous resolution: Do max-entropy distributions over a large support (e.g., of size $2^m$) with a specified marginal vector have succinct descriptions, i.e., descriptions of size polynomial in that of the input? And are entropy-maximizing distributions “stable” under perturbation of the marginal vector? Here we show that these questions are related and resolve both of them. Our main result is a ${\rm poly}(m, \log 1/\varepsilon)$ bound on the bit complexity of $\varepsilon$-optimal dual solutions to the maximum entropy convex program, for very general support sets and with no restriction on the marginal vector. Applications of this result include polynomial-time algorithms to compute max-entropy distributions over several new and old polytopes for any marginal vector in a unified manner, and a polynomial-time algorithm to compute the Brascamp-Lieb constant in the rank-1 case. The proof of this result also shows that changing the marginal vector by $\delta$ changes the max-entropy distribution in total variation distance by roughly ${\rm poly}(m, \log 1/\delta)\sqrt{\delta}$, even when the size of the support set is exponential. Together, our results put max-entropy distributions on a mathematically sound footing: these distributions are robust and computationally feasible models for data.
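For context, the maximum entropy convex program the abstract refers to has a standard formulation (the notation below, with a finite support set $\mathcal{F} \subseteq \mathbb{R}^m$ and marginal vector $\theta$, is assumed here for illustration rather than quoted from the paper):

$$\max_{p \geq 0} \ \sum_{\omega \in \mathcal{F}} p_\omega \log \frac{1}{p_\omega} \quad \text{s.t.} \quad \sum_{\omega \in \mathcal{F}} p_\omega\, \omega = \theta, \qquad \sum_{\omega \in \mathcal{F}} p_\omega = 1.$$

Its Lagrangian dual is an unconstrained convex program in only $m$ variables,

$$\min_{y \in \mathbb{R}^m} \ \log \Big( \sum_{\omega \in \mathcal{F}} e^{\langle y, \omega \rangle} \Big) - \langle y, \theta \rangle,$$

whose optimizer $y^*$ induces the exponential-family distribution $p^*_\omega \propto e^{\langle y^*, \omega \rangle}$. In this notation, an $\varepsilon$-optimal dual solution with ${\rm poly}(m, \log 1/\varepsilon)$ bit complexity is exactly the kind of succinct description the first question asks for, since it specifies the distribution with $m$ numbers even when $|\mathcal{F}|$ is exponential in $m$. The stability result can likewise be read as: if marginal vectors $\theta$ and $\theta'$ satisfy $\|\theta - \theta'\| \leq \delta$, then the corresponding max-entropy distributions satisfy $\|p^*_{\theta} - p^*_{\theta'}\|_{TV} \leq {\rm poly}(m, \log 1/\delta)\sqrt{\delta}$.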

Cite this Paper


BibTeX
@InProceedings{pmlr-v99-straszak19a,
  title     = {Maximum Entropy Distributions: Bit Complexity and Stability},
  author    = {Straszak, Damian and Vishnoi, Nisheeth K.},
  booktitle = {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages     = {2861--2891},
  year      = {2019},
  editor    = {Beygelzimer, Alina and Hsu, Daniel},
  volume    = {99},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--28 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v99/straszak19a/straszak19a.pdf},
  url       = {https://proceedings.mlr.press/v99/straszak19a.html}
}
Endnote
%0 Conference Paper
%T Maximum Entropy Distributions: Bit Complexity and Stability
%A Damian Straszak
%A Nisheeth K. Vishnoi
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-straszak19a
%I PMLR
%P 2861--2891
%U https://proceedings.mlr.press/v99/straszak19a.html
%V 99
APA
Straszak, D. & Vishnoi, N. K. (2019). Maximum Entropy Distributions: Bit Complexity and Stability. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:2861-2891. Available from https://proceedings.mlr.press/v99/straszak19a.html.
