Maximum Entropy Distributions: Bit Complexity and Stability
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2861-2891, 2019.
Abstract
Maximum entropy distributions with discrete support in $m$ dimensions arise in machine learning, statistics, information theory, and theoretical computer science. While structural and computational properties of max-entropy distributions have been extensively studied, basic questions such as: Do max-entropy distributions over a large support (e.g., $2^m$) with a specified marginal vector have succinct descriptions (polynomial-size in the input description)? and: Are entropy maximizing distributions “stable” under perturbation of the marginal vector? have resisted a rigorous resolution. Here we show that these questions are related and resolve both of them. Our main result shows a ${\rm poly}(m, \log 1/\varepsilon)$ bound on the bit complexity of $\varepsilon$-optimal dual solutions to the maximum entropy convex program – for very general support sets and with no restriction on the marginal vector. Applications of this result include polynomial time algorithms to compute max-entropy distributions over several new and old polytopes for any marginal vector in a unified manner, and a polynomial time algorithm to compute the Brascamp-Lieb constant in the rank-1 case. The proof of this result allows us to show that changing the marginal vector by $\delta$ changes the max-entropy distribution in the total variation distance roughly by a factor of ${\rm poly}(m, \log 1/\delta)\sqrt{\delta}$ – even when the size of the support set is exponential. Together, our results put max-entropy distributions on a mathematically sound footing – these distributions are robust and computationally feasible models for data.
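The maximum entropy convex program mentioned above can be illustrated with a minimal sketch (not the paper's algorithm; function names and step sizes here are illustrative assumptions). The max-entropy distribution over a support set with a given marginal vector $\theta$ has the exponential-family form $p_\lambda(v) \propto \exp(\langle \lambda, v \rangle)$, and the dual parameter $\lambda$ can be found, for a small explicitly enumerated support, by plain gradient descent on the convex dual objective $f(\lambda) = \log \sum_v e^{\langle \lambda, v \rangle} - \langle \lambda, \theta \rangle$:

```python
import numpy as np
from itertools import product

def max_entropy_dual(support, theta, steps=5000, lr=0.5):
    """Minimize f(lam) = log sum_v exp(<lam, v>) - <lam, theta> by gradient descent.

    Returns the dual variable lam and the induced max-entropy distribution p.
    (Illustrative sketch: the support is enumerated explicitly, so this only
    works for small instances, unlike the polynomial-time results in the paper.)
    """
    support = np.asarray(support, dtype=float)
    lam = np.zeros(support.shape[1])
    for _ in range(steps):
        scores = support @ lam
        scores -= scores.max()        # shift for numerical stability
        w = np.exp(scores)
        p = w / w.sum()               # current candidate distribution p_lam
        grad = support.T @ p - theta  # gradient = marginal of p_lam minus target
        lam -= lr * grad
    return lam, p

# Example: support is the vertex set of the 3-cube (size 2^m for m = 3).
m = 3
support = list(product([0, 1], repeat=m))
theta = np.array([0.5, 0.3, 0.7])     # target marginal vector
lam, p = max_entropy_dual(support, theta)
```

At the optimum, the marginal of `p` matches `theta`; the paper's contribution is that an $\varepsilon$-optimal `lam` needs only ${\rm poly}(m, \log 1/\varepsilon)$ bits even when the support (here enumerated explicitly) is exponentially large and accessed only implicitly.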