Inference under Information Constraints: Lower Bounds from Chi-Square Contraction

Jayadev Acharya; Clément L Canonne; Himanshu Tyagi

Inference under Information Constraints: Lower Bounds from Chi-Square Contraction

Jayadev Acharya, Clément L Canonne, Himanshu Tyagi

Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:3-17, 2019.

Abstract

Multiple users getting one sample each from an unknown distribution seek to enable a central server to conduct statistical inference. However, each player can only provide limited amount of information about its sample to the server. We propose a unified framework to study such distributed inference problems under local information constraints. We model the local information constraints by a set of channels $\mathcal{W}$: each player chooses a channel from $\mathcal{W}$, and then passes their data through this channel before transmitting the output to the server. The goal in this distributed setting is to understand the blow-up in data requirement imposed by the information constraints, compared to the centralized setting where all data samples are available to the server. We introduce two notions of \emph{chi-square fluctuations} which provide bounds for the average distance and the distance to the average of a local perturbation. When information constraints are imposed, by the standard data-processing inequality, pairwise distances contract and so do our chi-square fluctuations. We provide a precise characterization of this contraction for discrete $k$-ary distributions and use it to obtain to general lower bounds for distribution learning and testing under information constraints. Our results involve notions of minmax and maxmin chi-square fluctuations, where the maximum is over the choice of channels and the minimum is over perturbations. The former emerges when considering public-coin protocols for testing and is bounded in terms of Frobenius norm of a positive semidefinite matrix $H$, a function of the channel family $\mathcal{W}$. The latter appears for private-coin protocols and is bounded by the nuclear norm of $H$ which is smaller than its Frobenius norm, establishing a separation between the sample complexity of testing using public and private coins.

Cite this Paper

BibTeX


@InProceedings{pmlr-v99-acharya19a,
  title = 	 {Inference under Information Constraints: Lower Bounds from Chi-Square Contraction},
  author =       {Acharya, Jayadev and Canonne, Cl{\'{e}}ment L and Tyagi, Himanshu},
  booktitle = 	 {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = 	 {3--17},
  year = 	 {2019},
  editor = 	 {Beygelzimer, Alina and Hsu, Daniel},
  volume = 	 {99},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--28 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v99/acharya19a/acharya19a.pdf},
  url = 	 {https://proceedings.mlr.press/v99/acharya19a.html},
  abstract = 	 {Multiple users getting one sample each from an unknown  distribution seek to enable a central server to conduct statistical inference. However, each player can only provide limited amount of information about its sample to the server.  We propose a unified framework to study such distributed inference problems under local information constraints.  We model the local information constraints by a set of channels $\mathcal{W}$: each player chooses a channel from $\mathcal{W}$, and then passes their data through this channel before transmitting the output to the server. The goal in this distributed setting is to understand the blow-up in data requirement imposed by the information constraints, compared to the centralized setting where all data samples are available to the server.  We introduce two notions of \emph{chi-square fluctuations} which provide bounds for the  average distance and the distance to the  average of a local perturbation. When information constraints are imposed, by the standard data-processing inequality, pairwise distances contract and so do our chi-square fluctuations. We provide a precise characterization of this contraction for discrete $k$-ary distributions and use it to obtain to general lower bounds for distribution learning and testing under information constraints. Our results involve  notions of minmax and maxmin chi-square fluctuations, where  the maximum is over the choice of channels and the minimum is over perturbations. The former emerges when considering public-coin protocols for testing and is bounded in terms of Frobenius norm of a positive semidefinite matrix $H$, a function of the channel family $\mathcal{W}$. The latter appears for private-coin protocols and is bounded by the nuclear norm of $H$ which is smaller than its Frobenius norm, establishing a separation between the sample complexity of testing using public and private coins.}
}

Endnote

%0 Conference Paper
%T Inference under Information Constraints: Lower Bounds from Chi-Square Contraction
%A Jayadev Acharya
%A Clément L Canonne
%A Himanshu Tyagi
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu	
%F pmlr-v99-acharya19a
%I PMLR
%P 3--17
%U https://proceedings.mlr.press/v99/acharya19a.html
%V 99
%X Multiple users getting one sample each from an unknown  distribution seek to enable a central server to conduct statistical inference. However, each player can only provide limited amount of information about its sample to the server.  We propose a unified framework to study such distributed inference problems under local information constraints.  We model the local information constraints by a set of channels $\mathcal{W}$: each player chooses a channel from $\mathcal{W}$, and then passes their data through this channel before transmitting the output to the server. The goal in this distributed setting is to understand the blow-up in data requirement imposed by the information constraints, compared to the centralized setting where all data samples are available to the server.  We introduce two notions of \emph{chi-square fluctuations} which provide bounds for the  average distance and the distance to the  average of a local perturbation. When information constraints are imposed, by the standard data-processing inequality, pairwise distances contract and so do our chi-square fluctuations. We provide a precise characterization of this contraction for discrete $k$-ary distributions and use it to obtain to general lower bounds for distribution learning and testing under information constraints. Our results involve  notions of minmax and maxmin chi-square fluctuations, where  the maximum is over the choice of channels and the minimum is over perturbations. The former emerges when considering public-coin protocols for testing and is bounded in terms of Frobenius norm of a positive semidefinite matrix $H$, a function of the channel family $\mathcal{W}$. The latter appears for private-coin protocols and is bounded by the nuclear norm of $H$ which is smaller than its Frobenius norm, establishing a separation between the sample complexity of testing using public and private coins.

APA


Acharya, J., Canonne, C.L. & Tyagi, H.. (2019). Inference under Information Constraints: Lower Bounds from Chi-Square Contraction. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:3-17 Available from https://proceedings.mlr.press/v99/acharya19a.html.

Related Material

Download PDF