Inference under Information Constraints: Lower Bounds from Chi-Square Contraction

Jayadev Acharya, Clément L Canonne, Himanshu Tyagi
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:3-17, 2019.

Abstract

Multiple users, each getting one sample from an unknown distribution, seek to enable a central server to conduct statistical inference. However, each player can only provide a limited amount of information about its sample to the server. We propose a unified framework to study such distributed inference problems under local information constraints. We model the local information constraints by a set of channels $\mathcal{W}$: each player chooses a channel from $\mathcal{W}$, and then passes its data through this channel before transmitting the output to the server. The goal in this distributed setting is to understand the blow-up in data requirement imposed by the information constraints, compared to the centralized setting where all data samples are available to the server. We introduce two notions of \emph{chi-square fluctuations} which provide bounds for the average distance and the distance to the average of a local perturbation. When information constraints are imposed, by the standard data-processing inequality, pairwise distances contract and so do our chi-square fluctuations. We provide a precise characterization of this contraction for discrete $k$-ary distributions and use it to obtain general lower bounds for distribution learning and testing under information constraints. Our results involve notions of minmax and maxmin chi-square fluctuations, where the maximum is over the choice of channels and the minimum is over perturbations. The former emerges when considering public-coin protocols for testing and is bounded in terms of the Frobenius norm of a positive semidefinite matrix $H$, a function of the channel family $\mathcal{W}$. The latter appears for private-coin protocols and is bounded by the nuclear norm of $H$, which is smaller than its Frobenius norm, establishing a separation between the sample complexity of testing using public and private coins.
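To illustrate the data-processing step the abstract relies on, the following minimal NumPy sketch (not from the paper; the channel, alphabet size $k$, message size, and distributions are arbitrary choices for illustration) checks numerically that the chi-square divergence between two $k$-ary distributions can only shrink once both are passed through a common channel $W$:

```python
import numpy as np

def chi_square(p, q):
    # chi-square divergence: sum_x (p(x) - q(x))^2 / q(x)
    return np.sum((p - q) ** 2 / q)

rng = np.random.default_rng(0)
k, m = 8, 3  # input alphabet size, output (message) alphabet size -- arbitrary choices

# Row-stochastic channel: W[x, y] = Pr[message = y | sample = x]
W = rng.dirichlet(np.ones(m), size=k)

p = rng.dirichlet(np.ones(k))  # reference k-ary distribution
q = rng.dirichlet(np.ones(k))  # a perturbed distribution

# Induced message distributions seen by the server
Wp, Wq = p @ W, q @ W

# Data-processing inequality for f-divergences:
# chi_square(Wp, Wq) <= chi_square(p, q) holds for every channel W.
print(chi_square(p, q), chi_square(Wp, Wq))
```

The paper quantifies how severe this contraction must be over an entire family $\mathcal{W}$ of allowed channels, rather than for a single channel as in this sketch.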

Cite this Paper


BibTeX
@InProceedings{pmlr-v99-acharya19a,
  title     = {Inference under Information Constraints: Lower Bounds from Chi-Square Contraction},
  author    = {Acharya, Jayadev and Canonne, Cl{\'{e}}ment L and Tyagi, Himanshu},
  booktitle = {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages     = {3--17},
  year      = {2019},
  editor    = {Beygelzimer, Alina and Hsu, Daniel},
  volume    = {99},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--28 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v99/acharya19a/acharya19a.pdf},
  url       = {https://proceedings.mlr.press/v99/acharya19a.html}
}
Endnote
%0 Conference Paper
%T Inference under Information Constraints: Lower Bounds from Chi-Square Contraction
%A Jayadev Acharya
%A Clément L Canonne
%A Himanshu Tyagi
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu
%F pmlr-v99-acharya19a
%I PMLR
%P 3--17
%U https://proceedings.mlr.press/v99/acharya19a.html
%V 99
APA
Acharya, J., Canonne, C.L. & Tyagi, H. (2019). Inference under Information Constraints: Lower Bounds from Chi-Square Contraction. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:3-17. Available from https://proceedings.mlr.press/v99/acharya19a.html.
