Distributed Learning, Communication Complexity and Privacy

Maria Florina Balcan; Avrim Blum; Shai Fine; Yishay Mansour

Distributed Learning, Communication Complexity and Privacy

Maria Florina Balcan, Avrim Blum, Shai Fine, Yishay Mansour

Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:26.1-26.22, 2012.

Abstract

We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/ε for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

Cite this Paper

BibTeX


@InProceedings{pmlr-v23-balcan12a,
  title = 	 {Distributed Learning, Communication Complexity and Privacy},
  author = 	 {Balcan, Maria Florina and Blum, Avrim and Fine, Shai and Mansour, Yishay},
  booktitle = 	 {Proceedings of the 25th Annual Conference on Learning Theory},
  pages = 	 {26.1--26.22},
  year = 	 {2012},
  editor = 	 {Mannor, Shie and Srebro, Nathan and Williamson, Robert C.},
  volume = 	 {23},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Edinburgh, Scotland},
  month = 	 {25--27 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v23/balcan12a/balcan12a.pdf},
  url = 	 {https://proceedings.mlr.press/v23/balcan12a.html},
  abstract = 	 {We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/ε for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.}
}

Endnote

%0 Conference Paper
%T Distributed Learning, Communication Complexity and Privacy
%A Maria Florina Balcan
%A Avrim Blum
%A Shai Fine
%A Yishay Mansour
%B Proceedings of the 25th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2012
%E Shie Mannor
%E Nathan Srebro
%E Robert C. Williamson	
%F pmlr-v23-balcan12a
%I PMLR
%P 26.1--26.22
%U https://proceedings.mlr.press/v23/balcan12a.html
%V 23
%X We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/ε for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

RIS


TY  - CPAPER
TI  - Distributed Learning, Communication Complexity and Privacy
AU  - Maria Florina Balcan
AU  - Avrim Blum
AU  - Shai Fine
AU  - Yishay Mansour
BT  - Proceedings of the 25th Annual Conference on Learning Theory
DA  - 2012/06/16
ED  - Shie Mannor
ED  - Nathan Srebro
ED  - Robert C. Williamson	
ID  - pmlr-v23-balcan12a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 23
SP  - 26.1
EP  - 26.22
L1  - http://proceedings.mlr.press/v23/balcan12a/balcan12a.pdf
UR  - https://proceedings.mlr.press/v23/balcan12a.html
AB  - We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/ε for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.
ER  -

APA


Balcan, M.F., Blum, A., Fine, S. & Mansour, Y.. (2012). Distributed Learning, Communication Complexity and Privacy. Proceedings of the 25th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 23:26.1-26.22 Available from https://proceedings.mlr.press/v23/balcan12a.html.

Related Material

Download PDF