Sample Complexity Bounds on Differentially Private Learning via Communication Complexity

Vitaly Feldman; David Xiao

Sample Complexity Bounds on Differentially Private Learning via Communication Complexity

Vitaly Feldman, David Xiao

Proceedings of The 27th Conference on Learning Theory, PMLR 35:1000-1019, 2014.

Abstract

In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy introduced by Dwork et al. (2006) that ensures that the output of an algorithm leaks little information about the data point provided by any of the participating individuals. Sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with (Kasiviswanathan et al., 2008) but a number of basic questions still remain open (Beimel et al. 2010; Chaudhuri and Hsu, 2011; Beimel et al., 2013a,b). Our main contribution is an equivalence between the sample complexity of differentially-private learning of a concept class C (or \mathrmSCDP(C)) and the randomized one-way communication complexity of the evaluation problem for concepts from C. Using this equivalence we prove the following bounds: \beginitemize \item \mathrmSCDP(C) = Ω(\mathrmLDim(C)), where \mathrmLDim(C) is the Littlestone’s (1987) dimension characterizing the number of mistakes in the online-mistake-bound learning model. This result implies that \mathrmSCDP(C) is different from the VC-dimension of C, resolving one of the main open questions from prior work. \item For any t, there exists a class C such that \mathrmLDim(C)=2 but \mathrmSCDP(C) ≥t. \item For any t, there exists a class C such that the sample complexity of (pure) α-differentially private PAC learning is Ω(t/α) but the sample complexity of the relaxed (α,β)-differentially private PAC learning is O(\log(1/β)/α). This resolves an open problem from (Beimel et al., 2013b). \enditemize We also obtain simpler proofs for a number of known related results. Our equivalence builds on a characterization of sample complexity by Beimel et al., (2013a) and our bounds rely on a number of known results from communication complexity.

Cite this Paper

BibTeX


@InProceedings{pmlr-v35-feldman14b,
  title = 	 {Sample Complexity Bounds on Differentially Private Learning via Communication Complexity},
  author = 	 {Feldman, Vitaly and Xiao, David},
  booktitle = 	 {Proceedings of The 27th Conference on Learning Theory},
  pages = 	 {1000--1019},
  year = 	 {2014},
  editor = 	 {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba},
  volume = 	 {35},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Barcelona, Spain},
  month = 	 {13--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v35/feldman14b.pdf},
  url = 	 {https://proceedings.mlr.press/v35/feldman14b.html},
  abstract = 	 {In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy introduced by Dwork et al. (2006) that ensures that the output of an algorithm leaks little information about the data point provided by any of the participating individuals. Sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with (Kasiviswanathan et al., 2008) but a number of basic questions still remain open (Beimel et al. 2010; Chaudhuri and Hsu, 2011; Beimel et al., 2013a,b). Our main contribution is an equivalence between the sample complexity of differentially-private learning of a concept class C (or \mathrmSCDP(C)) and the randomized one-way communication complexity of the evaluation problem for concepts from C. Using this equivalence we prove the following bounds: \beginitemize \item \mathrmSCDP(C) = Ω(\mathrmLDim(C)), where \mathrmLDim(C) is the Littlestone’s (1987) dimension characterizing the number of mistakes in the online-mistake-bound learning model. This result implies that \mathrmSCDP(C) is different from the VC-dimension of C, resolving one of the main open questions from prior work. \item For any t, there exists a class C such that \mathrmLDim(C)=2 but \mathrmSCDP(C) ≥t. \item For any t, there exists a class C such that the sample complexity of (pure) α-differentially private PAC learning is Ω(t/α) but the sample complexity of the relaxed (α,β)-differentially private PAC learning is O(\log(1/β)/α). This resolves an open problem from (Beimel et al., 2013b). \enditemize We also obtain simpler proofs for a number of known related results. Our equivalence builds on a characterization of sample complexity by Beimel et al., (2013a) and our bounds rely on a number of known results from communication complexity.}
}

Endnote

%0 Conference Paper
%T Sample Complexity Bounds on Differentially Private Learning via Communication Complexity
%A Vitaly Feldman
%A David Xiao
%B Proceedings of The 27th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2014
%E Maria Florina Balcan
%E Vitaly Feldman
%E Csaba Szepesvári	
%F pmlr-v35-feldman14b
%I PMLR
%P 1000--1019
%U https://proceedings.mlr.press/v35/feldman14b.html
%V 35
%X In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy introduced by Dwork et al. (2006) that ensures that the output of an algorithm leaks little information about the data point provided by any of the participating individuals. Sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with (Kasiviswanathan et al., 2008) but a number of basic questions still remain open (Beimel et al. 2010; Chaudhuri and Hsu, 2011; Beimel et al., 2013a,b). Our main contribution is an equivalence between the sample complexity of differentially-private learning of a concept class C (or \mathrmSCDP(C)) and the randomized one-way communication complexity of the evaluation problem for concepts from C. Using this equivalence we prove the following bounds: \beginitemize \item \mathrmSCDP(C) = Ω(\mathrmLDim(C)), where \mathrmLDim(C) is the Littlestone’s (1987) dimension characterizing the number of mistakes in the online-mistake-bound learning model. This result implies that \mathrmSCDP(C) is different from the VC-dimension of C, resolving one of the main open questions from prior work. \item For any t, there exists a class C such that \mathrmLDim(C)=2 but \mathrmSCDP(C) ≥t. \item For any t, there exists a class C such that the sample complexity of (pure) α-differentially private PAC learning is Ω(t/α) but the sample complexity of the relaxed (α,β)-differentially private PAC learning is O(\log(1/β)/α). This resolves an open problem from (Beimel et al., 2013b). \enditemize We also obtain simpler proofs for a number of known related results. Our equivalence builds on a characterization of sample complexity by Beimel et al., (2013a) and our bounds rely on a number of known results from communication complexity.

RIS


TY  - CPAPER
TI  - Sample Complexity Bounds on Differentially Private Learning via Communication Complexity
AU  - Vitaly Feldman
AU  - David Xiao
BT  - Proceedings of The 27th Conference on Learning Theory
DA  - 2014/05/29
ED  - Maria Florina Balcan
ED  - Vitaly Feldman
ED  - Csaba Szepesvári	
ID  - pmlr-v35-feldman14b
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 35
SP  - 1000
EP  - 1019
L1  - http://proceedings.mlr.press/v35/feldman14b.pdf
UR  - https://proceedings.mlr.press/v35/feldman14b.html
AB  - In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy introduced by Dwork et al. (2006) that ensures that the output of an algorithm leaks little information about the data point provided by any of the participating individuals. Sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with (Kasiviswanathan et al., 2008) but a number of basic questions still remain open (Beimel et al. 2010; Chaudhuri and Hsu, 2011; Beimel et al., 2013a,b). Our main contribution is an equivalence between the sample complexity of differentially-private learning of a concept class C (or \mathrmSCDP(C)) and the randomized one-way communication complexity of the evaluation problem for concepts from C. Using this equivalence we prove the following bounds: \beginitemize \item \mathrmSCDP(C) = Ω(\mathrmLDim(C)), where \mathrmLDim(C) is the Littlestone’s (1987) dimension characterizing the number of mistakes in the online-mistake-bound learning model. This result implies that \mathrmSCDP(C) is different from the VC-dimension of C, resolving one of the main open questions from prior work. \item For any t, there exists a class C such that \mathrmLDim(C)=2 but \mathrmSCDP(C) ≥t. \item For any t, there exists a class C such that the sample complexity of (pure) α-differentially private PAC learning is Ω(t/α) but the sample complexity of the relaxed (α,β)-differentially private PAC learning is O(\log(1/β)/α). This resolves an open problem from (Beimel et al., 2013b). \enditemize We also obtain simpler proofs for a number of known related results. Our equivalence builds on a characterization of sample complexity by Beimel et al., (2013a) and our bounds rely on a number of known results from communication complexity.
ER  -

APA


Feldman, V. & Xiao, D.. (2014). Sample Complexity Bounds on Differentially Private Learning via Communication Complexity. Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:1000-1019 Available from https://proceedings.mlr.press/v35/feldman14b.html.

Related Material

Download PDF