Sample Complexity Bounds on Differentially Private Learning via Communication Complexity

Vitaly Feldman, David Xiao
Proceedings of The 27th Conference on Learning Theory, PMLR 35:1000-1019, 2014.

Abstract

In this work we analyze the sample complexity of classification by differentially private algorithms. Differential privacy is a strong and well-studied notion of privacy, introduced by Dwork et al. (2006), which ensures that the output of an algorithm leaks little information about the data point provided by any of the participating individuals. The sample complexity of private PAC and agnostic learning was studied in a number of prior works starting with Kasiviswanathan et al. (2008), but a number of basic questions remain open (Beimel et al., 2010; Chaudhuri and Hsu, 2011; Beimel et al., 2013a,b). Our main contribution is an equivalence between the sample complexity of differentially private learning of a concept class C (denoted SCDP(C)) and the randomized one-way communication complexity of the evaluation problem for concepts from C. Using this equivalence we prove the following bounds:

- SCDP(C) = Ω(LDim(C)), where LDim(C) is Littlestone's (1987) dimension, which characterizes the number of mistakes in the online mistake-bound learning model. This result implies that SCDP(C) is different from the VC dimension of C, resolving one of the main open questions from prior work.
- For any t, there exists a class C such that LDim(C) = 2 but SCDP(C) ≥ t.
- For any t, there exists a class C such that the sample complexity of (pure) α-differentially private PAC learning is Ω(t/α) but the sample complexity of the relaxed (α,β)-differentially private PAC learning is O(log(1/β)/α). This resolves an open problem from Beimel et al. (2013b).

We also obtain simpler proofs for a number of known related results. Our equivalence builds on a characterization of sample complexity by Beimel et al. (2013a), and our bounds rely on a number of known results from communication complexity.
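
To make the first bound above concrete, the sketch below (not from the paper) computes LDim(C) for a small finite class via the standard mistake-tree recursion: LDim(C) = max over points x of 1 + min(LDim(C restricted to c(x) = 0), LDim(C restricted to c(x) = 1)), the maximum taken over points on which both restrictions are nonempty. The function name littlestone_dim and the encoding of concepts as 0/1 tuples are illustrative choices of ours, and the brute-force recursion is practical only for very small classes.

def littlestone_dim(concepts, domain):
    """Depth of the deepest complete mistake tree realizable by `concepts`.

    Each concept is a tuple of 0/1 labels; concept c labels point x with c[x].
    """
    best = 0  # every nonempty class realizes a mistake tree of depth 0
    for x in domain:
        zeros = [c for c in concepts if c[x] == 0]
        ones = [c for c in concepts if c[x] == 1]
        if zeros and ones:  # x can label the root of a strictly deeper tree
            best = max(best, 1 + min(littlestone_dim(zeros, domain),
                                     littlestone_dim(ones, domain)))
    return best

# Thresholds over an 8-point domain: the VC dimension is 1, yet LDim is 3
# (binary search over the 9 candidate thresholds), illustrating how the two
# dimensions, and hence by the first bound SCDP and the VC dimension, diverge.
domain = range(8)
thresholds = [tuple(1 if x >= k else 0 for x in domain) for k in range(9)]
print(littlestone_dim(thresholds, domain))  # prints 3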

Cite this Paper


BibTeX
@InProceedings{pmlr-v35-feldman14b,
  title     = {Sample Complexity Bounds on Differentially Private Learning via Communication Complexity},
  author    = {Feldman, Vitaly and Xiao, David},
  booktitle = {Proceedings of The 27th Conference on Learning Theory},
  pages     = {1000--1019},
  year      = {2014},
  editor    = {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba},
  volume    = {35},
  series    = {Proceedings of Machine Learning Research},
  address   = {Barcelona, Spain},
  month     = {13--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v35/feldman14b.pdf},
  url       = {https://proceedings.mlr.press/v35/feldman14b.html}
}
APA
Feldman, V. & Xiao, D. (2014). Sample Complexity Bounds on Differentially Private Learning via Communication Complexity. Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:1000-1019. Available from https://proceedings.mlr.press/v35/feldman14b.html.
