Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso

Abhradeep Guha Thakurta; Adam Smith

Proceedings of the 26th Annual Conference on Learning Theory, PMLR 30:819-850, 2013.

Abstract

We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a nonprivate model selection procedure f, which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is stable on the input data set D. We work with two notions, perturbation stability and sub-sampling stability. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal non-private asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.

Cite this Paper

BibTeX


@InProceedings{pmlr-v30-Guha13,
  title = 	 {Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso},
  author = 	 {Thakurta, Abhradeep Guha and Smith, Adam},
  booktitle = 	 {Proceedings of the 26th Annual Conference on Learning Theory},
  pages = 	 {819--850},
  year = 	 {2013},
  editor = 	 {Shalev-Shwartz, Shai and Steinwart, Ingo},
  volume = 	 {30},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Princeton, NJ, USA},
  month = 	 {12--14 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v30/Guha13.pdf},
  url = 	 {https://proceedings.mlr.press/v30/Guha13.html},
  abstract = 	 {We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a \emph{nonprivate} model selection procedure f, which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is \emph{stable} on the input data set D. We work with two notions, \emph{perturbation stability} and \emph{sub-sampling stability}. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal \emph{non-private} asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.}
}

Endnote

%0 Conference Paper
%T Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso
%A Abhradeep Guha Thakurta
%A Adam Smith
%B Proceedings of the 26th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2013
%E Shai Shalev-Shwartz
%E Ingo Steinwart
%F pmlr-v30-Guha13
%I PMLR
%P 819--850
%U https://proceedings.mlr.press/v30/Guha13.html
%V 30
%X We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a nonprivate model selection procedure f, which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is stable on the input data set D. We work with two notions, perturbation stability and sub-sampling stability. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal non-private asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.

RIS


TY  - CPAPER
TI  - Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso
AU  - Abhradeep Guha Thakurta
AU  - Adam Smith
BT  - Proceedings of the 26th Annual Conference on Learning Theory
DA  - 2013/06/13
ED  - Shai Shalev-Shwartz
ED  - Ingo Steinwart
ID  - pmlr-v30-Guha13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 30
SP  - 819
EP  - 850
L1  - http://proceedings.mlr.press/v30/Guha13.pdf
UR  - https://proceedings.mlr.press/v30/Guha13.html
AB  - We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a nonprivate model selection procedure f, which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is stable on the input data set D. We work with two notions, perturbation stability and sub-sampling stability. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal non-private asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.
ER  -

APA


Thakurta, A.G. & Smith, A. (2013). Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso. Proceedings of the 26th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 30:819-850. Available from https://proceedings.mlr.press/v30/Guha13.html.
