Adaptive Data Analysis with Correlated Observations

Aryeh Kontorovich, Menachem Sadigurschi, Uri Stemmer
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:11483-11498, 2022.

Abstract

The vast majority of the work on adaptive data analysis focuses on the case where the samples in the dataset are independent. Several approaches and tools have been successfully applied in this context, such as differential privacy, max-information, compression arguments, and more. The situation is far less well-understood without the independence assumption. We embark on a systematic study of the possibilities of adaptive data analysis with correlated observations. First, we show that, in some cases, differential privacy guarantees generalization even when there are dependencies within the sample, which we quantify using a notion we call Gibbs-dependence. We complement this result with a tight negative example. Second, we show that the connection between transcript-compression and adaptive data analysis can be extended to the non-iid setting.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-kontorovich22a,
  title     = {Adaptive Data Analysis with Correlated Observations},
  author    = {Kontorovich, Aryeh and Sadigurschi, Menachem and Stemmer, Uri},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {11483--11498},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/kontorovich22a/kontorovich22a.pdf},
  url       = {https://proceedings.mlr.press/v162/kontorovich22a.html},
  abstract  = {The vast majority of the work on adaptive data analysis focuses on the case where the samples in the dataset are independent. Several approaches and tools have been successfully applied in this context, such as differential privacy, max-information, compression arguments, and more. The situation is far less well-understood without the independence assumption. We embark on a systematic study of the possibilities of adaptive data analysis with correlated observations. First, we show that, in some cases, differential privacy guarantees generalization even when there are dependencies within the sample, which we quantify using a notion we call Gibbs-dependence. We complement this result with a tight negative example. Second, we show that the connection between transcript-compression and adaptive data analysis can be extended to the non-iid setting.}
}
Endnote
%0 Conference Paper
%T Adaptive Data Analysis with Correlated Observations
%A Aryeh Kontorovich
%A Menachem Sadigurschi
%A Uri Stemmer
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-kontorovich22a
%I PMLR
%P 11483--11498
%U https://proceedings.mlr.press/v162/kontorovich22a.html
%V 162
%X The vast majority of the work on adaptive data analysis focuses on the case where the samples in the dataset are independent. Several approaches and tools have been successfully applied in this context, such as differential privacy, max-information, compression arguments, and more. The situation is far less well-understood without the independence assumption. We embark on a systematic study of the possibilities of adaptive data analysis with correlated observations. First, we show that, in some cases, differential privacy guarantees generalization even when there are dependencies within the sample, which we quantify using a notion we call Gibbs-dependence. We complement this result with a tight negative example. Second, we show that the connection between transcript-compression and adaptive data analysis can be extended to the non-iid setting.
APA
Kontorovich, A., Sadigurschi, M., & Stemmer, U. (2022). Adaptive Data Analysis with Correlated Observations. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:11483-11498. Available from https://proceedings.mlr.press/v162/kontorovich22a.html.
