Learning Entangled Single-Sample Distributions via Iterative Trimming

Hui Yuan, Yingyu Liang
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2666-2676, 2020.

Abstract

In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one \emph{single} sample from each distribution. We study mean estimation and linear regression under general conditions, and analyze a simple and computationally efficient method based on iteratively trimming samples and re-estimating the parameter on the trimmed sample set. We show that the method in logarithmic iterations outputs an estimation whose error only depends on the noise level of the $\lceil \alpha n \rceil$-th noisiest data point where $\alpha$ is a constant and $n$ is the sample size. This means it can tolerate a constant fraction of high-noise points. These are the first such results under our general conditions with computationally efficient estimators. It also justifies the wide application and empirical success of iterative trimming in practice. Our theoretical results are complemented by experiments on synthetic data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-yuan20a, title = {Learning Entangled Single-Sample Distributions via Iterative Trimming}, author = {Yuan, Hui and Liang, Yingyu}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {2666--2676}, year = {2020}, editor = {Silvia Chiappa and Roberto Calandra}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/yuan20a/yuan20a.pdf}, url = { http://proceedings.mlr.press/v108/yuan20a.html }, abstract = {In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one \emph{single} sample from each distribution. We study mean estimation and linear regression under general conditions, and analyze a simple and computationally efficient method based on iteratively trimming samples and re-estimating the parameter on the trimmed sample set. We show that the method in logarithmic iterations outputs an estimation whose error only depends on the noise level of the $\lceil \alpha n \rceil$-th noisiest data point where $\alpha$ is a constant and $n$ is the sample size. This means it can tolerate a constant fraction of high-noise points. These are the first such results under our general conditions with computationally efficient estimators. It also justifies the wide application and empirical success of iterative trimming in practice. Our theoretical results are complemented by experiments on synthetic data.} }
Endnote
%0 Conference Paper %T Learning Entangled Single-Sample Distributions via Iterative Trimming %A Hui Yuan %A Yingyu Liang %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-yuan20a %I PMLR %P 2666--2676 %U http://proceedings.mlr.press/v108/yuan20a.html %V 108 %X In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one \emph{single} sample from each distribution. We study mean estimation and linear regression under general conditions, and analyze a simple and computationally efficient method based on iteratively trimming samples and re-estimating the parameter on the trimmed sample set. We show that the method in logarithmic iterations outputs an estimation whose error only depends on the noise level of the $\lceil \alpha n \rceil$-th noisiest data point where $\alpha$ is a constant and $n$ is the sample size. This means it can tolerate a constant fraction of high-noise points. These are the first such results under our general conditions with computationally efficient estimators. It also justifies the wide application and empirical success of iterative trimming in practice. Our theoretical results are complemented by experiments on synthetic data.
APA
Yuan, H. & Liang, Y.. (2020). Learning Entangled Single-Sample Distributions via Iterative Trimming. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:2666-2676 Available from http://proceedings.mlr.press/v108/yuan20a.html .

Related Material