The Information Sieve

Greg Ver Steeg; Aram Galstyan

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:164-172, 2016.

Abstract

We introduce a new framework for unsupervised learning of representations based on a novel hierarchical decomposition of information. Intuitively, data is passed through a series of progressively fine-grained sieves. Each layer of the sieve recovers a single latent factor that is maximally informative about multivariate dependence in the data. The data is transformed after each pass so that the remaining unexplained information trickles down to the next layer. Ultimately, we are left with a set of latent factors explaining all the dependence in the original data and remainder information consisting of independent noise. We present a practical implementation of this framework for discrete variables and apply it to a variety of fundamental tasks in unsupervised learning including independent component analysis, lossy and lossless compression, and predicting missing values in data.
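As a rough illustration of what it means for a latent factor to be "informative about multivariate dependence", the sketch below measures dependence with total correlation, TC(X) = sum_i H(X_i) - H(X), and checks how much of it vanishes once a candidate factor is conditioned on. This is only a toy under stated assumptions: the function names, the toy data, and the choice of the shared bit as the factor are all hypothetical, and the code does not implement the remainder transformation or the optimization the abstract describes.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy, in bits, of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def total_correlation(rows):
    """TC(X) = sum_i H(X_i) - H(X), estimated from rows of discrete tuples."""
    columns = list(zip(*rows))
    return sum(entropy(col) for col in columns) - entropy(rows)


def conditional_total_correlation(rows, factor):
    """TC(X | Y) = sum_y p(y) * TC(X | Y = y) for a discrete factor Y."""
    n = len(rows)
    groups = {}
    for row, y in zip(rows, factor):
        groups.setdefault(y, []).append(row)
    return sum(len(g) / n * total_correlation(g) for g in groups.values())


# Toy data (an assumption for illustration): three noiseless copies of one fair bit.
rows = [(0, 0, 0), (1, 1, 1)] * 4

# Hypothetical latent factor: the shared bit itself.
factor = [row[0] for row in rows]

tc_before = total_correlation(rows)                     # dependence in the data
tc_after = conditional_total_correlation(rows, factor)  # dependence left given the factor
explained = tc_before - tc_after                        # dependence the factor accounts for

print(f"TC before: {tc_before:.3f} bits, after: {tc_after:.3f} bits, explained: {explained:.3f} bits")
```

In this toy case the factor accounts for all of the dependence, so nothing would be left for a further layer; with noisy data the leftover dependence would be nonzero.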

Cite this Paper

BibTeX

@InProceedings{pmlr-v48-steeg16,
  title = 	 {The Information Sieve},
  author = 	 {Ver Steeg, Greg and Galstyan, Aram},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {164--172},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/steeg16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/steeg16.html},
  abstract = 	 {We introduce a new framework for unsupervised learning of representations based on a novel hierarchical decomposition of information. Intuitively, data is passed through a series of progressively fine-grained sieves. Each layer of the sieve recovers a single latent factor that is maximally informative about multivariate dependence in the data. The data is transformed after each pass so that the remaining unexplained information trickles down to the next layer. Ultimately, we are left with a set of latent factors explaining all the dependence in the original data and remainder information consisting of independent noise. We present a practical implementation of this framework for discrete variables and apply it to a variety of fundamental tasks in unsupervised learning including independent component analysis, lossy and lossless compression, and predicting missing values in data.}
}

Endnote

%0 Conference Paper
%T The Information Sieve
%A Greg Ver Steeg
%A Aram Galstyan
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-steeg16
%I PMLR
%P 164--172
%U https://proceedings.mlr.press/v48/steeg16.html
%V 48
%X We introduce a new framework for unsupervised learning of representations based on a novel hierarchical decomposition of information. Intuitively, data is passed through a series of progressively fine-grained sieves. Each layer of the sieve recovers a single latent factor that is maximally informative about multivariate dependence in the data. The data is transformed after each pass so that the remaining unexplained information trickles down to the next layer. Ultimately, we are left with a set of latent factors explaining all the dependence in the original data and remainder information consisting of independent noise. We present a practical implementation of this framework for discrete variables and apply it to a variety of fundamental tasks in unsupervised learning including independent component analysis, lossy and lossless compression, and predicting missing values in data.

RIS

TY  - CPAPER
TI  - The Information Sieve
AU  - Greg Ver Steeg
AU  - Aram Galstyan
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-steeg16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 164
EP  - 172
L1  - http://proceedings.mlr.press/v48/steeg16.pdf
UR  - https://proceedings.mlr.press/v48/steeg16.html
AB  - We introduce a new framework for unsupervised learning of representations based on a novel hierarchical decomposition of information. Intuitively, data is passed through a series of progressively fine-grained sieves. Each layer of the sieve recovers a single latent factor that is maximally informative about multivariate dependence in the data. The data is transformed after each pass so that the remaining unexplained information trickles down to the next layer. Ultimately, we are left with a set of latent factors explaining all the dependence in the original data and remainder information consisting of independent noise. We present a practical implementation of this framework for discrete variables and apply it to a variety of fundamental tasks in unsupervised learning including independent component analysis, lossy and lossless compression, and predicting missing values in data.
ER  -

APA

Ver Steeg, G. & Galstyan, A. (2016). The Information Sieve. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:164-172. Available from https://proceedings.mlr.press/v48/steeg16.html.
