Online Incremental Feature Learning with Denoising Autoencoders

Guanyu Zhou; Kihyuk Sohn; Honglak Lee

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1453-1461, 2012.

Abstract

While determining model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm to determine the optimal model complexity for large-scale, online datasets based on the denoising autoencoder. This algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the objective function’s residual and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders for classification tasks. Further, the algorithm is particularly effective in recognizing new patterns when the data distribution changes over time in the massive online data stream.
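The two processes named in the abstract, adding features and merging features, can be illustrated with a minimal sketch. The abstract does not say how similarity is measured, how a merged feature is formed, or how new features are initialized, so the choices below (cosine similarity between weight vectors, averaging the merged pair, small random initialization) and every function name are assumptions made for illustration, not the authors' method. Training of the denoising autoencoder itself is omitted.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def merge_similar_features(weights, threshold=0.95):
    """Merge the single most similar pair of features, if similar enough.

    Assumption: similarity is the cosine similarity of the weight vectors,
    and the merged feature is the average of the pair. Returns a new list.
    """
    best_pair, best_sim = None, threshold
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            sim = cosine_similarity(weights[i], weights[j])
            if sim >= best_sim:
                best_pair, best_sim = (i, j), sim
    if best_pair is None:
        return list(weights)  # no pair reaches the threshold
    i, j = best_pair
    merged = [(a + b) / 2.0 for a, b in zip(weights[i], weights[j])]
    kept = [w for k, w in enumerate(weights) if k not in (i, j)]
    return kept + [merged]


def add_features(weights, n_new, n_inputs, rng, scale=0.01):
    """Append n_new features with small random weights (hypothetical init)."""
    new = [[rng.uniform(-scale, scale) for _ in range(n_inputs)]
           for _ in range(n_new)]
    return list(weights) + new


rng = random.Random(0)
weights = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
weights = merge_similar_features(weights)  # the first two are nearly parallel
weights = add_features(weights, n_new=2, n_inputs=2, rng=rng)
print(len(weights))  # 4
```

In this toy run the merge step reduces three features to two and the add step grows the set to four. In an online setting the two steps would alternate with training on incoming data so that the feature count can grow or shrink over time.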

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-zhou12b,
  title = 	 {Online Incremental Feature Learning with Denoising Autoencoders},
  author = 	 {Zhou, Guanyu and Sohn, Kihyuk and Lee, Honglak},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1453--1461},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/zhou12b/zhou12b.pdf},
  url = 	 {https://proceedings.mlr.press/v22/zhou12b.html},
  abstract = 	 {While determining model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm to determine the optimal model complexity for large-scale, online datasets based on the denoising autoencoder. This algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the objective function’s residual and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders for classification tasks. Further, the algorithm is particularly effective in recognizing new patterns when the data distribution changes over time in the massive online data stream.}
}

Endnote

%0 Conference Paper
%T Online Incremental Feature Learning with Denoising Autoencoders
%A Guanyu Zhou
%A Kihyuk Sohn
%A Honglak Lee
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-zhou12b
%I PMLR
%P 1453--1461
%U https://proceedings.mlr.press/v22/zhou12b.html
%V 22
%X While determining model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm to determine the optimal model complexity for large-scale, online datasets based on the denoising autoencoder. This algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the objective function’s residual and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders for classification tasks. Further, the algorithm is particularly effective in recognizing new patterns when the data distribution changes over time in the massive online data stream.

RIS


TY  - CPAPER
TI  - Online Incremental Feature Learning with Denoising Autoencoders
AU  - Guanyu Zhou
AU  - Kihyuk Sohn
AU  - Honglak Lee
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-zhou12b
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1453
EP  - 1461
L1  - http://proceedings.mlr.press/v22/zhou12b/zhou12b.pdf
UR  - https://proceedings.mlr.press/v22/zhou12b.html
AB  - While determining model complexity is an important problem in machine learning, many feature learning algorithms rely on cross-validation to choose an optimal number of features, which is usually infeasible for online learning from a massive stream of data. In this paper, we propose an incremental feature learning algorithm to determine the optimal model complexity for large-scale, online datasets based on the denoising autoencoder. This algorithm is composed of two processes: adding features and merging features. Specifically, it adds new features to minimize the objective function’s residual and merges similar features to obtain a compact feature representation and prevent over-fitting. Our experiments show that the model quickly converges to the optimal number of features in a large-scale online setting, and outperforms the (non-incremental) denoising autoencoder, as well as deep belief networks and stacked denoising autoencoders for classification tasks. Further, the algorithm is particularly effective in recognizing new patterns when the data distribution changes over time in the massive online data stream.
ER  -

APA


Zhou, G., Sohn, K. & Lee, H. (2012). Online Incremental Feature Learning with Denoising Autoencoders. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1453-1461. Available from https://proceedings.mlr.press/v22/zhou12b.html.
