Online Clustering with Experts

Anna Choromanska; Claire Monteleoni

Online Clustering with Experts

Anna Choromanska, Claire Monteleoni

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:227-235, 2012.

Abstract

Approximating the k-means clustering objective with an online learning algorithm is an open problem. We introduce a family of online clustering algorithms by extending algorithms for online supervised learning, with access to expert predictors, to the unsupervised learning setting. Instead of computing prediction errors in order to re-weight the experts, the algorithms compute an approximation to the current value of the k-means objective obtained by each expert. When the experts are batch clustering algorithms with b-approximation guarantees with respect to the k-means objective (for example, the k-means++ or k-means# algorithms), applied to a sliding window of the data stream, our algorithms obtain approximation guarantees with respect to the k-means objective. The form of these online clustering approximation guarantees is novel, and extends an evaluation framework proposed by Dasgupta as an analog to regret. Notably, our approximation bounds are with respect to the optimal k-means cost on the entire data stream seen so far, even though the algorithm is online. Our algorithm’s empirical performance tracks that of the best clustering algorithm in its expert set.

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-choromanska12,
  title = 	 {Online Clustering with Experts},
  author = 	 {Choromanska, Anna and Monteleoni, Claire},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {227--235},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/choromanska12/choromanska12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/choromanska12.html},
  abstract = 	 {Approximating the k-means clustering objective with an online learning algorithm is an open problem. We introduce a family of online clustering algorithms by extending algorithms for online supervised learning, with access to expert predictors, to the unsupervised learning setting. Instead of computing prediction errors in order to re-weight the experts, the algorithms compute an approximation to the current value of the k-means objective obtained by each expert.  When the experts are batch clustering algorithms with b-approximation guarantees with respect to the k-means objective (for example, the k-means++ or k-means# algorithms), applied to a sliding window of the data stream, our algorithms obtain approximation guarantees with respect to the k-means objective. The form of these online clustering approximation guarantees is novel, and extends an evaluation framework proposed by Dasgupta as an analog to regret. Notably, our approximation bounds are with respect to the optimal k-means cost on the entire data stream seen so far, even though the algorithm is online. Our algorithm’s empirical performance tracks that of the best clustering algorithm in its expert set.}
}

Endnote

%0 Conference Paper
%T Online Clustering with Experts
%A Anna Choromanska
%A Claire Monteleoni
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami	
%F pmlr-v22-choromanska12
%I PMLR
%P 227--235
%U https://proceedings.mlr.press/v22/choromanska12.html
%V 22
%X Approximating the k-means clustering objective with an online learning algorithm is an open problem. We introduce a family of online clustering algorithms by extending algorithms for online supervised learning, with access to expert predictors, to the unsupervised learning setting. Instead of computing prediction errors in order to re-weight the experts, the algorithms compute an approximation to the current value of the k-means objective obtained by each expert.  When the experts are batch clustering algorithms with b-approximation guarantees with respect to the k-means objective (for example, the k-means++ or k-means# algorithms), applied to a sliding window of the data stream, our algorithms obtain approximation guarantees with respect to the k-means objective. The form of these online clustering approximation guarantees is novel, and extends an evaluation framework proposed by Dasgupta as an analog to regret. Notably, our approximation bounds are with respect to the optimal k-means cost on the entire data stream seen so far, even though the algorithm is online. Our algorithm’s empirical performance tracks that of the best clustering algorithm in its expert set.

RIS


TY  - CPAPER
TI  - Online Clustering with Experts
AU  - Anna Choromanska
AU  - Claire Monteleoni
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami	
ID  - pmlr-v22-choromanska12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 227
EP  - 235
L1  - http://proceedings.mlr.press/v22/choromanska12/choromanska12.pdf
UR  - https://proceedings.mlr.press/v22/choromanska12.html
AB  - Approximating the k-means clustering objective with an online learning algorithm is an open problem. We introduce a family of online clustering algorithms by extending algorithms for online supervised learning, with access to expert predictors, to the unsupervised learning setting. Instead of computing prediction errors in order to re-weight the experts, the algorithms compute an approximation to the current value of the k-means objective obtained by each expert.  When the experts are batch clustering algorithms with b-approximation guarantees with respect to the k-means objective (for example, the k-means++ or k-means# algorithms), applied to a sliding window of the data stream, our algorithms obtain approximation guarantees with respect to the k-means objective. The form of these online clustering approximation guarantees is novel, and extends an evaluation framework proposed by Dasgupta as an analog to regret. Notably, our approximation bounds are with respect to the optimal k-means cost on the entire data stream seen so far, even though the algorithm is online. Our algorithm’s empirical performance tracks that of the best clustering algorithm in its expert set.
ER  -

APA


Choromanska, A. & Monteleoni, C.. (2012). Online Clustering with Experts. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:227-235 Available from https://proceedings.mlr.press/v22/choromanska12.html.

Online Clustering with Experts

Abstract

Cite this Paper

Related Material