Online Clustering of Processes

Azadeh Khaleghi, Daniil Ryabko, Jeremie Mary, Philippe Preux
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:601-609, 2012.

Abstract

The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every time-step is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-khaleghi12,
  title     = {Online Clustering of Processes},
  author    = {Khaleghi, Azadeh and Ryabko, Daniil and Mary, Jeremie and Preux, Philippe},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {601--609},
  year      = {2012},
  editor    = {Lawrence, Neil D. and Girolami, Mark},
  volume    = {22},
  series    = {Proceedings of Machine Learning Research},
  address   = {La Palma, Canary Islands},
  month     = {21--23 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v22/khaleghi12/khaleghi12.pdf},
  url       = {https://proceedings.mlr.press/v22/khaleghi12.html},
  abstract  = {The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every time-step is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).}
}
Endnote
%0 Conference Paper
%T Online Clustering of Processes
%A Azadeh Khaleghi
%A Daniil Ryabko
%A Jeremie Mary
%A Philippe Preux
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-khaleghi12
%I PMLR
%P 601--609
%U https://proceedings.mlr.press/v22/khaleghi12.html
%V 22
%X The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every time-step is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).
RIS
TY  - CPAPER
TI  - Online Clustering of Processes
AU  - Azadeh Khaleghi
AU  - Daniil Ryabko
AU  - Jeremie Mary
AU  - Philippe Preux
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-khaleghi12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 601
EP  - 609
L1  - http://proceedings.mlr.press/v22/khaleghi12/khaleghi12.pdf
UR  - https://proceedings.mlr.press/v22/khaleghi12.html
AB  - The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every time-step is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).
ER  -
APA
Khaleghi, A., Ryabko, D., Mary, J. & Preux, P. (2012). Online Clustering of Processes. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:601-609. Available from https://proceedings.mlr.press/v22/khaleghi12.html.
