Streaming Inference for Infinite Non-Stationary Clustering

Rylan Schaeffer, Gabrielle Kaili-may Liu, Yilun Du, Scott Linderman, Ila R. Fiete
Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:310-326, 2022.

Abstract

Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we attack learning under all three conditions (unsupervised, streaming, non-stationary) in the context of clustering, also known as mixture modeling. We introduce a novel clustering algorithm that endows mixture models with the ability to create new clusters online, as demanded by the data, in a probabilistic, time-varying, and principled manner. To achieve this, we first define a novel stochastic process called the Dynamical Chinese Restaurant Process (Dynamical CRP), which is a non-exchangeable distribution over partitions of a set; next, we show that the Dynamical CRP provides a non-stationary prior over cluster assignments and yields an efficient streaming variational inference algorithm. We conclude with experiments showing that the Dynamical CRP can be applied on diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v199-schaeffer22a,
  title = {Streaming Inference for Infinite Non-Stationary Clustering},
  author = {Schaeffer, Rylan and Liu, Gabrielle Kaili-may and Du, Yilun and Linderman, Scott and Fiete, Ila R.},
  booktitle = {Proceedings of The 1st Conference on Lifelong Learning Agents},
  pages = {310--326},
  year = {2022},
  editor = {Chandar, Sarath and Pascanu, Razvan and Precup, Doina},
  volume = {199},
  series = {Proceedings of Machine Learning Research},
  month = {22--24 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v199/schaeffer22a/schaeffer22a.pdf},
  url = {https://proceedings.mlr.press/v199/schaeffer22a.html},
  abstract = {Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we attack learning under all three conditions (unsupervised, streaming, non-stationary) in the context of clustering, also known as mixture modeling. We introduce a novel clustering algorithm that endows mixture models with the ability to create new clusters online, as demanded by the data, in a probabilistic, time-varying, and principled manner. To achieve this, we first define a novel stochastic process called the Dynamical Chinese Restaurant Process (Dynamical CRP), which is a non-exchangeable distribution over partitions of a set; next, we show that the Dynamical CRP provides a non-stationary prior over cluster assignments and yields an efficient streaming variational inference algorithm. We conclude with experiments showing that the Dynamical CRP can be applied on diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.}
}
Endnote
%0 Conference Paper
%T Streaming Inference for Infinite Non-Stationary Clustering
%A Rylan Schaeffer
%A Gabrielle Kaili-may Liu
%A Yilun Du
%A Scott Linderman
%A Ila R. Fiete
%B Proceedings of The 1st Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2022
%E Sarath Chandar
%E Razvan Pascanu
%E Doina Precup
%F pmlr-v199-schaeffer22a
%I PMLR
%P 310--326
%U https://proceedings.mlr.press/v199/schaeffer22a.html
%V 199
%X Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we attack learning under all three conditions (unsupervised, streaming, non-stationary) in the context of clustering, also known as mixture modeling. We introduce a novel clustering algorithm that endows mixture models with the ability to create new clusters online, as demanded by the data, in a probabilistic, time-varying, and principled manner. To achieve this, we first define a novel stochastic process called the Dynamical Chinese Restaurant Process (Dynamical CRP), which is a non-exchangeable distribution over partitions of a set; next, we show that the Dynamical CRP provides a non-stationary prior over cluster assignments and yields an efficient streaming variational inference algorithm. We conclude with experiments showing that the Dynamical CRP can be applied on diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.
APA
Schaeffer, R., Liu, G.K., Du, Y., Linderman, S. & Fiete, I.R. (2022). Streaming Inference for Infinite Non-Stationary Clustering. Proceedings of The 1st Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 199:310-326. Available from https://proceedings.mlr.press/v199/schaeffer22a.html.
