Robust Random Cut Forest Based Anomaly Detection on Streams

Sudipto Guha, Nina Mishra, Gourav Roy, Okke Schrijvers
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2712-2721, 2016.

Abstract

In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-guha16,
  title = {Robust Random Cut Forest Based Anomaly Detection on Streams},
  author = {Guha, Sudipto and Mishra, Nina and Roy, Gourav and Schrijvers, Okke},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages = {2712--2721},
  year = {2016},
  editor = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = {48},
  series = {Proceedings of Machine Learning Research},
  address = {New York, New York, USA},
  month = {20--22 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v48/guha16.pdf},
  url = {http://proceedings.mlr.press/v48/guha16.html},
  abstract = {In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.}
}
Endnote
%0 Conference Paper
%T Robust Random Cut Forest Based Anomaly Detection on Streams
%A Sudipto Guha
%A Nina Mishra
%A Gourav Roy
%A Okke Schrijvers
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-guha16
%I PMLR
%P 2712--2721
%U http://proceedings.mlr.press/v48/guha16.html
%V 48
%X In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
RIS
TY  - CPAPER
TI  - Robust Random Cut Forest Based Anomaly Detection on Streams
AU  - Sudipto Guha
AU  - Nina Mishra
AU  - Gourav Roy
AU  - Okke Schrijvers
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-guha16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 2712
EP  - 2721
L1  - http://proceedings.mlr.press/v48/guha16.pdf
UR  - http://proceedings.mlr.press/v48/guha16.html
AB  - In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
ER  -
APA
Guha, S., Mishra, N., Roy, G., & Schrijvers, O. (2016). Robust Random Cut Forest Based Anomaly Detection on Streams. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2712-2721. Available from http://proceedings.mlr.press/v48/guha16.html.
