Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2712-2721, 2016.
Abstract
In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
@InProceedings{pmlr-v48-guha16,
title = {Robust Random Cut Forest Based Anomaly Detection on Streams},
author = {Sudipto Guha and Nina Mishra and Gourav Roy and Okke Schrijvers},
booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
pages = {2712--2721},
year = {2016},
editor = {Maria Florina Balcan and Kilian Q. Weinberger},
volume = {48},
series = {Proceedings of Machine Learning Research},
address = {New York, New York, USA},
month = {20--22 Jun},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v48/guha16.pdf},
url = {http://proceedings.mlr.press/v48/guha16.html},
abstract = {In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.}
}
%0 Conference Paper
%T Robust Random Cut Forest Based Anomaly Detection on Streams
%A Sudipto Guha
%A Nina Mishra
%A Gourav Roy
%A Okke Schrijvers
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-guha16
%I PMLR
%J Proceedings of Machine Learning Research
%P 2712--2721
%U http://proceedings.mlr.press
%V 48
%W PMLR
%X In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
TY  - CPAPER
TI  - Robust Random Cut Forest Based Anomaly Detection on Streams
AU  - Sudipto Guha
AU  - Nina Mishra
AU  - Gourav Roy
AU  - Okke Schrijvers
BT  - Proceedings of The 33rd International Conference on Machine Learning
PY  - 2016/06/11
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-guha16
PB  - PMLR
SP  - 2712
DP  - PMLR
EP  - 2721
L1  - http://proceedings.mlr.press/v48/guha16.pdf
UR  - http://proceedings.mlr.press/v48/guha16.html
AB  - In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
ER  -
Guha, S., Mishra, N., Roy, G. & Schrijvers, O. (2016). Robust Random Cut Forest Based Anomaly Detection on Streams. Proceedings of The 33rd International Conference on Machine Learning, in PMLR 48:2712-2721.