Learning Efficient Anomaly Detectors from K-NN Graphs

Jonathan Root; Jing Qian; Venkatesh Saligrama

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:790-799, 2015.

Abstract

We propose a non-parametric anomaly detection algorithm for high dimensional data. We score each datapoint by its average K-NN distance, and rank them accordingly. We then train limited complexity models to imitate these scores based on the max-margin learning-to-rank framework. A test-point is declared as an anomaly at α-false alarm level if the predicted score is in the α-percentile. The resulting anomaly detector is shown to be asymptotically optimal in that for any false alarm rate α, its decision region converges to the α-percentile minimum volume level set of the unknown underlying density. In addition, we test both the statistical performance and computational efficiency of our algorithm on a number of synthetic and real-data experiments. Our results demonstrate the superiority of our algorithm over existing K-NN based anomaly detection algorithms, with significant computational savings.
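The scoring and thresholding steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the max-margin learning-to-rank model entirely and thresholds the raw average K-NN distance instead of a predicted score. The function names, the Euclidean metric, the leave-one-out scoring of training points, the convention that a larger score is more anomalous, and the toy Gaussian data are all assumptions made for illustration.

```python
import math
import random


def avg_knn_distance(point, reference, k):
    """Average Euclidean distance from `point` to its k nearest points in `reference`."""
    dists = sorted(math.dist(point, r) for r in reference)
    return sum(dists[:k]) / k


def training_scores(train, k):
    """Score each training point against the remaining training points (leave-one-out)."""
    return [
        avg_knn_distance(x, train[:i] + train[i + 1:], k)
        for i, x in enumerate(train)
    ]


def empirical_threshold(scores, alpha):
    """Score value exceeded by roughly an alpha fraction of the training scores."""
    ordered = sorted(scores)
    index = min(len(ordered) - 1, math.ceil((1.0 - alpha) * len(ordered)))
    return ordered[index]


def is_anomaly(point, train, k, threshold):
    """Flag `point` when its average K-NN distance to the training set exceeds the threshold."""
    return avg_knn_distance(point, train, k) > threshold


# Toy data (assumed): 200 nominal points drawn from a 2-D standard Gaussian.
rng = random.Random(0)
train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
threshold = empirical_threshold(training_scores(train, k=5), alpha=0.05)

print(is_anomaly((0.0, 0.0), train, 5, threshold))    # point near the mode of the data
print(is_anomaly((10.0, 10.0), train, 5, threshold))  # point far from all training data
```

Scoring a test point this way requires a distance computation against every training point, which is the cost the paper's learned ranker is meant to avoid at test time.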

Cite this Paper

BibTeX


@InProceedings{pmlr-v38-root15,
  title = 	 {{Learning Efficient Anomaly Detectors from K-NN Graphs}},
  author = 	 {Root, Jonathan and Qian, Jing and Saligrama, Venkatesh},
  booktitle = 	 {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {790--799},
  year = 	 {2015},
  editor = 	 {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = 	 {38},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {San Diego, California, USA},
  month = 	 {09--12 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v38/root15.pdf},
  url = 	 {https://proceedings.mlr.press/v38/root15.html},
  abstract = 	 {We propose a non-parametric anomaly detection algorithm for high dimensional data. We score each datapoint by its average K-NN distance, and rank them accordingly. We then train limited complexity models to imitate these scores based on the max-margin learning-to-rank framework. A test-point is declared as an anomaly at α-false alarm level if the predicted score is in the α-percentile. The resulting anomaly detector is shown to be asymptotically optimal in that for any false alarm rate α, its decision region converges to the α-percentile minimum volume level set of the unknown underlying density. In addition, we test both the statistical performance and computational efficiency of our algorithm on a number of synthetic and real-data experiments. Our results demonstrate the superiority of our algorithm over existing K-NN based anomaly detection algorithms, with significant computational savings.}
}

Endnote

%0 Conference Paper
%T Learning Efficient Anomaly Detectors from K-NN Graphs
%A Jonathan Root
%A Jing Qian
%A Venkatesh Saligrama
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-root15
%I PMLR
%P 790--799
%U https://proceedings.mlr.press/v38/root15.html
%V 38
%X We propose a non-parametric anomaly detection algorithm for high dimensional data. We score each datapoint by its average K-NN distance, and rank them accordingly. We then train limited complexity models to imitate these scores based on the max-margin learning-to-rank framework. A test-point is declared as an anomaly at α-false alarm level if the predicted score is in the α-percentile. The resulting anomaly detector is shown to be asymptotically optimal in that for any false alarm rate α, its decision region converges to the α-percentile minimum volume level set of the unknown underlying density. In addition, we test both the statistical performance and computational efficiency of our algorithm on a number of synthetic and real-data experiments. Our results demonstrate the superiority of our algorithm over existing K-NN based anomaly detection algorithms, with significant computational savings.

RIS


TY  - CPAPER
TI  - Learning Efficient Anomaly Detectors from K-NN Graphs
AU  - Jonathan Root
AU  - Jing Qian
AU  - Venkatesh Saligrama
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-root15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 790
EP  - 799
L1  - http://proceedings.mlr.press/v38/root15.pdf
UR  - https://proceedings.mlr.press/v38/root15.html
AB  - We propose a non-parametric anomaly detection algorithm for high dimensional data. We score each datapoint by its average K-NN distance, and rank them accordingly. We then train limited complexity models to imitate these scores based on the max-margin learning-to-rank framework. A test-point is declared as an anomaly at α-false alarm level if the predicted score is in the α-percentile. The resulting anomaly detector is shown to be asymptotically optimal in that for any false alarm rate α, its decision region converges to the α-percentile minimum volume level set of the unknown underlying density. In addition, we test both the statistical performance and computational efficiency of our algorithm on a number of synthetic and real-data experiments. Our results demonstrate the superiority of our algorithm over existing K-NN based anomaly detection algorithms, with significant computational savings.
ER  -

APA


Root, J., Qian, J., & Saligrama, V. (2015). Learning Efficient Anomaly Detectors from K-NN Graphs. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:790-799. Available from https://proceedings.mlr.press/v38/root15.html.
