Learning to Locate Relative Outliers

Shukai Li, Ivor W. Tsang
Proceedings of the Asian Conference on Machine Learning, PMLR 20:47-62, 2011.

Abstract

Outliers usually spread across regions of low density. However, due to the absence or scarcity of outliers, designing a robust detector to sift outliers from a given dataset remains very challenging. In this paper, we consider identifying relative outliers in a target dataset with respect to another reference dataset of normal data. In particular, we employ the Maximum Mean Discrepancy (MMD) to match the distributions of these two datasets and present a novel learning framework for training a relative outlier detector. The learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To this end, we propose an effective procedure to find a largely violated labeling vector for identifying relative outliers among abundant normal patterns, and we also establish its convergence. A set of largely violated labeling vectors is then combined by multiple kernel learning methods to robustly locate relative outliers. Comprehensive empirical studies on real-world datasets verify that our proposed relative outlier detector outperforms existing methods.
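The detector described in the abstract is built on matching distributions via the Maximum Mean Discrepancy. As a rough, self-contained illustration of the quantity being matched (not the paper's actual detector, MIP formulation, or multiple kernel learning step), the following Python sketch computes the standard biased empirical estimate of the squared MMD between a target sample and a reference sample of normal data, using a Gaussian (RBF) kernel. The function names, toy data, and kernel bandwidth are illustrative assumptions, not taken from the paper.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    sq_dists = (np.sum(A ** 2, axis=1)[:, None]
                + np.sum(B ** 2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    # Biased empirical estimate of the squared MMD between samples X and Y:
    # mean(k(x, x')) - 2 * mean(k(x, y)) + mean(k(y, y')).
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())

# Toy usage (hypothetical data): a clean reference sample versus a target sample
# containing a small cluster of relative outliers shifted away from the reference.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 5))
target = np.vstack([rng.normal(0.0, 1.0, size=(190, 5)),
                    rng.normal(5.0, 1.0, size=(10, 5))])
print(mmd2(target, reference, gamma=0.1))  # larger value -> larger distribution mismatch

Loosely speaking, the points in the target sample most responsible for a large MMD are candidate relative outliers; the paper formalizes this intuition as a mixed integer program and combines largely violated labeling vectors via multiple kernel learning.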

Cite this Paper


BibTeX
@InProceedings{pmlr-v20-li11,
  title     = {Learning to Locate Relative Outliers},
  author    = {Li, Shukai and Tsang, Ivor W.},
  booktitle = {Proceedings of the Asian Conference on Machine Learning},
  pages     = {47--62},
  year      = {2011},
  editor    = {Hsu, Chun-Nan and Lee, Wee Sun},
  volume    = {20},
  series    = {Proceedings of Machine Learning Research},
  address   = {South Garden Hotels and Resorts, Taoyuan, Taiwan},
  month     = {14--15 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v20/li11/li11.pdf},
  url       = {https://proceedings.mlr.press/v20/li11.html},
  abstract  = {Outliers usually spread across regions of low density. However, due to the absence or scarcity of outliers, designing a robust detector to sift outliers from a given dataset remains very challenging. In this paper, we consider identifying relative outliers in a target dataset with respect to another reference dataset of normal data. In particular, we employ the Maximum Mean Discrepancy (MMD) to match the distributions of these two datasets and present a novel learning framework for training a relative outlier detector. The learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To this end, we propose an effective procedure to find a largely violated labeling vector for identifying relative outliers among abundant normal patterns, and we also establish its convergence. A set of largely violated labeling vectors is then combined by multiple kernel learning methods to robustly locate relative outliers. Comprehensive empirical studies on real-world datasets verify that our proposed relative outlier detector outperforms existing methods.}
}
Endnote
%0 Conference Paper
%T Learning to Locate Relative Outliers
%A Shukai Li
%A Ivor W. Tsang
%B Proceedings of the Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2011
%E Chun-Nan Hsu
%E Wee Sun Lee
%F pmlr-v20-li11
%I PMLR
%P 47--62
%U https://proceedings.mlr.press/v20/li11.html
%V 20
%X Outliers usually spread across regions of low density. However, due to the absence or scarcity of outliers, designing a robust detector to sift outliers from a given dataset remains very challenging. In this paper, we consider identifying relative outliers in a target dataset with respect to another reference dataset of normal data. In particular, we employ the Maximum Mean Discrepancy (MMD) to match the distributions of these two datasets and present a novel learning framework for training a relative outlier detector. The learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To this end, we propose an effective procedure to find a largely violated labeling vector for identifying relative outliers among abundant normal patterns, and we also establish its convergence. A set of largely violated labeling vectors is then combined by multiple kernel learning methods to robustly locate relative outliers. Comprehensive empirical studies on real-world datasets verify that our proposed relative outlier detector outperforms existing methods.
RIS
TY - CPAPER
TI - Learning to Locate Relative Outliers
AU - Shukai Li
AU - Ivor W. Tsang
BT - Proceedings of the Asian Conference on Machine Learning
DA - 2011/11/17
ED - Chun-Nan Hsu
ED - Wee Sun Lee
ID - pmlr-v20-li11
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 20
SP - 47
EP - 62
L1 - http://proceedings.mlr.press/v20/li11/li11.pdf
UR - https://proceedings.mlr.press/v20/li11.html
AB - Outliers usually spread across regions of low density. However, due to the absence or scarcity of outliers, designing a robust detector to sift outliers from a given dataset remains very challenging. In this paper, we consider identifying relative outliers in a target dataset with respect to another reference dataset of normal data. In particular, we employ the Maximum Mean Discrepancy (MMD) to match the distributions of these two datasets and present a novel learning framework for training a relative outlier detector. The learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To this end, we propose an effective procedure to find a largely violated labeling vector for identifying relative outliers among abundant normal patterns, and we also establish its convergence. A set of largely violated labeling vectors is then combined by multiple kernel learning methods to robustly locate relative outliers. Comprehensive empirical studies on real-world datasets verify that our proposed relative outlier detector outperforms existing methods.
ER -
APA
Li, S. & Tsang, I.W. (2011). Learning to Locate Relative Outliers. Proceedings of the Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 20:47-62. Available from https://proceedings.mlr.press/v20/li11.html.

Related Material

Download PDF: http://proceedings.mlr.press/v20/li11/li11.pdf