An Automated System for Data Attribute Anomaly Detection

David Love, Nalin Aggarwal, Alexander Statnikov, Chao Yuan
Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance, PMLR 71:95-101, 2018.

Abstract

We introduce DataQC, an automated system for data attribute anomaly detection for the purpose of improving data quality. Large organizations can have non-standardized or inconsistent data quality checking practices across different departments. The key motivations behind the development of such a system are to 1) achieve a standard for anomaly detection, 2) facilitate quick identification of obvious anomalies, 3) reduce human judgment in data anomaly detection, and 4) facilitate prompt corrective action by data scientists. Most of the methods and techniques used during the development of this system are well known and have been widely used by finance professionals who deal with data. Our contribution is to provide a system that improves overall efficiency, interpretability, and objectivity for detecting data attribute anomalies.

Cite this Paper


BibTeX
@InProceedings{pmlr-v71-love18a,
  title     = {An Automated System for Data Attribute Anomaly Detection},
  author    = {Love, David and Aggarwal, Nalin and Statnikov, Alexander and Yuan, Chao},
  booktitle = {Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance},
  pages     = {95--101},
  year      = {2018},
  editor    = {Anandakrishnan, Archana and Kumar, Senthil and Statnikov, Alexander and Faruquie, Tanveer and Xu, Di},
  volume    = {71},
  series    = {Proceedings of Machine Learning Research},
  month     = {14 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v71/love18a/love18a.pdf},
  url       = {https://proceedings.mlr.press/v71/love18a.html},
  abstract  = {We introduce DataQC, an automated system for data attribute anomaly detection for the purpose of improving data quality. Large organizations can have non-standardized or inconsistent data quality checking practices being followed across different departments. The key motivation behind the development of such a system is to 1) achieve a standard for anomaly detection 2) facilitate quick identification of obvious anomalies 3) reduce human judgment in data anomaly detection 4) facilitate prompt corrective action by data scientists. Most of the methods and techniques used during the development of this system are well known and have been widely used by finance professionals who deal with data. Our contribution is to provide a system that improves overall efficiency, interpretability, and objectivity for detecting data attribute anomalies.}
}
Endnote
%0 Conference Paper
%T An Automated System for Data Attribute Anomaly Detection
%A David Love
%A Nalin Aggarwal
%A Alexander Statnikov
%A Chao Yuan
%B Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance
%C Proceedings of Machine Learning Research
%D 2018
%E Archana Anandakrishnan
%E Senthil Kumar
%E Alexander Statnikov
%E Tanveer Faruquie
%E Di Xu
%F pmlr-v71-love18a
%I PMLR
%P 95--101
%U https://proceedings.mlr.press/v71/love18a.html
%V 71
%X We introduce DataQC, an automated system for data attribute anomaly detection for the purpose of improving data quality. Large organizations can have non-standardized or inconsistent data quality checking practices being followed across different departments. The key motivation behind the development of such a system is to 1) achieve a standard for anomaly detection 2) facilitate quick identification of obvious anomalies 3) reduce human judgment in data anomaly detection 4) facilitate prompt corrective action by data scientists. Most of the methods and techniques used during the development of this system are well known and have been widely used by finance professionals who deal with data. Our contribution is to provide a system that improves overall efficiency, interpretability, and objectivity for detecting data attribute anomalies.
APA
Love, D., Aggarwal, N., Statnikov, A., & Yuan, C. (2018). An Automated System for Data Attribute Anomaly Detection. Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance, in Proceedings of Machine Learning Research 71:95-101. Available from https://proceedings.mlr.press/v71/love18a.html.
