Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach

Bin Li, Julia Yu, Jie Zhang, Bin Ke
; Asian Conference on Machine Learning, PMLR 45:173-188, 2016.

Abstract

This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v45-Li15, title = {Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach}, author = {Bin Li and Julia Yu and Jie Zhang and Bin Ke}, booktitle = {Asian Conference on Machine Learning}, pages = {173--188}, year = {2016}, editor = {Geoffrey Holmes and Tie-Yan Liu}, volume = {45}, series = {Proceedings of Machine Learning Research}, address = {Hong Kong}, month = {20--22 Nov}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v45/Li15.pdf}, url = {http://proceedings.mlr.press/v45/Li15.html}, abstract = {This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.} }
Endnote
%0 Conference Paper %T Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach %A Bin Li %A Julia Yu %A Jie Zhang %A Bin Ke %B Asian Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2016 %E Geoffrey Holmes %E Tie-Yan Liu %F pmlr-v45-Li15 %I PMLR %J Proceedings of Machine Learning Research %P 173--188 %U http://proceedings.mlr.press %V 45 %W PMLR %X This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.
RIS
TY - CPAPER TI - Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach AU - Bin Li AU - Julia Yu AU - Jie Zhang AU - Bin Ke BT - Asian Conference on Machine Learning PY - 2016/02/25 DA - 2016/02/25 ED - Geoffrey Holmes ED - Tie-Yan Liu ID - pmlr-v45-Li15 PB - PMLR SP - 173 DP - PMLR EP - 188 L1 - http://proceedings.mlr.press/v45/Li15.pdf UR - http://proceedings.mlr.press/v45/Li15.html AB - This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment. ER -
APA
Li, B., Yu, J., Zhang, J. & Ke, B.. (2016). Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach. Asian Conference on Machine Learning, in PMLR 45:173-188

Related Material