Outlier Detection and Robust Estimation in Nonparametric Regression

Dehan Kong, Howard Bondell, Weining Shen
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:208-216, 2018.

Abstract

This paper studies outlier detection and robust estimation for nonparametric regression problems. We propose to include a subject-specific mean shift parameter for each data point such that a nonzero parameter will identify its corresponding data point as an outlier. We adopt a regularization approach by imposing a roughness penalty on the regression function and a shrinkage penalty on the mean shift parameter. An efficient algorithm is proposed to solve the double penalized regression problem. We discuss a data-driven simultaneous choice of the two regularization parameters based on a combination of generalized cross validation and a modified Bayesian information criterion. We show that the proposed method can consistently detect the outliers. In addition, we obtain minimax-optimal convergence rates for both the regression function and the mean shift parameter under regularity conditions. The estimation procedure is shown to enjoy the oracle property in the sense that the convergence rates agree with the minimax-optimal rates when the outliers (or the regression function) are known in advance. Numerical results demonstrate that the proposed method achieves the desired performance in identifying outliers under different scenarios.
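The mean-shift idea in the abstract can be illustrated with a toy sketch: model each observation as y_i = f(x_i) + γ_i + ε_i and alternate between smoothing the partial residuals y − γ to update f and soft-thresholding the residuals y − f to update γ, so a nonzero γ_i flags point i as an outlier. This sketch substitutes a Gaussian kernel smoother for the paper's roughness-penalized estimator and plain L1 soft-thresholding for the shrinkage penalty; the function names, bandwidth, and λ value are illustrative assumptions, not the authors' implementation, and the tuning parameters are fixed by hand rather than chosen by the GCV/BIC procedure the paper develops.

```python
import numpy as np

def soft_threshold(z, lam):
    # L1 shrinkage: sets small entries exactly to zero
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mean_shift_outlier_fit(x, y, bandwidth=0.05, lam=1.5, n_iter=50):
    """Alternating (backfitting-style) sketch of the double-penalized fit.

    Hypothetical stand-in for the paper's algorithm: a Nadaraya-Watson
    kernel smoother replaces the roughness-penalized regression step.
    """
    d = (x[:, None] - x[None, :]) / bandwidth
    K = np.exp(-0.5 * d**2)
    S = K / K.sum(axis=1, keepdims=True)   # row-normalized smoother matrix
    gamma = np.zeros_like(y)
    for _ in range(n_iter):
        f = S @ (y - gamma)                # smooth the outlier-corrected data
        gamma = soft_threshold(y - f, lam) # shrink the mean-shift parameters
    return f, gamma

# Simulated example: smooth signal plus three injected mean-shift outliers
rng = np.random.default_rng(0)
n = 100
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)
y[[10, 50, 80]] += 5.0                     # contaminate three points
f_hat, gamma_hat = mean_shift_outlier_fit(x, y)
outliers = np.flatnonzero(gamma_hat != 0)  # indices flagged as outliers
```

Here points whose residuals survive the soft-threshold keep a nonzero mean-shift estimate and are declared outliers, while clean points have γ̂_i = 0 and are fit by the smooth curve alone.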

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-kong18a,
  title     = {Outlier Detection and Robust Estimation in Nonparametric Regression},
  author    = {Kong, Dehan and Bondell, Howard and Shen, Weining},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {208--216},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/kong18a/kong18a.pdf},
  url       = {https://proceedings.mlr.press/v84/kong18a.html},
  abstract  = {This paper studies outlier detection and robust estimation for nonparametric regression problems. We propose to include a subject-specific mean shift parameter for each data point such that a nonzero parameter will identify its corresponding data point as an outlier. We adopt a regularization approach by imposing a roughness penalty on the regression function and a shrinkage penalty on the mean shift parameter. An efficient algorithm has been proposed to solve the double penalized regression problem. We discuss a data-driven simultaneous choice of two regularization parameters based on a combination of generalized cross validation and modified Bayesian information criterion. We show that the proposed method can consistently detect the outliers. In addition, we obtain minimax-optimal convergence rates for both the regression function and the mean shift parameter under regularity conditions. The estimation procedure is shown to enjoy the oracle property in the sense that the convergence rates agree with the minimax-optimal rates when the outliers (or regression function) are known in advance. Numerical results demonstrate that the proposed method has desired performance in identifying outliers under different scenarios.}
}
Endnote
%0 Conference Paper
%T Outlier Detection and Robust Estimation in Nonparametric Regression
%A Dehan Kong
%A Howard Bondell
%A Weining Shen
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-kong18a
%I PMLR
%P 208--216
%U https://proceedings.mlr.press/v84/kong18a.html
%V 84
%X This paper studies outlier detection and robust estimation for nonparametric regression problems. We propose to include a subject-specific mean shift parameter for each data point such that a nonzero parameter will identify its corresponding data point as an outlier. We adopt a regularization approach by imposing a roughness penalty on the regression function and a shrinkage penalty on the mean shift parameter. An efficient algorithm has been proposed to solve the double penalized regression problem. We discuss a data-driven simultaneous choice of two regularization parameters based on a combination of generalized cross validation and modified Bayesian information criterion. We show that the proposed method can consistently detect the outliers. In addition, we obtain minimax-optimal convergence rates for both the regression function and the mean shift parameter under regularity conditions. The estimation procedure is shown to enjoy the oracle property in the sense that the convergence rates agree with the minimax-optimal rates when the outliers (or regression function) are known in advance. Numerical results demonstrate that the proposed method has desired performance in identifying outliers under different scenarios.
APA
Kong, D., Bondell, H. &amp; Shen, W. (2018). Outlier Detection and Robust Estimation in Nonparametric Regression. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:208-216. Available from https://proceedings.mlr.press/v84/kong18a.html.