[edit]
Robust Gaussian process regression with the trimmed marginal likelihood
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:67-76, 2023.
Abstract
Accurate outlier detection is not only a necessary preprocessing step, but can itself give important insights into the data. However, especially, for non-linear regression the detection of outliers is non-trivial, and actually ambiguous. We propose a new method that identifies outliers by finding a subset of data points T such that the marginal likelihood of all remaining data points S is maximized. Though the idea is more general, it is particular appealing for Gaussian processes regression, where the marginal likelihood has an analytic solution. While maximizing the marginal likelihood for hyper-parameter optimization is a well established non-convex optimization problem, optimizing the set of data points S is not. Indeed, even a greedy approximation is computationally challenging due to the high cost of evaluating the marginal likelihood. As a remedy, we propose an efficient projected gradient descent method with provable convergence guarantees. Moreover, we also establish the breakdown point when jointly optimizing hyper-parameters and S. For various datasets and types of outliers, our experiments demonstrate that the proposed method can improve outlier detection and robustness when compared with several popular alternatives like the student-t likelihood.