Proceedings of Machine Learning Research

Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications on 11-13 June 2018
Published as Volume 91 by the Proceedings of Machine Learning Research on 07 June 2018.
Volume Edited by: Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Evgueni Smirnov, Ralf Peeters
Series Editors: Neil D. Lawrence, Mark Reid
https://proceedings.mlr.press/v91/
Tue, 12 Aug 2025 13:54:20 +0000

Conformal feature-selection wrappers for instance transfer
In this paper we propose a new method of conformal feature-selection wrappers for instance transfer (CFSWIT). Given target and source data, the method optimally selects features and source data that are relevant for a classification model. The CFSWIT method is model-independent. It was tested experimentally for several types of classifiers. The experiments show that the CFSWIT method is capable of outperforming standard instance transfer methods.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/zhou18a.html

Interpolation error of Gaussian process regression for misspecified case
An interpolation error is an integral of the squared error of a regression model over a domain of interest. We consider the interpolation error for the case of misspecified Gaussian process regression, in which the covariance function used differs from the true one. We derive the interpolation error for a grid design of experiments for an arbitrary covariance function. Then we consider particular types of covariance functions from theoretical and practical points of view. For the $\textit{Matern}_{1/2}$ covariance function, poor estimation of parameters only slightly affects the quality of interpolation. For the most common covariance functions, including the $\textit{Matern}_{3/2}$ and squared exponential covariance functions, a poor choice of covariance function parameters leads to poor interpolation quality.
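As a sketch of the quantity this abstract describes, the interpolation error over a domain can be written as below. The symbols $\Omega$ (the domain of interest), $f$ (the true function), and $\hat{f}$ (the regression model) are notational assumptions made here for illustration; the paper itself may normalize the integral or take an expectation over realizations of the process.

$$
\varepsilon(\hat{f}) = \int_{\Omega} \bigl(f(x) - \hat{f}(x)\bigr)^{2} \, dx
$$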
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/zaytsev18a.html

Conformal predictive decision making
This note explains how conformal predictive distributions can be used for the purpose of decision making. A major limitation of conformal predictive distributions is that, at this time, they are applicable only to regression problems, where the label is a real number; however, this does not prevent them from being used in a general problem of decision making. The resulting methodology of conformal predictive decision making is illustrated on a small benchmark data set. Our main theoretical observation is that there exists an asymptotically efficient predictive decision-making system that can be obtained using our methodology (and that therefore satisfies the standard property of validity).
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/vovk18b.html

Cross-conformal predictive distributions
Conformal predictive systems are a recent modification of conformal predictors that output, in regression problems, probability distributions for labels of test observations rather than set predictions. The extra information provided by conformal predictive systems may be useful, e.g., in decision-making problems. Conformal predictive systems inherit the relative computational inefficiency of conformal predictors. In this paper we discuss two computationally efficient versions of conformal predictive systems, which we call split conformal predictive systems and cross-conformal predictive systems, and discuss their advantages and limitations.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/vovk18a.html

Transfer learning for the probabilistic classification vector machine
Transfer learning is focused on the reuse of supervised learning models in a new context.
Prominent applications can be found in robotics, image processing, or web mining. In these fields, the learning scenarios change naturally but often remain related to each other, which motivates the reuse of existing supervised models. Current transfer learning methods are neither well suited to nor commonly used with sparse and interpretable models. Sparsity is very desirable if the methods have to be used in technically limited environments, and interpretability is becoming more critical due to privacy regulations. In this work, we show how transfer learning can be integrated into the sparse and interpretable probabilistic classification vector machine, and we compare it with different standard benchmarks in the field.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/raab18a.html

Inductive Venn-Abers predictive distribution
Venn predictors are a distribution-free probabilistic prediction framework that transforms the output of a scoring classifier into a (multi-)probabilistic prediction that has calibration guarantees, with the only requirement of an i.i.d. assumption for calibration and test data. In this paper, we extend the framework from classification (where probabilities are predicted for a discrete number of labels) to regression (where labels form a continuum). We show how Venn Predictors can be applied on top of any regression method to obtain calibrated predictive distributions, without requiring assumptions beyond i.i.d. of calibration and test sets. This is contrasted with methods such as Bayesian Linear Regression, for which the calibration guarantee instead relies on specific probabilistic assumptions on the distribution of the data. The adaptation of the Venn Machine to regression required a theoretical analysis of the transductive and inductive forms of the predictor. We identify potential consistency problems and provide solutions for them.
Finally, to illustrate their advantages, we apply regression Venn Predictors to the medical problem of predicting the survival time after Percutaneous Coronary Intervention, a potentially risky procedure that improves blood flow to a patient’s heart. The predictive distributions obtained with this method allow a variety of interpretations that include the probability of survival time exceeding a chosen threshold or the shortest survival time guaranteed with a given probability.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/nouretdinov18a.html

Cover your cough: detection of respiratory events with confidence using a smartwatch
Coughing and sneezing are the most common means of spreading respiratory diseases amongst humans. Existing approaches to detecting coughing and sneezing events are either intrusive or do not provide any reliability measure. This paper offers a novel proposal to reliably and non-intrusively detect such events using a smartwatch as the underlying hardware and Conformal Prediction as the underlying software. We rigorously analysed the performance of our proposal with the Harvard ESC Environmental Sound dataset and real coughing samples taken from a smartwatch in different ambient noises.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/nguyen18a.html

Conformal stacked weather forecasting
In this paper we propose to apply the stacking method to aggregating multi-output predictions from different weather-forecasting domains (websites). Depending on the aggregating procedure (non-conformal/conformal), the results can be bare multi-output predictions or multi-output prediction regions. The experiments show the applicability of the stacking method on real data related to eight weather-forecasting domains.
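To illustrate in general terms how a conformal procedure turns a bare point prediction into a prediction region, here is a minimal split-conformal sketch for a single real-valued output. It is not the stacking method of the paper: the function name `split_conformal_interval`, its arguments, and the choice of absolute residuals as nonconformity scores are all assumptions made for illustration.

```python
import math


def split_conformal_interval(cal_preds, cal_targets, test_pred, epsilon):
    """Return (lower, upper) around test_pred with target coverage 1 - epsilon.

    Hypothetical helper: absolute residuals on a held-out calibration set
    serve as nonconformity scores.
    """
    scores = sorted(abs(y - p) for p, y in zip(cal_preds, cal_targets))
    n = len(scores)
    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - epsilon))
    if k > n:
        # Too few calibration points for this epsilon: the region is unbounded.
        return (-math.inf, math.inf)
    half_width = scores[k - 1]
    return (test_pred - half_width, test_pred + half_width)
```

With nine calibration residuals and `epsilon = 0.5`, the fifth smallest residual becomes the half-width. A very small `epsilon` relative to the calibration set size yields an unbounded interval, which is the price of keeping the coverage guarantee.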
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/neeven18a.html

Conformal prediction in manifold learning
The paper presents a geometrically motivated view on conformal prediction applied to nonlinear multi-output regression tasks for obtaining a valid measure of accuracy of Manifold Learning Regression algorithms. The regression task considered is to estimate an unknown smooth mapping $\mathbf{f}$ from $q$-dimensional inputs $\mathbf{x}\in \mathbf{X}$ to $m$-dimensional outputs $\mathbf{y} = \mathbf{f}(\mathbf{x})$ based on a training dataset $\mathbf{Z}_{(n)}$ consisting of ``input-output'' pairs $\{Z_i = (\mathbf{x}_i, \mathbf{y}_i = \mathbf{f}(\mathbf{x}_i))^{\mathrm{T}}, i = 1, 2, \ldots , n\}$. The Manifold Learning Regression (MLR) algorithm solves this task using a manifold learning technique. First, the unknown $q$-dimensional regression manifold $\mathbf{M}(\mathbf{f}) = \{(\mathbf{x}, \mathbf{f}(\mathbf{x}))^{\mathrm{T}}\in\mathbb{R}^{q+m}: \mathbf{x}\in \mathbf{X}\subset \mathbb{R}^{q} \}$, embedded in the ambient $(q+m)$-dimensional space, is estimated from the training data $\mathbf{Z}_{(n)}$, sampled from this manifold. The constructed estimator $\mathbf{M}_{MLR}$, which is also a $q$-dimensional manifold embedded in the ambient space $\mathbb{R}^{q+m}$, is close to $\mathbf{M}$ in terms of Hausdorff distance. After that, an estimator $\mathbf{f}_{MLR}$ of the unknown function $\mathbf{f}$, mapping an arbitrary input $\mathbf{x}\in \mathbf{X}$ to the output $\mathbf{f}_{MLR}(\mathbf{x})$, is constructed as the solution to the equation $\mathbf{M}(\mathbf{f}_{MLR}) = \mathbf{M}_{MLR}$.
Conformal prediction allows constructing a prediction region for an unknown output $\mathbf{y} = \mathbf{f}(\mathbf{x})$ at an out-of-sample input point $\mathbf{x}$ for a given confidence level using a given nonconformity measure, which characterizes the extent to which an example $Z = (\mathbf{x}, \mathbf{y})^{\mathrm{T}}$ differs from examples in the known dataset $\mathbf{Z}_{(n)}$. The paper proposes a new nonconformity measure based on MLR estimators using an analog of the Bregman distance.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/kuleshov18a.html

Aggregating strategies for long-term forecasting
The article investigates an application of aggregating algorithms to the problem of long-term forecasting. We examine the classic aggregating algorithms based on exponential reweighting. For Vovk’s general aggregating algorithm, we provide a probabilistic interpretation and a generalization to long-term forecasting. For the special basic case of Vovk’s algorithm, we provide two modifications for long-term forecasting. The first is theoretically close to an optimal algorithm and is based on replication of independent copies. It provides a time-independent regret bound with respect to the best expert in the pool. The second is not optimal but is more practical (it explicitly models dependencies in observations) and has an $O(\sqrt{T})$ regret bound, where $T$ is the length of the game.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/korotin18a.html

Venn predictors for well-calibrated probability estimation trees
Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities.
The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available data sets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/johansson18a.html

Conformal prediction in learning under privileged information paradigm with applications in drug discovery
This paper explores conformal prediction in the learning under privileged information (LUPI) paradigm. We use the SVM$+$ realization of LUPI in an inductive conformal predictor, and apply it to the MNIST benchmark dataset and three datasets in drug discovery. The results show that using privileged information produces valid models and improves efficiency compared to standard SVM; however, the improvement varies between the tested datasets and is not substantial in the drug discovery applications. More importantly, using SVM$+$ in a conformal prediction framework enables valid prediction intervals at specified significance levels.
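The inductive conformal predictor mentioned in this abstract can be sketched, under simplifying assumptions, as follows. The sketch shows only the generic p-value and prediction-set step. It does not implement SVM$+$, it pools all calibration scores rather than conditioning on the label, and the names `icp_p_value` and `prediction_set` are hypothetical. The nonconformity scores are assumed to come from some underlying model, with larger values meaning "stranger".

```python
def icp_p_value(cal_scores, test_score):
    """Conformal p-value: share of scores at least as nonconforming as test_score.

    The +1 terms count the test example itself among the n + 1 examples.
    """
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def prediction_set(cal_scores, test_scores_by_label, epsilon):
    """Keep every candidate label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_label.items()
        if icp_p_value(cal_scores, score) > epsilon
    }
```

For example, with nine calibration scores, a candidate label whose score exceeds all of them gets p-value 1/10 and is excluded at `epsilon = 0.1`, while a label whose score is below all of them gets p-value 1 and is kept.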
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/gauraha18a.html

Preface
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/gammerman18a.html

Detecting seizures in EEG recordings using conformal prediction
This study examines the use of the Conformal Prediction (CP) framework for the provision of confidence information in the detection of seizures in electroencephalograph (EEG) recordings. The detection of seizures is an important task since EEG recordings of seizures are of primary interest in the evaluation of epileptic patients. However, manual review of long-term EEG recordings for detecting and analyzing seizures that may have occurred is a time-consuming process. Therefore, a technique for automatic detection of seizures in such recordings is highly beneficial since it can be used to significantly reduce the amount of data in need of manual review. Additionally, due to the infrequent and unpredictable occurrence of seizures, having high sensitivity is crucial for seizure detection systems. This is the main motivation for this study, since CP can be used for controlling the error rate of predictions and therefore guaranteeing an upper bound on the frequency of false negatives.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/eliades18a.html

Exchangeability martingales for selecting features in anomaly detection
We consider the problem of feature selection for unsupervised anomaly detection (AD) in time-series, where only normal examples are available for training. We develop a method based on exchangeability martingales that only keeps features that exhibit the same pattern (i.e., are i.i.d.) under normal conditions of the observed phenomenon.
We apply this to the problem of monitoring a Windows service and detecting anomalies it exhibits if compromised; results show that our method i) strongly improves the AD system’s performance, and ii) reduces its computational complexity. Furthermore, it gives results that are easy to interpret for analysts, and it potentially increases robustness against AD evasion attacks.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/cherubin18a.html

Venn-Abers predictors for improved compound iterative screening in drug discovery
Iterative screening, where selected hits from a given round of screening are used to enrich a compound activity prediction model for the next iteration, enables more efficient screening campaigns. The portion of the compound library that should be screened in each iteration is often decided arbitrarily, because no accurate information exists on the relationship between screening size and the number of hits to be retrieved. In this article, a novel method based on Venn-Abers predictors was used to determine the optimal number of compounds to be screened in order to get a desired number of hits. We found that Venn-Abers predictors provide accurate information to support a reliable and flexible decision about the portion size of the compound library that should be screened in each iteration. In addition, the method exhibited great ability in producing an enriched subset in terms of hits and their diversity.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/buendia18a.html

Using Venn-Abers predictors to assess cardio-vascular risk
This study investigates a method for predicting compound risk based on in vitro assay data and estimated $C_{\textit{max}}$, the maximum concentration of a drug in the body. The method makes use of Venn-Abers predictors and Support Vector Machines to compute compound risk with respect to a biological target.
The method has been applied to in vitro ion-channel data generated to assess cardiac risk and introduces a more intuitive way to reflect cardiac risk.
Thu, 07 Jun 2018 00:00:00 +0000
https://proceedings.mlr.press/v91/ahlberg18a.html