Proceedings of Machine Learning Research

Proceedings of the Eighth Symposium on Conformal and Probabilistic Prediction and Applications, 09-11 September 2019. Published as Volume 105 of the Proceedings of Machine Learning Research on 29 August 2019.
Volume edited by: Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Evgueni Smirnov
Series editors: Neil D. Lawrence, Mark Reid
https://proceedings.mlr.press/v105/

Ensembles based on conformal instance transfer
In this paper we propose a new ensemble method based on conformal instance transfer. The method combines feature selection and source-instance selection to avoid negative transfer in a model-independent way. It was tested experimentally for different types of classifiers on several benchmark data sets. The experimental results demonstrate that the new method is capable of significantly outperforming standard instance transfer methods.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/zhou19a.html

Universally Consistent Conformal Predictive Distributions
This paper describes conformal predictive systems that are universally consistent in the sense of being consistent under any data-generating distribution, assuming that the observations are produced independently in the IID fashion. Being conformal, these predictive systems satisfy a natural property of small-sample validity, namely they are automatically calibrated in probability.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/vovk19a.html

Online Learning with Continuous Ranked Probability Score
Probabilistic forecasts in the form of probability distributions over future events have become popular in several fields of statistical science. The dissimilarity between a probability forecast and an outcome is measured by a loss function (scoring rule).
A popular example of a scoring rule for continuous outcomes is the continuous ranked probability score (CRPS). We consider the case where several competing methods produce online predictions in the form of probability distribution functions. In this paper, the problem of combining probabilistic forecasts is considered in the prediction with expert advice framework. We show that CRPS is a mixable loss function and that a time-independent upper bound on the regret of Vovk’s Aggregating Algorithm with CRPS as the loss function can therefore be obtained. We present the results of numerical experiments illustrating the proposed methods.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/v-yugin19a.html

Conformal predictor combination using Neyman–Pearson Lemma
The problem of how to advantageously combine Conformal Predictors (CP) has attracted the interest of many researchers in recent years. The challenge is to retain validity while improving efficiency. In this article a very generic method is proposed which takes advantage of a well-established result in Classical Statistical Hypothesis Testing, the Neyman–Pearson Lemma, to combine CP with maximum efficiency. The merits and the limits of the method are explored on synthetic data sets under different levels of correlation between NonConformity Measures (NCM). CP Combination via Neyman–Pearson Lemma generally outperforms other combination methods when an accurate and robust density ratio estimation method, such as the V-Matrix method, is used.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/toccaceli19a.html

Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets
Conformal Prediction is a framework that produces prediction intervals based on the output from a machine learning algorithm.
In this paper we explore the case when training data is made up of multiple parts available in different sources that cannot be pooled. We here consider the regression case and propose a method where a conformal predictor is trained on each data source independently, and where the prediction intervals are then combined into a single interval. We call the approach Non-Disclosed Conformal Prediction (NDCP), and we evaluate it on a regression dataset from the UCI machine learning repository using support vector regression as the underlying machine learning algorithm, with varying numbers of data sources and sizes. The results show that the proposed method produces conservatively valid prediction intervals, and while we cannot retain the same efficiency as when all data is used, efficiency is improved through the proposed approach as compared to predicting using a single arbitrarily chosen source.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/spjuth19a.html

Coreset-based Conformal Prediction for Large-scale Learning
As the volume of data increases rapidly, most traditional machine learning algorithms become computationally prohibitive. Furthermore, the available data can be so big that a single machine’s memory can easily be exceeded. We propose Coreset-Based Conformal Prediction, a strategy for dealing with big data by applying conformal predictors to a weighted summary of the data, namely the coreset. We compare our approach against standalone inductive conformal predictors over three large competition-grade datasets to demonstrate that our coreset-based strategy may not only significantly improve the learning speed, but also retain the validity of the predictions and the predictors’ efficiency.
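The inductive conformal predictors used as the baseline in the coreset comparison above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that nonconformity scores have already been computed on a held-out calibration set, and the function names `icp_p_value` and `prediction_set` are hypothetical.

```python
import numpy as np

def icp_p_value(cal_scores, test_score):
    """Inductive conformal p-value for one candidate label.

    cal_scores: nonconformity scores of the calibration examples.
    test_score: nonconformity score of the test object paired with
                the candidate label.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    # Fraction of scores (calibration scores plus the test score itself)
    # that are at least as large as the test score.
    return (np.sum(cal_scores >= test_score) + 1) / (len(cal_scores) + 1)

def prediction_set(cal_scores, test_scores_by_label, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_label.items()
        if icp_p_value(cal_scores, score) > significance
    }
```

Under exchangeability of calibration and test data, a set built this way misses the true label with probability at most the chosen significance level; smaller sets correspond to higher efficiency.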
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/riquelme-granada19a.html

A Deep Neural Network Conformal Predictor for Multi-label Text Classification
We investigate the use of inductive conformal prediction (ICP) for the task of multi-label text classification and present preliminary experimental results for a subset of the original Reuters-21578 data-set. Our underlying classification model is a deep neural network configuration which consists of a trainable embedding layer, a convolutional layer and two dense feed-forward layers, arranged sequentially, with sigmoid outputs representing the individual unique labels of the selected subset. Following the power-set approach, we assign nonconformity scores to label-sets from which the corresponding p-values and prediction-sets are determined and we experiment with a number of different versions of a nonconformity measure. Our results indicate a good performance for the underlying model which is carried over to the ICP without any significant accuracy loss and with the added benefits of prediction-specific confidence information. Prediction-sets are tight enough to be practically useful even though the multi-label subset contains tens of thousands of possible label combinations, and empirical error-rates confirm that our outputs are well-calibrated.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/paisios19a.html

Conformal Prediction for Students’ Grades in a Course Recommender System
Course selection can be challenging for students of Liberal Arts programs. In particular, due to the highly personalized curricula of these students, it is often difficult to assess whether or not a particular course is too advanced given their academic background.
To assist students of the liberal arts program of the University College Maastricht, Morsomme and Vazquez (2019) developed a course recommender system that suggests courses whose content matches the student’s academic interests, and issues warnings for courses that it deems too advanced. To issue warnings, the system produces point predictions for the grades that a student will receive in the courses that she/he is considering for the following term. Point predictions are estimated with regression models specific to each course which take into account the academic performance of the student along with the knowledge that she/he has acquired in previous courses. A warning is issued if the predicted grade is a fail. In this paper, we complement the system’s point predictions for grades with prediction intervals constructed using the conformal prediction framework (Vovk et al., 2005). We use the Inductive Confidence Machine (ICM) (Papadopoulos et al., 2002) with normalized nonconformity scores to construct prediction intervals that are tailored to each student. We find that the prediction intervals constructed with the ICM are valid and that their widths are related to the accuracy of the underlying regression model.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/morsomme19a.html

Interpretable and specialized conformal predictors
In real-world scenarios, interpretable models are often required to explain predictions, and to allow for inspection and analysis of the model. The overall purpose of oracle coaching is to produce highly accurate, but interpretable, models optimized for a specific test set. Oracle coaching is applicable to the very common scenario where explanations and insights are needed for a specific batch of predictions, and the input vectors for this test set are available when building the predictive model.
In this paper, oracle coaching is used for generating underlying classifiers for conformal prediction. The resulting conformal classifiers output valid label sets, i.e., the error rate on the test data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with the test set. Since validity is guaranteed for all conformal predictors, the key performance metric is efficiency, i.e., the size of the label sets, where smaller sets are more informative. The main contribution of this paper is the design of setups which ensure that exchangeability between calibration and test data is maintained when oracle-coached decision trees, which by definition utilize knowledge about the test data, are used as underlying models for conformal classifiers. Consequently, the resulting conformal classifiers retain the validity guarantees. In the experimentation, using a large number of publicly available data sets, the validity of the suggested setups is empirically demonstrated. Furthermore, the results show that the more accurate underlying models produced by oracle coaching also improved the efficiency of the corresponding conformal classifiers.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/johansson19a.html

Test statistics and p-values
We point out that the traditional notion of test statistic is too narrow, even for the purpose of conformal prediction. The most natural generalization of the traditional notion happens to be too wide. We propose another natural generalization which is arguably the widest reasonable generalization. The study is restricted to simple statistical hypotheses.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/gurevich19a.html

Split knowledge transfer in learning under privileged information framework
Learning Under Privileged Information (LUPI) enables the inclusion of additional (privileged) information when training machine learning models, data that is not available when making predictions. The methodology has been successfully applied to a diverse set of problems from various fields. SVM+ was the first realization of the LUPI paradigm which showed fast convergence but did not scale well. To address the scalability issue, knowledge transfer approaches were proposed to estimate privileged information from standard features in order to construct improved decision rules. Most available knowledge transfer methods use regression techniques and the same data for approximating the privileged features as for learning the transfer function. Inspired by the cross-validation approach, we propose to partition the training data into $K$ folds and use each fold for learning a transfer function and the remaining folds for approximations of privileged features; we refer to this as split knowledge transfer. We evaluate the method using four different experimental setups comprising one synthetic and three real datasets. The results indicate that our approach leads to improved accuracy as compared to LUPI with standard knowledge transfer.
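The fold-wise scheme described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' method: the transfer function is taken to be ordinary least squares, the $K-1$ approximations each sample receives are simply averaged, and the function name `split_knowledge_transfer` is hypothetical. The abstract does not specify any of these choices.

```python
import numpy as np

def split_knowledge_transfer(X, X_priv, n_folds=5, seed=0):
    """Approximate privileged features with fold-wise transfer functions.

    X:      (n, d) standard features.
    X_priv: (n, p) privileged features, available at training time only.
    Each fold fits a transfer function (least squares: an assumption) that
    predicts privileged features for the samples in the remaining folds.
    Averaging the K-1 predictions per sample is also an assumption.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Xb = np.hstack([X, np.ones((n, 1))])  # append an intercept column
    total = np.zeros_like(X_priv, dtype=float)
    counts = np.zeros(n)
    for fold in folds:
        # Learn the transfer function on this fold only.
        W, *_ = np.linalg.lstsq(Xb[fold], X_priv[fold], rcond=None)
        # Apply it to every sample outside the fold.
        rest = np.setdiff1d(np.arange(n), fold)
        total[rest] += Xb[rest] @ W
        counts[rest] += 1
    return total / counts[:, None]
```

The approximated privileged features returned here would then be used alongside the standard features when training the final decision rule; no sample's approximation comes from a transfer function fitted on that same sample.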
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/gauraha19a.html

Preface
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/gammerman19a.html

Applying Conformal Prediction to Control an Exoskeleton
This paper investigates the use of the Conformal Prediction (CP) framework for providing confidence measures to assist a Brain Machine Interface (BMI) in the task of controlling an exoskeleton using electroencephalogram (EEG) and electrooculogram (EOG) clips. Reliable and accurate control of assistive robotics is still an important challenge because of the noisy nature of EEG and EOG signals and the fact that any misclassification can lead to unwanted actions and serious safety risks. Therefore, a technique that complements predictions with a well-calibrated indication of how likely they are to be correct should be very beneficial for this application, as it can significantly enhance safety. Our approach consists of an Inductive Conformal Predictor (ICP) built on top of a Bidirectional Long Short Term Memory (BiLSTM) Neural Network. We conduct experiments on a dataset consisting of EEG and EOG data collected from one subject with a high spinal cord lesion.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/eliades19a.html

Competitive Online Regression under Continuous Ranked Probability Score
We consider the framework of competitive prediction, in which one provides guarantees relative to other predictive models that are called experts. We propose an algorithm that combines point predictions of an infinite pool of linear experts and outputs probability forecasts in the form of cumulative distribution functions. We evaluate the quality of probabilistic prediction by the continuous ranked probability score (CRPS), which is a widely used proper scoring rule.
We provide a strategy that allows us to “track the best expert” and derive a theoretical bound on the discounted loss of the strategy. Experimental results on synthetic data and solar power data show that the theoretical bounds of our algorithm are not violated. The algorithm also performs close to, and sometimes outperforms, the retrospectively best quantile regression.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/dzhamtyrova19a.html

Predicting with Confidence from Survival Data
Survival modeling concerns predicting whether or not an event will occur before or on a given point in time. In a recent study, the conformal prediction framework was applied to this task, and the so-called conformal random survival forest was proposed. It was empirically shown that the error level of this model is indeed very close to the provided confidence level, and also that the error for predicting each outcome, i.e., event or no-event, can be controlled separately by employing a Mondrian approach. The addressed task concerned making predictions for time points as provided by the underlying distribution. However, if one is instead interested in making predictions with respect to some specific time point, the guarantee of the conformal prediction framework no longer holds, as one is effectively considering a sample from a distribution other than the one from which the calibration instances have been drawn. In this study, we propose a modification of the approach for specific time points, which transforms the problem into a binary classification task, thereby allowing the error level to be controlled. The latter is demonstrated by an empirical investigation using both a collection of publicly available datasets and two in-house datasets from a truck manufacturing company.
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/bostrom19a.html

Abstracts of invited talks and posters
Thu, 29 Aug 2019 00:00:00 +0000 https://proceedings.mlr.press/v105/balinsky19a.html