Proceedings of Machine Learning Research
Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications
Held virtually on 8-10 September 2021
Published as Volume 152 of the Proceedings of Machine Learning Research on 20 September 2021.
Volume Edited by:
Lars Carlsson
Zhiyuan Luo
Giovanni Cherubin
Khuong An Nguyen
Series Editors:
Neil D. Lawrence
Mark Reid
https://proceedings.mlr.press/v152/
Confidence machine learning for cutting tool life prediction
This work aims to develop an automatic cutting tool life prediction model for die-cutting machines at Parafix. Such a model will estimate how long a given tool is likely to last, in order to improve performance and productivity. This work is part of the KTP (Knowledge Transfer Partnership) project between Parafix and the University of Brighton.
https://proceedings.mlr.press/v152/wilson21a.html

Evaluation of updating strategies for conformal predictive systems in the presence of extreme events
Six different strategies for updating split conformal predictive systems in an online (streaming) setting are evaluated. The updating strategies vary in the extent and frequency of retraining as well as in how training data is split into proper training and calibration sets. An empirical evaluation is presented, considering passenger booking data from a ferry company, which stretches over a number of years. The passenger volumes changed drastically during 2020 due to COVID-19, and part of the evaluation focuses on which updating strategies work best under such circumstances. Some strategies are observed to outperform others with respect to continuous ranked probability score and validity, highlighting the potential value of choosing a proper strategy.
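For intuition about the object being updated, a split conformal predictive system can be sketched as below. The function names and the toy constant-mean predictor are illustrative only, not the paper's code; the paper's contribution concerns how the proper-training/calibration split is refreshed over time, which this sketch does not show.

```python
import numpy as np

def split_cps(train_X, train_y, cal_X, cal_y, fit, predict):
    """Split conformal predictive system: fit on the proper training set,
    then turn sorted calibration residuals into a predictive CDF."""
    model = fit(train_X, train_y)
    residuals = np.sort(np.asarray(cal_y) - predict(model, cal_X))

    def cdf(x, y):
        # Fraction of calibration residuals not exceeding y minus the point prediction
        return np.searchsorted(residuals, y - predict(model, x), side="right") / (len(residuals) + 1)

    return cdf

# Toy usage with a constant mean predictor (illustrative only)
fit = lambda X, y: float(np.mean(y))
predict = lambda m, X: m if np.isscalar(X) else np.full(len(X), m)
cdf = split_cps([0, 1, 2], [1, 2, 3], [0, 1, 2, 3], [1, 2, 3, 4], fit, predict)
```

Updating strategies then differ in when `fit` is re-run and which examples land in the calibration set.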
https://proceedings.mlr.press/v152/werner21a.html

Protected probabilistic classification
This poster proposes a way of protecting algorithms for probabilistic binary classification against changes in the data distribution.
https://proceedings.mlr.press/v152/vovk21c.html

Retrain or not retrain: conformal test martingales for change-point detection
We argue for supplementing the process of training a prediction algorithm with a scheme for detecting the moment when the distribution of the data changes and the algorithm needs to be retrained. Our proposed schemes are based on exchangeability martingales, i.e., processes that are martingales under any exchangeable distribution for the data. Our method, based on conformal prediction, is general and can be applied on top of any modern prediction algorithm. Its validity is guaranteed, and in this paper we take first steps in exploring its efficiency.
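One classical exchangeability martingale (a generic example, not the specific schemes proposed in the paper) is the power martingale, which multiplies bets of the form eps * p**(eps - 1) on the conformal p-values:

```python
import numpy as np

def power_martingale(p_values, eps=0.5):
    """Power martingale S_n = prod_i eps * p_i**(eps - 1).
    Under exchangeability the conformal p-values are distributed uniformly,
    so S_n stays small with high probability; S_n grows when p-values
    concentrate near 0, signalling that retraining may be needed."""
    p = np.asarray(p_values, dtype=float)
    return np.cumprod(eps * p ** (eps - 1))

# Moderate p-values keep the capital small; tiny p-values make it explode
calm = power_martingale([0.5] * 10)
drift = power_martingale([0.01] * 10)
```

A retraining trigger can then be a simple threshold, e.g. retrain when the martingale exceeds 100 (rejecting exchangeability at level 1/100 by Ville's inequality).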
https://proceedings.mlr.press/v152/vovk21b.html

Conformal testing in a binary model situation
Conformal testing is a way of testing the IID assumption based on conformal prediction. The topic of this paper is the experimental evaluation of the performance of conformal testing in a model situation in which IID binary observations generated from one Bernoulli distribution are followed by IID binary observations generated from another Bernoulli distribution, with the parameters of the distributions and the changepoint known or unknown. Existing conformal test martingales can be used for this task and work well in simple cases, but their efficiency can be improved greatly.
https://proceedings.mlr.press/v152/vovk21a.html

Conformal changepoint detection in continuous model situations
Conformal prediction provides a way of testing the IID assumption, which is the standard assumption in machine learning. A natural question is whether this way of testing is efficient. A typical situation in which the IID assumption is broken is the existence of a changepoint at which the distribution of the data changes. We study the case of a change from one continuous distribution to another, with both distributions belonging to standard parametric families. Our conclusion is that the conformal approach to testing the IID assumption is efficient, at least to some degree.
https://proceedings.mlr.press/v152/nouretdinov21a.html

Class-wise confidence for debt prediction in real estate management: discussion and lessons learned from an application
The prediction of tenants likely to fall into a debt situation is a key issue for social property owners in real estate. It is even more important for them to limit the number of people falsely predicted to be in debt, to avoid incurring unnecessary costs (in time and money), for instance by sending agents to prevent the debt. In this paper, we adapt Mondrian conformal prediction to control the error rate of this class while keeping a level of confidence chosen by the social property owner, or more generally by the user. We also test this small adaptation with different splitting strategies and discuss the obtained results. These are promising, in that they show that our approach can work, but they also point out difficulties: conformal prediction fails in some settings of particular interest to the end user.
https://proceedings.mlr.press/v152/messoudi21a.html

Conformal uncertainty sets for robust optimization
Decision-making under uncertainty is hugely important for any decisions sensitive to perturbations in observed data. One method of incorporating uncertainty into making optimal decisions is robust optimization, which minimizes the worst-case scenario over some \emph{uncertainty set}. We connect conformal prediction regions to robust optimization, providing finite-sample valid and conservative ellipsoidal uncertainty sets, aptly named conformal uncertainty sets. In pursuit of this connection we explicitly define the Mahalanobis distance as a potential conformity score in full conformal prediction. We also compare the coverage and optimization performance of conformal uncertainty sets, specifically generated with the Mahalanobis distance, to traditional ellipsoidal uncertainty sets on a collection of simulated robust optimization examples.
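The Mahalanobis score mentioned above can be sketched in the simpler split-conformal setting (the paper itself uses full conformal prediction; the function names here are illustrative):

```python
import numpy as np

def mahalanobis_scores(residuals):
    """Nonconformity as the Mahalanobis distance of each multivariate residual
    from the residual mean, using the empirical covariance."""
    R = np.asarray(residuals, dtype=float)
    d = R - R.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(R, rowvar=False))
    return np.einsum("ij,jk,ik->i", d, cov_inv, d) ** 0.5

def ellipsoid_radius(scores, alpha=0.1):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score
    is the radius of a finite-sample valid ellipsoidal uncertainty set."""
    s = np.sort(scores)
    k = int(np.ceil((len(s) + 1) * (1 - alpha))) - 1
    return s[min(k, len(s) - 1)]

# Toy check on 2-D Gaussian "residuals": the ellipsoid covers about 1 - alpha of them
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
r = ellipsoid_radius(mahalanobis_scores(X), alpha=0.1)
```

The resulting ellipsoid {y : (y - mu)^T Sigma^{-1} (y - mu) <= r^2} is what the robust optimizer would then minimize the worst case over.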
https://proceedings.mlr.press/v152/johnstone21a.html

Calibrating multi-class models
Predictive models communicating algorithmic confidence are very informative, but only if they are well-calibrated and sharp, i.e., provide accurate probability estimates adjusted for each instance. While almost all machine learning algorithms are able to produce probability estimates, these are often poorly calibrated, thus requiring external calibration. For multi-class problems, external calibration has typically been done using one-vs-all or all-vs-all schemes, adding to the computational complexity but also making it impossible to analyze and inspect the predictive models. In this paper, we suggest a novel approach for calibrating inherently multi-class models. Instead of providing a probability distribution over all labels, the estimation is of the probability that the class label predicted by the underlying model is correct. In an extensive empirical study, it is shown that the suggested approach, when applied to both Platt scaling and Venn-Abers, is able to improve the probability estimates from decision trees, random forests and extreme gradient boosting.
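The idea of calibrating the single event "the predicted label is correct" rather than a full label distribution can be sketched with a one-dimensional logistic map from model confidence to correctness probability. This plain gradient-descent fit is a minimal stand-in for the Platt scaling and Venn-Abers calibrators studied in the paper, not the authors' implementation:

```python
import numpy as np

def fit_platt(confidences, correct):
    """One-dimensional logistic fit P(correct | confidence) = sigmoid(a*c + b),
    trained by gradient descent on the log loss."""
    c = np.asarray(confidences, dtype=float)
    y = np.asarray(correct, dtype=float)
    a, b = 1.0, 0.0
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-(a * c + b)))
        a -= 0.5 * np.mean((p - y) * c)  # gradient of the log loss w.r.t. a
        b -= 0.5 * np.mean(p - y)        # gradient of the log loss w.r.t. b
    return lambda conf: 1.0 / (1.0 + np.exp(-(a * conf + b)))

# Toy usage: correctness is more likely at higher confidence
rng = np.random.default_rng(0)
conf = np.linspace(0.05, 0.95, 400)
correct = rng.random(400) < conf
calibrated = fit_platt(conf, correct)
```

Because only the top-label confidence is calibrated, the multi-class model itself stays inspectable, which is the point the abstract makes against one-vs-all schemes.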
https://proceedings.mlr.press/v152/johansson21a.html

Shapley-value based inductive conformal prediction
Shapley values of individual instances were recently proposed for the problem of data valuation. They were defined as the average marginal instance contributions to the performance of a given predictor. In this paper we propose to use Shapley values of individual instances as conformity scores. To compute these values efficiently and exactly, we employ a standard algorithm based on nearest neighbor classification and propose a variant of this algorithm for clustered data. Both variants are used for computing Shapley conformity scores for inductive conformal predictors. The experiments show that the Shapley-value conformity scores result in smaller prediction sets for significance level $\epsilon \leq 0.1$ compared with those produced by standard conformity scores (i.e., similarity between true and predicted output values).
https://proceedings.mlr.press/v152/jaramillo21a.html

Conformal prediction and its integration within visual analytics toolbox
Conformal prediction is a machine learning approach to report on the reliability of predictive models when applied to new cases. Machine learning techniques are gaining in complexity, and assessing their reliability may be an essential part of explaining the inner workings of predictive models. For practical purposes and dissemination of conformal prediction techniques, we must include these within easily accessible toolboxes. In machine learning, a significant subset of such toolboxes is those that use workflows and visual programming. Here, we report on an example of such a toolbox, a Python implementation of a conformal prediction library, and our initial efforts and ideas to democratize conformal prediction.
https://proceedings.mlr.press/v152/hocevar21a.html

Transformer-based conformal predictors for paraphrase detection
Transformer architectures have established themselves as the state of the art in many areas of natural language processing (NLP), including paraphrase detection (PD). However, they do not include a confidence estimation for each prediction and, in many cases, the applied models are poorly calibrated. These features are essential for numerous real-world applications. For example, when PD is used for sensitive tasks, like plagiarism detection, hate speech recognition or medical NLP, mistakes might be very costly. In this work we build several variants of transformer-based conformal predictors and study their behaviour on a standard PD dataset. We show that our models are able to produce \emph{valid} predictions while retaining the accuracy of the original transformer-based models. The proposed technique can be extended to many more NLP problems that are currently being investigated.
https://proceedings.mlr.press/v152/giovannotti21a.html

Synergy conformal prediction
Conformal prediction is a machine learning methodology that produces valid prediction regions. Ensembles of conformal predictors have been proposed to improve the informational efficiency of inductive conformal predictors by combining p-values; however, the validity of such methods has been an open problem. We introduce synergy conformal prediction, an ensemble method that combines monotonic conformity scores and is capable of producing valid prediction intervals. We study its applicability in three scenarios: where data is partitioned, where an ensemble of different machine learning methods is used, and where data is unpartitioned. We evaluate the method on 10 data sets and show that the synergy conformal predictor produces valid prediction intervals and, on partitioned data, performs well compared to the most efficient model trained on individual partitions, making it a viable approach for federated settings where data cannot be pooled. We also show that our method has advantages over current ensembles of conformal predictors by producing valid and efficient results on unpartitioned data, and that it is less computationally demanding.
https://proceedings.mlr.press/v152/gauraha21a.html

Using inductive conformal martingales for addressing concept drift in data stream classification
In this paper, we investigate the use of Inductive Conformal Martingales (ICM) with the histogram betting function for detecting the occurrence of concept drift (CD) in data stream classification. A change in the data distribution will almost surely affect the performance of our classification model, resulting in false predictions. Therefore, reliable and fast detection of the point at which a CD occurs allows effective retraining of the model to recover accuracy. Our approach is based on ICM with the histogram betting function, which is much more computationally efficient than alternative ICM approaches. To accelerate the process of detecting CD we also modify the ICM and examine different parameters of the histogram betting function. We evaluate the proposed approach on three benchmark datasets, namely STAGGER, SEA and ELEC, presenting different measures of its performance and comparing it with existing methods in the literature.
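A minimal sketch of the histogram betting function idea (function name and smoothing choice are illustrative; the paper's ICM implementation and modifications differ in details):

```python
import numpy as np

def icm_histogram(p_values, n_bins=10):
    """Inductive conformal martingale with a histogram betting function:
    the betting density for the next p-value is the Laplace-smoothed
    histogram of the p-values seen so far, so it integrates to 1 on [0, 1].
    Assumes at least one p-value; the capital path is returned."""
    counts = np.zeros(n_bins)
    capital, path = 1.0, []
    for p in p_values:
        b = min(int(p * n_bins), n_bins - 1)        # histogram bin of p
        density = (counts[b] + 1.0) / (counts.sum() + n_bins) * n_bins
        capital *= density                           # bet current capital on the density
        path.append(capital)
        counts[b] += 1                               # update histogram with p
    return np.array(path)

# p-values piling up in one bin (concept drift) make the capital explode;
# spread-out p-values keep it small
drift = icm_histogram(np.array([0.05] * 50))
calm = icm_histogram(np.linspace(0.05, 0.95, 50))
```

Only histogram counts are stored and updated, which is what makes this betting function cheap compared with, e.g., kernel density alternatives.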
https://proceedings.mlr.press/v152/eliades21a.html

A lower bound for a prediction algorithm under the Kullback-Leibler game
We obtain a lower bound for an algorithm predicting finite-dimensional distributions (i.e., points from a simplex) under Kullback-Leibler loss. The bound holds w.r.t. the class of softmax linear predictors. We then show that the bound is asymptotically matched by the Bayesian universal algorithm.
https://proceedings.mlr.press/v152/dzhamtyrova21a.html

A non-conformity approach towards post-prostatectomy metastasis estimation using a multicentre prostate cancer database
Prostate cancer is among the most common types of cancer in men worldwide. Despite the use of clinical indicators, as part of simple rule-based strategies, stratifying patients diagnosed with prostate cancer into risk groups that reliably reflect oncological prognosis remains challenging. Machine learning (ML) offers the possibility to develop estimation models based on routinely evaluated patient or tumor characteristics. In the present study, the estimation of metastasis in prostate cancer patients after primary treatment (radical prostatectomy) with the aid of Support Vector Machines (SVMs) and Conformal Predictors (CP) was evaluated. We show that the use of ML models can complement classical statistical approaches. Moreover, the application of CP on top of an underlying ML model renders a probabilistic outcome that combines the simplicity of a clinical indicator with the precision of an ML approach. The TriNetX Research Network, an electronic health records database with datasets from several United States health care organizations, was used in this study. This approach can be further adapted to support clinical decision making in prostate and other types of cancer.
https://proceedings.mlr.press/v152/chatzichristos21a.html

Preface
https://proceedings.mlr.press/v152/carlsson21a.html

Mondrian conformal predictive distributions
The distributions output by a standard (non-normalized) conformal predictive system all have the same shape but differ in location, while a normalized conformal predictive system outputs distributions that differ also in shape, through rescaling. An approach to further increasing the flexibility of the framework is proposed, called \emph{Mondrian conformal predictive distributions}, which are (standard or normalized) conformal predictive distributions formed from multiple Mondrian categories. The effectiveness of the approach is demonstrated with an application to regression forests. By forming categories through binning of the predictions, it is shown that for this model class, the use of Mondrian conformal predictive distributions significantly outperforms the use of both standard and normalized conformal predictive distributions with respect to the continuous ranked probability score. It is further shown that the use of Mondrian conformal predictive distributions results in prediction intervals as tight as those produced by normalized conformal regressors, while improving upon the point predictions of the underlying regression forest.
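The binning construction can be sketched as follows, in a simplified split-conformal form with hypothetical names (the paper applies this to regression forests and also considers normalized variants):

```python
import numpy as np

def mondrian_cpd(cal_pred, cal_y, n_bins=3):
    """Mondrian conformal predictive distribution sketch: calibration examples
    are binned on the point prediction (quantile bins), and each bin supplies
    its own empirical CDF of residuals, letting the distribution shape vary
    across categories. Assumes each bin receives some calibration examples."""
    pred = np.asarray(cal_pred, dtype=float)
    y = np.asarray(cal_y, dtype=float)
    edges = np.quantile(pred, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, pred)
    res_by_bin = {b: np.sort(y[bins == b] - pred[bins == b]) for b in range(n_bins)}

    def cdf(point_pred, value):
        # Use the residual distribution of the test object's own category
        r = res_by_bin[int(np.searchsorted(edges, point_pred))]
        return np.searchsorted(r, value - point_pred, side="right") / (len(r) + 1)

    return cdf

# Toy usage: residual is always 1, so the CDF jumps at point_pred + 1
cdf = mondrian_cpd(np.arange(30, dtype=float), np.arange(30, dtype=float) + 1.0)
```

A standard conformal predictive system would pool all residuals into one CDF; the Mondrian version lets, say, low predictions carry wider residual distributions than high ones.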
https://proceedings.mlr.press/v152/bostrom21a.html

Fast conformal classification using influence functions
We use influence functions from robust statistics to speed up full conformal prediction. Traditionally, conformal prediction requires retraining multiple leave-one-out classifiers to calculate p-values for each test point. By using influence functions, we are able to approximate this procedure and considerably reduce its time complexity.
https://proceedings.mlr.press/v152/bhatt21a.html

Approximation to object conditional validity with inductive conformal predictors
Conformal predictors are machine learning algorithms that output prediction sets with a guarantee of marginal validity for finite samples under minimal distributional assumptions. This property makes conformal predictors useful for machine learning tasks where we require reliable predictions. It would also be desirable to achieve conditional validity in the same setting, in the sense that validity of the prediction intervals remains true regardless of conditioning on any particular property of the object of the prediction. Unfortunately, it has been shown that such conditional validity is impossible to guarantee for non-trivial prediction problems for finite samples. In this article, instead of trying to achieve a strong conditional validity guarantee, an \emph{approximation} to conditional validity is considered and measured empirically. A new algorithm is introduced to do this by iteratively adjusting a conformity measure to deviations from object conditional validity measured in the training data. Experimental results are provided for three data sets that demonstrate (1) that in real-world machine learning tasks, lack of conditional validity is a measurable problem, and (2) that the proposed algorithm is effective at alleviating this problem.
https://proceedings.mlr.press/v152/bellotti21a.html
https://proceedings.mlr.press/v152/bellotti21a.htmlImpact of model-agnostic nonconformity functions on efficiency of conformal classifiers: an extensive studyThe property of conformal predictors to guarantee the required accuracy rate makes this framework attractive in various practical applications. However, this property is achieved at a price of reduction in precision. In the case of conformal classification, the system can output multiple class labels instead of one. It is also known, that the choice of nonconformity function has a major impact on the efficiency of conformal classifiers. Recently, it was shown that different model-agnostic nonconformity functions result in conformal classifiers with different characteristics. For a Neural Network-based conformal classifier, the \emph{inverse probability} (or hinge loss) allows minimizing the average number of predicted labels, and \emph{margin} results in a larger fraction of singleton predictions. In this work, we aim to further extend this study. We perform an experimental evaluation using 8 different classification algorithms and discuss when the previously observed relationship holds or not. Additionally, we propose a successful method to combine the properties of these two nonconformity functions.Mon, 20 Sep 2021 00:00:00 +0000
https://proceedings.mlr.press/v152/aleksandrova21a.html