# Valid Inferential Models for Prediction in Supervised Learning Problems

*Proceedings of the Twelfth International Symposium on Imprecise Probability: Theories and Applications*, PMLR 147:72-82, 2021.

#### Abstract

Prediction, where observed data is used to quantify uncertainty about a future observation, is a fundamental problem in statistics. Prediction sets with coverage probability guarantees are a common solution, but these do not provide probabilistic uncertainty quantification in the sense of assigning beliefs to relevant assertions about the future observable. Alternatively, we recommend the use of a *probabilistic predictor*, a fully-specified (imprecise) probability distribution for the to-be-predicted observation given the observed data. It is essential that the probabilistic predictor is reliable or valid in some sense, and here we offer a notion of validity and explore its implications. We also provide a general inferential model construction that yields a provably valid probabilistic predictor, with illustrations in regression and classification.