[edit]
Proofs as Explanations: Short Certificates for Reliable Predictions
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:401-420, 2025.
Abstract
We consider a model for explainable AI in which an explanation for a prediction $h(x)=y$ consists of a subset $S’$ of the training data (if it exists) such that {\em all} classifiers $h’ \in \cH$ that make at most $b$ mistakes on $S’$ predict $h’(x)=y$. Such a set $S’$ serves as a {\em proof} that $x$ indeed has label $y$ under the assumption that (1) the true target function $h^\star$ belongs to $\cH$, and (2) the set $S$ contains at most $b$ noisy or corrupted points. For example, if $b=0$ and $\cH$ is the family of linear classifiers in $\mathbb{R}^d$, and if $x$ lies inside the convex hull of the positive data points in $S$ (and therefore every consistent linear classifier labels $x$ as positive), then Carathéodory’s theorem states that $x$ in fact lies inside the convex hull of $d+1$ of those points. So, a set $S’$ of size $d+1$ could be released as an explanation for a positive prediction, and would serve as a short proof of correctness of the prediction under the assumption of perfect realizability. In this work, we consider this problem more generally, for general hypothesis classes $\cH$ and general values $b\geq 0$. We define the notion of the {\em robust hollow star number} of $\cH$ (which generalizes the standard hollow star number), and show that it precisely characterizes the worst-case size of the smallest certificate achievable, and analyze its size for natural classes. We also consider worst-case distributional bounds on certificate size, as well as {\em distribution-dependent} bounds that we show tightly control the sample size needed to get a certificate for any given test example. In particular, we define a notion of the {\em certificate coefficient} $\eps_x$ of an example $x$ with respect to a data distribution $\cD$ and target function $h^\star$, and prove matching upper and lower bounds on sample size as a function of $\eps_x$, $b$, and the VC dimension $d$ of $\cH$.