Proofs as Explanations: Short Certificates for Reliable Predictions

Avrim Blum, Steve Hanneke, Chirag Pabbaraju, Donya Saless
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:401-420, 2025.

Abstract

We consider a model for explainable AI in which an explanation for a prediction $h(x)=y$ consists of a subset $S'$ of the training set $S$ (if it exists) such that \emph{all} classifiers $h' \in \mathcal{H}$ that make at most $b$ mistakes on $S'$ predict $h'(x)=y$. Such a set $S'$ serves as a \emph{proof} that $x$ indeed has label $y$ under the assumption that (1) the true target function $h^\star$ belongs to $\mathcal{H}$, and (2) $S$ contains at most $b$ noisy or corrupted points. For example, if $b=0$ and $\mathcal{H}$ is the family of linear classifiers in $\mathbb{R}^d$, and if $x$ lies inside the convex hull of the positive data points in $S$ (so that every consistent linear classifier labels $x$ as positive), then Carathéodory's theorem states that $x$ in fact lies inside the convex hull of at most $d+1$ of those points. So, a set $S'$ of size at most $d+1$ could be released as an explanation for a positive prediction, and would serve as a short proof of correctness of the prediction under the assumption of perfect realizability. In this work, we consider this problem more generally, for general hypothesis classes $\mathcal{H}$ and general values $b \geq 0$. We define the notion of the \emph{robust hollow star number} of $\mathcal{H}$ (which generalizes the standard hollow star number), show that it precisely characterizes the worst-case size of the smallest achievable certificate, and analyze its value for natural classes. We also consider worst-case distributional bounds on certificate size, as well as \emph{distribution-dependent} bounds that we show tightly control the sample size needed to obtain a certificate for any given test example. In particular, we define a notion of the \emph{certificate coefficient} $\varepsilon_x$ of an example $x$ with respect to a data distribution $\mathcal{D}$ and target function $h^\star$, and prove matching upper and lower bounds on sample size as a function of $\varepsilon_x$, $b$, and the VC dimension $d$ of $\mathcal{H}$.
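
As a concrete illustration of the Carathéodory example above (our sketch, not code from the paper): for $b=0$ and $\mathcal{H}$ the class of linear classifiers in $\mathbb{R}^d$, a certificate for a positive prediction on $x$ can be extracted by solving a small feasibility linear program whose basic feasible solution expresses $x$ as a convex combination of at most $d+1$ positive training points. The sketch below assumes NumPy/SciPy; the function name convex_hull_certificate is illustrative.

    # Minimal sketch (illustrative, not the authors' code): extract a size-(d+1)
    # certificate for a positive prediction when b = 0 and H = linear classifiers.
    import numpy as np
    from scipy.optimize import linprog

    def convex_hull_certificate(x, positives, tol=1e-9):
        """If x lies in the convex hull of `positives` (shape (m, d)), return the
        indices of at most d+1 positive points whose convex hull still contains x;
        otherwise return None (no certificate exists for a positive label)."""
        m, d = positives.shape
        # Feasibility LP: find lambda >= 0 with positives.T @ lambda = x and sum(lambda) = 1.
        A_eq = np.vstack([positives.T, np.ones((1, m))])   # (d+1) x m equality constraints
        b_eq = np.concatenate([x, [1.0]])
        res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * m, method="highs-ds")  # simplex -> basic solution
        if not res.success:
            return None
        # A basic feasible solution has at most d+1 nonzero weights (Caratheodory),
        # so the supporting points form a certificate S' of size <= d+1.
        return np.flatnonzero(res.x > tol)

    # Toy usage in R^2: x = (0.25, 0.25) lies in the convex hull of the positives,
    # so the returned certificate has at most d+1 = 3 points.
    P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
    print(convex_hull_certificate(np.array([0.25, 0.25]), P))

Any halfspace that labels the returned points positive must label $x$ positive as well, so releasing those (at most) $d+1$ points is exactly the short proof described above, under the realizability assumption.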

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-blum25a,
  title     = {Proofs as Explanations: Short Certificates for Reliable Predictions},
  author    = {Blum, Avrim and Hanneke, Steve and Pabbaraju, Chirag and Saless, Donya},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {401--420},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/blum25a/blum25a.pdf},
  url       = {https://proceedings.mlr.press/v291/blum25a.html}
}
Endnote
%0 Conference Paper
%T Proofs as Explanations: Short Certificates for Reliable Predictions
%A Avrim Blum
%A Steve Hanneke
%A Chirag Pabbaraju
%A Donya Saless
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-blum25a
%I PMLR
%P 401--420
%U https://proceedings.mlr.press/v291/blum25a.html
%V 291
APA
Blum, A., Hanneke, S., Pabbaraju, C. & Saless, D. (2025). Proofs as Explanations: Short Certificates for Reliable Predictions. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:401-420. Available from https://proceedings.mlr.press/v291/blum25a.html.