Testing Marginal and Conditional Coverage in Conformal Prediction for Non-Stationary Time Series via Value-at-Risk Backtesting

Konrad Retzlaff, Filip Schlembach, Dennis Bams, Philippe Dreesen
Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 266:725-747, 2025.

Abstract

Conformal Prediction (CP) constructs prediction intervals with marginal coverage guarantees under the assumption of exchangeability, yet it has also been widely applied to non-exchangeable settings such as time series, where temporal dependence and distribution shifts often violate this assumption. Despite this, CP methods are typically evaluated using descriptive metrics like empirical coverage and average interval width, without formal statistical testing. This lack of hypothesis-driven evaluation makes it unclear whether deviations are meaningful or due to random variation. We address this gap by establishing a formal equivalence between CP and Value-at-Risk (VaR), enabling the use of VaR-style backtesting methods to statistically assess both marginal and conditional coverage. Additionally, we incorporate Diebold-Mariano tests with interval scores to compare predictive performance. Applied to synthetic, electricity, and financial time series, our framework uncovers violation and adaptation issues overlooked by standard metrics. The Dynamic Binary Test and Geometric Conformal Backtesting, in particular, identify covariate- and drift-induced dependence and miscalibration, offering a sharper lens for evaluating CP methods in non-stationary settings.
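
To make the CP-as-VaR framing concrete, here is a minimal sketch (not taken from the paper; the function names and toy data are hypothetical) that treats conformal-interval misses as VaR violations and applies two classical tools the abstract alludes to: Kupiec's unconditional-coverage likelihood-ratio test for marginal coverage, and a Diebold-Mariano test on interval scores for comparing two interval forecasters. It does not implement the paper's Dynamic Binary Test or Geometric Conformal Backtesting; it only illustrates the general backtesting idea, assuming numpy and scipy are available.

```python
# Illustrative sketch only: conformal-prediction misses viewed as VaR violations.
import numpy as np
from scipy import stats


def kupiec_pof_test(misses, alpha):
    """Kupiec proportion-of-failures LR test for marginal coverage.

    misses : binary array, 1 if y_t fell outside its prediction interval.
    alpha  : nominal miscoverage rate (e.g. 0.10 for 90% intervals).
    Returns the LR statistic and its chi-square(1) p-value.
    """
    n, x = len(misses), int(np.sum(misses))
    pi_hat = np.clip(x / n, 1e-8, 1 - 1e-8)  # avoid log(0) in degenerate samples

    def loglik(p):
        return x * np.log(p) + (n - x) * np.log(1.0 - p)

    lr = -2.0 * (loglik(alpha) - loglik(pi_hat))
    return lr, stats.chi2.sf(lr, df=1)


def interval_score(lo, hi, y, alpha):
    """Winkler/interval score: width plus miscoverage penalties."""
    return (hi - lo) \
        + (2.0 / alpha) * np.maximum(lo - y, 0.0) \
        + (2.0 / alpha) * np.maximum(y - hi, 0.0)


def diebold_mariano(loss_a, loss_b):
    """Naive DM test on the loss differential (no HAC/Newey-West correction)."""
    d = loss_a - loss_b
    dm = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
    return dm, 2.0 * stats.norm.sf(abs(dm))


# Toy example: two hypothetical 90% interval forecasters on a synthetic series.
rng = np.random.default_rng(0)
y = rng.standard_normal(500)                              # toy target series
lo_a, hi_a = np.full(500, -1.645), np.full(500, 1.645)    # ~90% coverage
lo_b, hi_b = np.full(500, -1.3), np.full(500, 1.3)        # narrower, under-covering

misses_a = ((y < lo_a) | (y > hi_a)).astype(int)
lr, p = kupiec_pof_test(misses_a, alpha=0.10)
print(f"Kupiec LR = {lr:.2f}, p-value = {p:.3f}")         # marginal-coverage check

dm, p_dm = diebold_mariano(interval_score(lo_a, hi_a, y, 0.10),
                           interval_score(lo_b, hi_b, y, 0.10))
print(f"DM statistic = {dm:.2f}, p-value = {p_dm:.3f}")   # relative interval quality
```

The Kupiec test only checks the marginal miss rate; tests of conditional coverage, such as the paper's Dynamic Binary Test and Geometric Conformal Backtesting, additionally probe whether misses cluster or depend on covariates, which is exactly what matters under non-stationarity.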

Cite this Paper


BibTeX
@InProceedings{pmlr-v266-retzlaff25a,
  title     = {Testing Marginal and Conditional Coverage in Conformal Prediction for Non-Stationary Time Series via Value-at-Risk Backtesting},
  author    = {Retzlaff, Konrad and Schlembach, Filip and Bams, Dennis and Dreesen, Philippe},
  booktitle = {Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications},
  pages     = {725--747},
  year      = {2025},
  editor    = {Nguyen, Khuong An and Luo, Zhiyuan and Papadopoulos, Harris and Löfström, Tuwe and Carlsson, Lars and Boström, Henrik},
  volume    = {266},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--12 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v266/main/assets/retzlaff25a/retzlaff25a.pdf},
  url       = {https://proceedings.mlr.press/v266/retzlaff25a.html},
  abstract  = {Conformal Prediction (CP) constructs prediction intervals with marginal coverage guarantees under the assumption of exchangeability, yet it has also been widely applied to non-exchangeable settings such as time series, where temporal dependence and distribution shifts often violate this assumption. Despite this, CP methods are typically evaluated using descriptive metrics like empirical coverage and average interval width, without formal statistical testing. This lack of hypothesis-driven evaluation makes it unclear whether deviations are meaningful or due to random variation. We address this gap by establishing a formal equivalence between CP and Value-at-Risk (VaR), enabling the use of VaR-style backtesting methods to statistically assess both marginal and conditional coverage. Additionally, we incorporate Diebold-Mariano tests with interval scores to compare predictive performance. Applied to synthetic, electricity, and financial time series, our framework uncovers violation and adaptation issues overlooked by standard metrics. The Dynamic Binary Test and Geometric Conformal Backtesting, in particular, identify covariate- and drift-induced dependence and miscalibration, offering a sharper lens for evaluating CP methods in non-stationary settings.}
}
Endnote
%0 Conference Paper
%T Testing Marginal and Conditional Coverage in Conformal Prediction for Non-Stationary Time Series via Value-at-Risk Backtesting
%A Konrad Retzlaff
%A Filip Schlembach
%A Dennis Bams
%A Philippe Dreesen
%B Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications
%C Proceedings of Machine Learning Research
%D 2025
%E Khuong An Nguyen
%E Zhiyuan Luo
%E Harris Papadopoulos
%E Tuwe Löfström
%E Lars Carlsson
%E Henrik Boström
%F pmlr-v266-retzlaff25a
%I PMLR
%P 725--747
%U https://proceedings.mlr.press/v266/retzlaff25a.html
%V 266
%X Conformal Prediction (CP) constructs prediction intervals with marginal coverage guarantees under the assumption of exchangeability, yet it has also been widely applied to non-exchangeable settings such as time series, where temporal dependence and distribution shifts often violate this assumption. Despite this, CP methods are typically evaluated using descriptive metrics like empirical coverage and average interval width, without formal statistical testing. This lack of hypothesis-driven evaluation makes it unclear whether deviations are meaningful or due to random variation. We address this gap by establishing a formal equivalence between CP and Value-at-Risk (VaR), enabling the use of VaR-style backtesting methods to statistically assess both marginal and conditional coverage. Additionally, we incorporate Diebold-Mariano tests with interval scores to compare predictive performance. Applied to synthetic, electricity, and financial time series, our framework uncovers violation and adaptation issues overlooked by standard metrics. The Dynamic Binary Test and Geometric Conformal Backtesting, in particular, identify covariate- and drift-induced dependence and miscalibration, offering a sharper lens for evaluating CP methods in non-stationary settings.
APA
Retzlaff, K., Schlembach, F., Bams, D., & Dreesen, P. (2025). Testing Marginal and Conditional Coverage in Conformal Prediction for Non-Stationary Time Series via Value-at-Risk Backtesting. Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, in Proceedings of Machine Learning Research 266:725-747. Available from https://proceedings.mlr.press/v266/retzlaff25a.html.
