Measuring Stochastic Data Complexity with Boltzmann Influence Functions

Nathan Hoyen Ng, Roger Baker Grosse, Marzyeh Ghassemi
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:37553-37569, 2024.

Abstract

Estimating the uncertainty of a model’s prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
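For context, the pNML distribution mentioned in the abstract is standardly defined by (in principle) refitting the model on the training set augmented with the test point under each candidate label and then normalizing over labels; this is the textbook formulation rather than anything reproduced from this paper, and the symbols below ($\hat\theta$, $y'$, $\mathcal{D}$) are illustrative notation only:

```latex
% pNML: each candidate label y' gets to "argue its case" with its own
% refit parameters \hat\theta(x, y'); normalizing over labels yields
% the predictive distribution, and the log-normalizer is the regret
% (complexity) of the test point.
p_{\mathrm{pNML}}(y \mid x)
  = \frac{p_{\hat\theta(x, y)}(y \mid x)}
         {\sum_{y'} p_{\hat\theta(x, y')}(y' \mid x)},
\qquad
\hat\theta(x, y') = \arg\max_{\theta} \;
  p_\theta\bigl(\mathcal{D} \cup \{(x, y')\}\bigr).
```

Under this formulation, confidence drops exactly when alternative labels $y'$ are also well fit by some refit model, which is the behaviour the abstract describes; the expense of refitting per label is what motivates an influence-function approximation.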

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-ng24b,
  title     = {Measuring Stochastic Data Complexity with Boltzmann Influence Functions},
  author    = {Ng, Nathan Hoyen and Grosse, Roger Baker and Ghassemi, Marzyeh},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {37553--37569},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/ng24b/ng24b.pdf},
  url       = {https://proceedings.mlr.press/v235/ng24b.html},
  abstract  = {Estimating the uncertainty of a model’s prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.}
}
Endnote
%0 Conference Paper
%T Measuring Stochastic Data Complexity with Boltzmann Influence Functions
%A Nathan Hoyen Ng
%A Roger Baker Grosse
%A Marzyeh Ghassemi
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-ng24b
%I PMLR
%P 37553--37569
%U https://proceedings.mlr.press/v235/ng24b.html
%V 235
%X Estimating the uncertainty of a model’s prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
APA
Ng, N.H., Grosse, R.B. & Ghassemi, M. (2024). Measuring Stochastic Data Complexity with Boltzmann Influence Functions. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:37553-37569. Available from https://proceedings.mlr.press/v235/ng24b.html.