Importance Sampling for Nonlinear Models

Prakash Palanivelu Rajmohan, Fred Roosta
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:51039-51059, 2025.

Abstract

While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
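For readers unfamiliar with the linear baseline the abstract refers to, a minimal sketch of classical leverage-score importance sampling for a linear least-squares problem is given below. This is purely illustrative of the linear setting the paper generalizes, not the paper's nonlinear method; the matrix sizes and sample count are arbitrary assumptions.

```python
# Illustrative sketch (NOT the paper's nonlinear method): classical
# leverage-score importance sampling for linear least squares.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Leverage score of row i is the squared norm of the i-th row of U,
# where A = U S V^T is the thin SVD; the scores sum to rank(A) = d.
U, _, _ = np.linalg.svd(A, full_matrices=False)
leverage = np.sum(U**2, axis=1)
probs = leverage / leverage.sum()

# Sample s rows with replacement, rescaling so the sketch is unbiased.
s = 200
idx = rng.choice(n, size=s, p=probs)
scale = 1.0 / np.sqrt(s * probs[idx])
A_s, b_s = A[idx] * scale[:, None], b[idx] * scale

# The sampled problem approximates the full one (a subspace embedding).
x_full = np.linalg.lstsq(A, b, rcond=None)[0]
x_samp = np.linalg.lstsq(A_s, b_s, rcond=None)[0]
```

Sampling proportionally to leverage scores yields solutions close to the full-data solution while solving a system with far fewer rows; the paper's contribution is extending this notion of "importance" beyond linear maps.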

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-rajmohan25a,
  title     = {Importance Sampling for Nonlinear Models},
  author    = {Rajmohan, Prakash Palanivelu and Roosta, Fred},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {51039--51059},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/rajmohan25a/rajmohan25a.pdf},
  url       = {https://proceedings.mlr.press/v267/rajmohan25a.html},
  abstract  = {While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.}
}
Endnote
%0 Conference Paper
%T Importance Sampling for Nonlinear Models
%A Prakash Palanivelu Rajmohan
%A Fred Roosta
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-rajmohan25a
%I PMLR
%P 51039--51059
%U https://proceedings.mlr.press/v267/rajmohan25a.html
%V 267
%X While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
APA
Rajmohan, P.P. & Roosta, F. (2025). Importance Sampling for Nonlinear Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:51039-51059. Available from https://proceedings.mlr.press/v267/rajmohan25a.html.

Related Material