Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates

Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:16250-16274, 2024.

Abstract

We study the problem of efficiently computing the derivative of the fixed point of a parametric nondifferentiable contraction map. This problem has wide applications in machine learning, including hyperparameter optimization, meta-learning, and data poisoning attacks. We analyze two popular approaches: iterative differentiation (ITD) and approximate implicit differentiation (AID). A key challenge in the nonsmooth setting is that the chain rule no longer holds. We build upon the work by Bolte et al. (2022), who prove linear convergence of nonsmooth ITD under a piecewise Lipschitz smooth assumption. In the deterministic case, we provide a linear rate for AID and an improved linear rate for ITD which closely match those for the smooth setting. We further introduce NSID, a new stochastic method to compute the implicit derivative when the contraction map is defined as the composition of an outer map and an inner map that is accessible only through an unbiased stochastic estimator. We establish rates for the convergence of NSID, encompassing the best available rates in the smooth setting. We also present illustrative experiments confirming our analysis.
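To make the two deterministic approaches concrete, the following minimal sketch (an illustration added here, not code from the paper) contrasts ITD and AID on a one-dimensional nonsmooth contraction phi(w, lam) = |w|/2 + lam, whose fixed point for lam > 0 is w(lam) = 2*lam with derivative 2. ITD propagates a derivative through the fixed-point iterations, while AID first approximates the fixed point and then solves the linear fixed-point equation for the derivative. All names (phi, itd, aid, the iteration counts) are illustrative assumptions.

import numpy as np

def phi(w, lam):
    # Contraction in w with constant 1/2; nonsmooth at w = 0.
    return 0.5 * np.abs(w) + lam

def dphi_dw(w):
    # One selection from the Clarke Jacobian of phi with respect to w.
    return 0.5 * np.sign(w)

def dphi_dlam(w):
    # Derivative of phi with respect to lam.
    return 1.0

def itd(lam, T=50):
    # Iterative differentiation: propagate a derivative alongside the
    # fixed-point iterations w_{t+1} = phi(w_t, lam).
    w, dw = 0.0, 0.0
    for _ in range(T):
        dw = dphi_dw(w) * dw + dphi_dlam(w)  # chain rule through one step
        w = phi(w, lam)
    return w, dw

def aid(lam, T=50, K=50):
    # Approximate implicit differentiation: first approximate the fixed point,
    # then solve dw = dphi_dw(w) * dw + dphi_dlam(w) by fixed-point iterations.
    w = 0.0
    for _ in range(T):
        w = phi(w, lam)
    dw = 0.0
    for _ in range(K):
        dw = dphi_dw(w) * dw + dphi_dlam(w)
    return w, dw

if __name__ == "__main__":
    lam = 1.0
    # For lam > 0 the fixed point is w(lam) = 2 * lam, so dw/dlam = 2.
    print(itd(lam))  # approximately (2.0, 2.0)
    print(aid(lam))  # approximately (2.0, 2.0)

On this toy map both procedures converge linearly to the true derivative; the paper's contribution is to establish such rates for general nonsmooth contractions and for the stochastic variant NSID.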

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-grazzi24a,
  title     = {Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates},
  author    = {Grazzi, Riccardo and Pontil, Massimiliano and Salzo, Saverio},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {16250--16274},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/grazzi24a/grazzi24a.pdf},
  url       = {https://proceedings.mlr.press/v235/grazzi24a.html}
}
APA
Grazzi, R., Pontil, M. & Salzo, S. (2024). Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:16250-16274. Available from https://proceedings.mlr.press/v235/grazzi24a.html.