Differentiating the Value Function by using Convex Duality

Sheheryar Mehmood, Peter Ochs
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3871-3879, 2021.

Abstract

We consider the differentiation of the value function for parametric optimization problems. Such problems are ubiquitous in machine learning applications such as structured support vector machines, matrix factorization, and min-min or minimax problems in general. Existing approaches for computing the derivative rely on strong assumptions on the parametric function. Therefore, in several scenarios there is no theoretical evidence that a given algorithmic differentiation strategy computes the true gradient information of the value function. We leverage a well-known result from convex duality theory to relax these conditions and to derive convergence rates of the derivative approximation for several classes of parametric optimization problems in machine learning. We demonstrate the versatility of our approach in several experiments, including non-smooth parametric functions. Even in settings where other approaches are applicable, our duality-based strategy shows favorable performance.
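To make the object of study concrete: for a smooth, strongly convex inner problem, the derivative of the value function v(θ) = min_x f(x, θ) is given by the envelope (Danskin-type) theorem as the partial derivative of f with respect to θ at the minimizer. The sketch below uses a hypothetical quadratic f chosen for illustration only; it is not the paper's algorithm, which targets the harder non-smooth settings via duality.

```python
# Minimal envelope-theorem sketch (illustrative, not the paper's method).
# Value function: v(theta) = min_x f(x, theta) with
#   f(x, theta) = 0.5*(x - theta)**2 + theta*x   (a hypothetical example).
def f(x, theta):
    return 0.5 * (x - theta) ** 2 + theta * x

def x_star(theta):
    # Unique minimizer: d/dx f = (x - theta) + theta = x, so x* = 0.
    return 0.0

def v(theta):
    return f(x_star(theta), theta)

def envelope_grad(theta):
    # Envelope theorem: v'(theta) = (d f / d theta)(x*(theta), theta)
    #                             = -(x - theta) + x  evaluated at x = x*.
    x = x_star(theta)
    return -(x - theta) + x

theta, eps = 1.7, 1e-6
fd = (v(theta + eps) - v(theta - eps)) / (2 * eps)  # finite-difference check
assert abs(fd - envelope_grad(theta)) < 1e-5
```

Here v(θ) = 0.5 θ², so both the finite-difference quotient and the envelope formula return θ; the paper's contribution is extending this kind of derivative computation, with convergence rates, beyond the smooth strongly convex case.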

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-mehmood21a,
  title     = {Differentiating the Value Function by using Convex Duality},
  author    = {Mehmood, Sheheryar and Ochs, Peter},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {3871--3879},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/mehmood21a/mehmood21a.pdf},
  url       = {https://proceedings.mlr.press/v130/mehmood21a.html},
  abstract  = {We consider the differentiation of the value function for parametric optimization problems. Such problems are ubiquitous in machine learning applications such as structured support vector machines, matrix factorization and min-min or minimax problems in general. Existing approaches for computing the derivative rely on strong assumptions of the parametric function. Therefore, in several scenarios there is no theoretical evidence that a given algorithmic differentiation strategy computes the true gradient information of the value function. We leverage a well known result from convex duality theory to relax the conditions and to derive convergence rates of the derivative approximation for several classes of parametric optimization problems in Machine Learning. We demonstrate the versatility of our approach in several experiments, including non-smooth parametric functions. Even in settings where other approaches are applicable, our duality based strategy shows a favorable performance.}
}
Endnote
%0 Conference Paper
%T Differentiating the Value Function by using Convex Duality
%A Sheheryar Mehmood
%A Peter Ochs
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-mehmood21a
%I PMLR
%P 3871--3879
%U https://proceedings.mlr.press/v130/mehmood21a.html
%V 130
%X We consider the differentiation of the value function for parametric optimization problems. Such problems are ubiquitous in machine learning applications such as structured support vector machines, matrix factorization and min-min or minimax problems in general. Existing approaches for computing the derivative rely on strong assumptions of the parametric function. Therefore, in several scenarios there is no theoretical evidence that a given algorithmic differentiation strategy computes the true gradient information of the value function. We leverage a well known result from convex duality theory to relax the conditions and to derive convergence rates of the derivative approximation for several classes of parametric optimization problems in Machine Learning. We demonstrate the versatility of our approach in several experiments, including non-smooth parametric functions. Even in settings where other approaches are applicable, our duality based strategy shows a favorable performance.
APA
Mehmood, S. & Ochs, P. (2021). Differentiating the Value Function by using Convex Duality. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3871-3879. Available from https://proceedings.mlr.press/v130/mehmood21a.html.