Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation

Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:280-288, 2021.

Abstract

We propose kernel distributionally robust optimization (Kernel DRO) using insights from robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, which generalize to sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization methods. We prove a theorem that generalizes the classical duality in the mathematical problem of moments. Enabled by this theorem, we reformulate the maximization over measures in DRO as a dual program that searches over RKHS functions. Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common restrictions such as polynomial losses or the need to know a Lipschitz constant. We then establish a connection between DRO and stochastic optimization with expectation constraints. Finally, we propose practical algorithms, based on both batch convex solvers and stochastic functional gradients, that apply to general optimization and machine learning tasks.
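
To make the duality result in the abstract concrete, the following is a minimal sketch in notation we adopt here (not copied from the paper): C is a closed convex ambiguity set of probability measures, delta_C its support function, and H the RKHS.

\[
\min_{\theta}\; \sup_{P \in \mathcal{C}} \mathbb{E}_{P}\big[\ell(\theta,\xi)\big]
\;=\;
\min_{\theta,\; f_0 \in \mathbb{R},\; f \in \mathcal{H}}\; f_0 + \delta_{\mathcal{C}}(f)
\quad \text{s.t.} \quad \ell(\theta,\xi) \le f_0 + f(\xi) \;\; \forall \xi,
\]

where, for example, for the MMD (RKHS-norm) ball centered at the empirical measure \(\hat{P}\),

\[
\mathcal{C} = \{P : \|\mu_P - \mu_{\hat{P}}\|_{\mathcal{H}} \le \epsilon\}
\quad \Rightarrow \quad
\delta_{\mathcal{C}}(f) = \mathbb{E}_{\hat{P}}[f] + \epsilon \,\|f\|_{\mathcal{H}}.
\]

The left-hand side is the worst-case expected loss over the ambiguity set; the dual replaces the maximization over measures with a search for an RKHS majorant of the loss.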
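As a rough illustration of the "batch convex solver" route (ours, not the authors' code), here is a sketch of the MMD-ball dual above, assuming a toy quadratic loss l(theta, xi) = (theta - xi)^2, a Gaussian kernel, the cvxpy package, and, crucially, the simplification of enforcing the semi-infinite majorization constraint only at the observed sample points.

import numpy as np
import cvxpy as cp

# Toy data: samples of the uncertain parameter xi.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=20)

# Gaussian kernel matrix (the bandwidth is a modeling choice).
def gauss_kernel(a, b, sigma=1.0):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma**2))

K = gauss_kernel(x, x)
# Cholesky factor so that ||f||_H = ||L.T @ alpha||_2 for f = sum_i alpha_i k(x_i, .).
L = np.linalg.cholesky(K + 1e-8 * np.eye(len(x)))

theta = cp.Variable()          # decision variable
f0 = cp.Variable()             # constant offset f_0
alpha = cp.Variable(len(x))    # coefficients of the RKHS function f

eps = 0.1                      # MMD ambiguity radius (illustrative value)
f_at_x = K @ alpha             # f evaluated at the sample points

# Dual objective: f_0 + E_{P_hat}[f] + eps * ||f||_H (support function of the MMD ball).
objective = cp.Minimize(f0 + cp.sum(f_at_x) / len(x) + eps * cp.norm(L.T @ alpha))

# Majorization constraint l(theta, xi) <= f_0 + f(xi), enforced here only at the
# samples -- a simplification; the paper treats the constraint over all xi.
constraints = [cp.square(theta - x[i]) <= f0 + f_at_x[i] for i in range(len(x))]

cp.Problem(objective, constraints).solve()
print("robust decision theta:", theta.value)

The kernel bandwidth, the radius eps, and the sample-point constraint enforcement are all illustrative assumptions; the paper's algorithms, including the stochastic functional gradient variant, handle these choices more carefully.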

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-zhu21a,
  title     = {Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation},
  author    = {Zhu, Jia-Jie and Jitkrittum, Wittawat and Diehl, Moritz and Sch{\"o}lkopf, Bernhard},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {280--288},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/zhu21a/zhu21a.pdf},
  url       = {https://proceedings.mlr.press/v130/zhu21a.html}
}
Endnote
%0 Conference Paper
%T Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation
%A Jia-Jie Zhu
%A Wittawat Jitkrittum
%A Moritz Diehl
%A Bernhard Schölkopf
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-zhu21a
%I PMLR
%P 280--288
%U https://proceedings.mlr.press/v130/zhu21a.html
%V 130
APA
Zhu, J., Jitkrittum, W., Diehl, M., & Schölkopf, B. (2021). Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:280-288. Available from https://proceedings.mlr.press/v130/zhu21a.html.