Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?

Hongchang Gao; Heng Huang

Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?

Hongchang Gao, Heng Huang

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3377-3386, 2020.

Abstract

Frank-Wolfe algorithm is an efficient method for optimizing non-convex constrained problems. However, most of existing methods focus on the first-order case. In real-world applications, the gradient is not always available. To address the problem of lacking gradient in many applications, we propose two new stochastic zeroth-order Frank-Wolfe algorithms and theoretically proved that they have a faster convergence rate than existing methods for non-convex problems. Specifically, the function queries oracle of the proposed faster zeroth-order Frank-Wolfe (FZFW) method is $O(\frac{n^{1/2}d}{\epsilon^2})$ which can match the iteration complexity of the first-order counterpart approximately. As for the proposed faster zeroth-order conditional gradient sliding (FZCGS) method, its function queries oracle is improved to $O(\frac{n^{1/2}d}{\epsilon})$, indicating that its iteration complexity is even better than that of its first-order counterpart NCGS-VR. In other words, the iteration complelxity of the accelerated first-order Frank-Wolfe method NCGS-VR is suboptimal. Then, we proposed a new algorithm to improve its IFO (incremental first-order oracle) to $O(\frac{n^{1/2}}{\epsilon})$. At last, the empirical studies on benchmark datasets validate our theoretical results.

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-gao20b,
  title = 	 {Can Stochastic Zeroth-Order Frank-{W}olfe Method Converge Faster for Non-Convex Problems?},
  author =       {Gao, Hongchang and Huang, Heng},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {3377--3386},
  year = 	 {2020},
  editor = 	 {III, Hal Daumé and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/gao20b/gao20b.pdf},
  url = 	 {https://proceedings.mlr.press/v119/gao20b.html},
  abstract = 	 {Frank-Wolfe algorithm is an efficient method for optimizing non-convex constrained problems. However, most of existing methods focus on the first-order case. In real-world applications, the gradient is not always available. To address the problem of lacking gradient in many applications, we propose two new stochastic zeroth-order Frank-Wolfe algorithms and theoretically proved that they have a faster convergence rate than existing methods for non-convex problems. Specifically, the function queries oracle of the proposed faster zeroth-order Frank-Wolfe (FZFW) method is $O(\frac{n^{1/2}d}{\epsilon^2})$ which can match the iteration complexity of the first-order counterpart approximately. As for the proposed faster zeroth-order conditional gradient sliding (FZCGS) method, its function queries oracle is improved to $O(\frac{n^{1/2}d}{\epsilon})$, indicating that its iteration complexity is even better than that of its first-order counterpart NCGS-VR. In other words, the iteration complelxity of the accelerated first-order Frank-Wolfe method NCGS-VR is suboptimal. Then, we proposed a new algorithm to improve its IFO (incremental first-order oracle) to $O(\frac{n^{1/2}}{\epsilon})$. At last, the empirical studies on benchmark datasets validate our theoretical results.}
}

Endnote

%0 Conference Paper
%T Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?
%A Hongchang Gao
%A Heng Huang
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh	
%F pmlr-v119-gao20b
%I PMLR
%P 3377--3386
%U https://proceedings.mlr.press/v119/gao20b.html
%V 119
%X Frank-Wolfe algorithm is an efficient method for optimizing non-convex constrained problems. However, most of existing methods focus on the first-order case. In real-world applications, the gradient is not always available. To address the problem of lacking gradient in many applications, we propose two new stochastic zeroth-order Frank-Wolfe algorithms and theoretically proved that they have a faster convergence rate than existing methods for non-convex problems. Specifically, the function queries oracle of the proposed faster zeroth-order Frank-Wolfe (FZFW) method is $O(\frac{n^{1/2}d}{\epsilon^2})$ which can match the iteration complexity of the first-order counterpart approximately. As for the proposed faster zeroth-order conditional gradient sliding (FZCGS) method, its function queries oracle is improved to $O(\frac{n^{1/2}d}{\epsilon})$, indicating that its iteration complexity is even better than that of its first-order counterpart NCGS-VR. In other words, the iteration complelxity of the accelerated first-order Frank-Wolfe method NCGS-VR is suboptimal. Then, we proposed a new algorithm to improve its IFO (incremental first-order oracle) to $O(\frac{n^{1/2}}{\epsilon})$. At last, the empirical studies on benchmark datasets validate our theoretical results.

APA

Gao, H. & Huang, H.. (2020). Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3377-3386 Available from https://proceedings.mlr.press/v119/gao20b.html.

Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?

Abstract

Cite this Paper

Related Material