Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free

Mingrui Zhang, Lin Chen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3696-3706, 2020.

Abstract

How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we propose Quantised Frank-Wolfe (QFW), the first projection free and communication-efficient algorithm for solving constrained optimization problems at scale. We consider both convex and non-convex objective functions, expressed as a finite-sum or more generally a stochastic optimization problem, and provide strong theoretical guarantees on the convergence rate of QFW. This is accomplished by proposing novel quantization schemes that efficiently compress gradients while controlling the noise variance intduced during this process. Finally, we empirically validate the efficiency of QFW in terms of communication and the quality of returned solution against natural baselines.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-zhang20g, title = {Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free}, author = {Zhang, Mingrui and Chen, Lin and Mokhtari, Aryan and Hassani, Hamed and Karbasi, Amin}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {3696--3706}, year = {2020}, editor = {Chiappa, Silvia and Calandra, Roberto}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/zhang20g/zhang20g.pdf}, url = {https://proceedings.mlr.press/v108/zhang20g.html}, abstract = {How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we propose Quantised Frank-Wolfe (QFW), the first projection free and communication-efficient algorithm for solving constrained optimization problems at scale. We consider both convex and non-convex objective functions, expressed as a finite-sum or more generally a stochastic optimization problem, and provide strong theoretical guarantees on the convergence rate of QFW. This is accomplished by proposing novel quantization schemes that efficiently compress gradients while controlling the noise variance intduced during this process. Finally, we empirically validate the efficiency of QFW in terms of communication and the quality of returned solution against natural baselines.} }
Endnote
%0 Conference Paper %T Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free %A Mingrui Zhang %A Lin Chen %A Aryan Mokhtari %A Hamed Hassani %A Amin Karbasi %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-zhang20g %I PMLR %P 3696--3706 %U https://proceedings.mlr.press/v108/zhang20g.html %V 108 %X How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we propose Quantised Frank-Wolfe (QFW), the first projection free and communication-efficient algorithm for solving constrained optimization problems at scale. We consider both convex and non-convex objective functions, expressed as a finite-sum or more generally a stochastic optimization problem, and provide strong theoretical guarantees on the convergence rate of QFW. This is accomplished by proposing novel quantization schemes that efficiently compress gradients while controlling the noise variance intduced during this process. Finally, we empirically validate the efficiency of QFW in terms of communication and the quality of returned solution against natural baselines.
APA
Zhang, M., Chen, L., Mokhtari, A., Hassani, H. & Karbasi, A.. (2020). Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3696-3706 Available from https://proceedings.mlr.press/v108/zhang20g.html.

Related Material