Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time

Sungyoon Kim, Mert Pilanci
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:24458-24485, 2024.

Abstract

In this paper, we study the optimality gap between two-layer ReLU networks regularized with weight decay and their convex relaxations. We show that when the training data is random, the relative optimality gap between the original problem and its relaxation can be bounded by a factor of O(√log n), where n is the number of training samples. A simple application leads to a tractable polynomial-time algorithm that is guaranteed to solve the original non-convex problem up to a logarithmic factor. Moreover, under mild assumptions, we show that local gradient methods converge to a point with low training loss with high probability. Our result is an exponential improvement compared to existing results and sheds new light on understanding why local gradient methods work well.
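To make the setting concrete, the following sketch is not reproduced from this page; it follows the standard two-layer formulation used in the convex-reformulation literature, with illustrative symbols: data matrix X ∈ R^{n×d}, labels y, m hidden neurons, weight-decay parameter β, and diagonal 0/1 activation-pattern matrices D_i. Under a squared-loss assumption, the non-convex problem and the convex program it is relaxed to look roughly as follows:

% Non-convex two-layer ReLU training with weight decay (illustrative; squared loss assumed)
\[
p^{\star} \;=\; \min_{\{u_j,\,\alpha_j\}_{j=1}^{m}}
  \frac{1}{2}\Big\|\sum_{j=1}^{m}(X u_j)_+\,\alpha_j - y\Big\|_2^2
  \;+\; \frac{\beta}{2}\sum_{j=1}^{m}\big(\|u_j\|_2^2 + \alpha_j^2\big),
\]
% Convex counterpart: the ReLU is replaced by fixed activation patterns D_i; relaxing
% (e.g., dropping) the associated cone constraints yields a tractable convex program
\[
d^{\star} \;=\; \min_{\{v_i,\,w_i\}}
  \frac{1}{2}\Big\|\sum_{i} D_i X\,(v_i - w_i) - y\Big\|_2^2
  \;+\; \beta\sum_{i}\big(\|v_i\|_2 + \|w_i\|_2\big).
\]

In this notation, the abstract's claim reads: for random training data, the relative gap between p* and the optimal value d* of the relaxation is O(√log n) with high probability, so solving the convex relaxation in polynomial time recovers the non-convex optimum up to a logarithmic factor.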

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kim24ac,
  title     = {Convex Relaxations of {R}e{LU} Neural Networks Approximate Global Optima in Polynomial Time},
  author    = {Kim, Sungyoon and Pilanci, Mert},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {24458--24485},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kim24ac/kim24ac.pdf},
  url       = {https://proceedings.mlr.press/v235/kim24ac.html},
  abstract  = {In this paper, we study the optimality gap between two-layer ReLU networks regularized with weight decay and their convex relaxations. We show that when the training data is random, the relative optimality gap between the original problem and its relaxation can be bounded by a factor of O(√log n), where n is the number of training samples. A simple application leads to a tractable polynomial-time algorithm that is guaranteed to solve the original non-convex problem up to a logarithmic factor. Moreover, under mild assumptions, we show that local gradient methods converge to a point with low training loss with high probability. Our result is an exponential improvement compared to existing results and sheds new light on understanding why local gradient methods work well.}
}
Endnote
%0 Conference Paper
%T Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
%A Sungyoon Kim
%A Mert Pilanci
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-kim24ac
%I PMLR
%P 24458--24485
%U https://proceedings.mlr.press/v235/kim24ac.html
%V 235
%X In this paper, we study the optimality gap between two-layer ReLU networks regularized with weight decay and their convex relaxations. We show that when the training data is random, the relative optimality gap between the original problem and its relaxation can be bounded by a factor of O(√log n), where n is the number of training samples. A simple application leads to a tractable polynomial-time algorithm that is guaranteed to solve the original non-convex problem up to a logarithmic factor. Moreover, under mild assumptions, we show that local gradient methods converge to a point with low training loss with high probability. Our result is an exponential improvement compared to existing results and sheds new light on understanding why local gradient methods work well.
APA
Kim, S. & Pilanci, M. (2024). Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:24458-24485. Available from https://proceedings.mlr.press/v235/kim24ac.html.
