Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data

Shuai Zheng, James Tin-Yau Kwok
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5932-5940, 2018.

Abstract

Variance reduction has been commonly used in stochastic optimization. It relies crucially on the assumption that the data set is finite. However, when the data are imputed with random noise as in data augmentation, the perturbed data set becomes essentially infinite. Recently, the stochastic MISO (S-MISO) algorithm is introduced to address this expected risk minimization problem. Though it converges faster than SGD, a significant amount of memory is required. In this paper, we propose two SGD-like algorithms for expected risk minimization with random perturbation, namely, stochastic sample average gradient (SSAG) and stochastic SAGA (S-SAGA). The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as those of variance reduction methods on unperturbed data. Theoretical analysis and experimental results on logistic regression and AUC maximization show that SSAG has faster convergence rate than SGD with comparable space requirement while S-SAGA outperforms S-MISO in terms of both iteration complexity and storage.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-zheng18a, title = {Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data}, author = {Zheng, Shuai and Kwok, James Tin-Yau}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {5932--5940}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/zheng18a/zheng18a.pdf}, url = {http://proceedings.mlr.press/v80/zheng18a.html}, abstract = {Variance reduction has been commonly used in stochastic optimization. It relies crucially on the assumption that the data set is finite. However, when the data are imputed with random noise as in data augmentation, the perturbed data set becomes essentially infinite. Recently, the stochastic MISO (S-MISO) algorithm is introduced to address this expected risk minimization problem. Though it converges faster than SGD, a significant amount of memory is required. In this paper, we propose two SGD-like algorithms for expected risk minimization with random perturbation, namely, stochastic sample average gradient (SSAG) and stochastic SAGA (S-SAGA). The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as those of variance reduction methods on unperturbed data. Theoretical analysis and experimental results on logistic regression and AUC maximization show that SSAG has faster convergence rate than SGD with comparable space requirement while S-SAGA outperforms S-MISO in terms of both iteration complexity and storage.} }
Endnote
%0 Conference Paper %T Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data %A Shuai Zheng %A James Tin-Yau Kwok %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-zheng18a %I PMLR %P 5932--5940 %U http://proceedings.mlr.press/v80/zheng18a.html %V 80 %X Variance reduction has been commonly used in stochastic optimization. It relies crucially on the assumption that the data set is finite. However, when the data are imputed with random noise as in data augmentation, the perturbed data set becomes essentially infinite. Recently, the stochastic MISO (S-MISO) algorithm is introduced to address this expected risk minimization problem. Though it converges faster than SGD, a significant amount of memory is required. In this paper, we propose two SGD-like algorithms for expected risk minimization with random perturbation, namely, stochastic sample average gradient (SSAG) and stochastic SAGA (S-SAGA). The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as those of variance reduction methods on unperturbed data. Theoretical analysis and experimental results on logistic regression and AUC maximization show that SSAG has faster convergence rate than SGD with comparable space requirement while S-SAGA outperforms S-MISO in terms of both iteration complexity and storage.
APA
Zheng, S. & Kwok, J.T.. (2018). Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5932-5940 Available from http://proceedings.mlr.press/v80/zheng18a.html.

Related Material