Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes

Nhuong Nguyen; Toan Nguyen; PHUONG HA NGUYEN; Quoc Tran-Dinh; Lam Nguyen; Marten van Dijk

Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes

Nhuong Nguyen, Toan Nguyen, PHUONG HA NGUYEN, Quoc Tran-Dinh, Lam Nguyen, Marten van Dijk

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1207-1215, 2021.

Abstract

Hogwild! implements asynchronous Stochastic Gradient Descent (SGD) where multiple threads in parallel access a common repository containing training data, perform SGD iterations and update shared state that represents a jointly learned (global) model. We consider big data analysis where training data is distributed among local data sets in a heterogeneous way – and we wish to move SGD computations to local compute nodes where local data resides. The results of these local SGD computations are aggregated by a central “aggregator” which mimics Hogwild!. We show how local compute nodes can start choosing small mini-batch sizes which increase to larger ones in order to reduce communication cost (round interaction with the aggregator). We improve state-of-the-art literature and show O(K^{0.5}) communication rounds for heterogeneous data for strongly convex problems, where K is the total number of gradient computations across all local compute nodes. For our scheme, we prove a tight and novel non-trivial convergence analysis for strongly convex problems for heterogeneous data which does not use the bounded gradient assumption as seen in many existing publications. The tightness is a consequence of our proofs for lower and upper bounds of the convergence rate, which show a constant factor difference. We show experimental results for plain convex and non-convex problems for biased (i.e., heterogeneous) and unbiased local data sets.

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-nguyen21a,
  title = 	 { Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes },
  author =       {Nguyen, Nhuong and Nguyen, Toan and HA NGUYEN, PHUONG and Tran-Dinh, Quoc and Nguyen, Lam and van Dijk, Marten},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1207--1215},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/nguyen21a/nguyen21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/nguyen21a.html},
  abstract = 	 { Hogwild! implements asynchronous Stochastic Gradient Descent (SGD) where multiple threads in parallel access a common repository containing training data, perform SGD iterations and update shared state that represents a jointly learned (global) model. We consider big data analysis where training data is distributed among local data sets in a heterogeneous way – and we wish to move SGD computations to local compute nodes where local data resides. The results of these local SGD computations are aggregated by a central “aggregator” which mimics Hogwild!. We show how local compute nodes can start choosing small mini-batch sizes which increase to larger ones in order to reduce communication cost (round interaction with the aggregator). We improve state-of-the-art literature and show O(K^{0.5}) communication rounds for heterogeneous data for strongly convex problems, where K is the total number of gradient computations across all local compute nodes. For our scheme, we prove a tight and novel non-trivial convergence analysis for strongly convex problems for heterogeneous data which does not use the bounded gradient assumption as seen in many existing publications. The tightness is a consequence of our proofs for lower and upper bounds of the convergence rate, which show a constant factor difference. We show experimental results for plain convex and non-convex problems for biased (i.e., heterogeneous) and unbiased local data sets. }
}

Endnote

%0 Conference Paper
%T  Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes 
%A Nhuong Nguyen
%A Toan Nguyen
%A PHUONG HA NGUYEN
%A Quoc Tran-Dinh
%A Lam Nguyen
%A Marten van Dijk
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu	
%F pmlr-v130-nguyen21a
%I PMLR
%P 1207--1215
%U https://proceedings.mlr.press/v130/nguyen21a.html
%V 130
%X  Hogwild! implements asynchronous Stochastic Gradient Descent (SGD) where multiple threads in parallel access a common repository containing training data, perform SGD iterations and update shared state that represents a jointly learned (global) model. We consider big data analysis where training data is distributed among local data sets in a heterogeneous way – and we wish to move SGD computations to local compute nodes where local data resides. The results of these local SGD computations are aggregated by a central “aggregator” which mimics Hogwild!. We show how local compute nodes can start choosing small mini-batch sizes which increase to larger ones in order to reduce communication cost (round interaction with the aggregator). We improve state-of-the-art literature and show O(K^{0.5}) communication rounds for heterogeneous data for strongly convex problems, where K is the total number of gradient computations across all local compute nodes. For our scheme, we prove a tight and novel non-trivial convergence analysis for strongly convex problems for heterogeneous data which does not use the bounded gradient assumption as seen in many existing publications. The tightness is a consequence of our proofs for lower and upper bounds of the convergence rate, which show a constant factor difference. We show experimental results for plain convex and non-convex problems for biased (i.e., heterogeneous) and unbiased local data sets.

APA


Nguyen, N., Nguyen, T., HA NGUYEN, P., Tran-Dinh, Q., Nguyen, L. & van Dijk, M.. (2021).  Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes . Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1207-1215 Available from https://proceedings.mlr.press/v130/nguyen21a.html.

Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes

Abstract

Cite this Paper

Related Material