A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization

Fanglin Li, Bin Wu, Liutong Xu, Chuan Shi, Jing Shi
Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, PMLR 36:77-87, 2014.

Abstract

The accuracy and effectiveness of the matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, the SGD algorithm cannot be used directly in a Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm for matrix factorization, named FDSGD, which runs efficiently in a DCE. The algorithm solves the data-sharing problem with an independent storage system, avoiding data synchronization that could significantly degrade performance, and solves the synchronization problem in a DCE with a distributed coordination tool, so that cooperating distributed threads can run in harmony.

Cite this Paper

BibTeX
@InProceedings{pmlr-v36-li14,
  title     = {A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization},
  author    = {Fanglin Li and Bin Wu and Liutong Xu and Chuan Shi and Jing Shi},
  booktitle = {Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications},
  pages     = {77--87},
  year      = {2014},
  editor    = {Wei Fan and Albert Bifet and Qiang Yang and Philip S. Yu},
  volume    = {36},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {24 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v36/li14.pdf},
  url       = {http://proceedings.mlr.press/v36/li14.html},
  abstract  = {The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.}
}
Endnote
%0 Conference Paper
%T A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization
%A Fanglin Li
%A Bin Wu
%A Liutong Xu
%A Chuan Shi
%A Jing Shi
%B Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
%C Proceedings of Machine Learning Research
%D 2014
%E Wei Fan
%E Albert Bifet
%E Qiang Yang
%E Philip S. Yu
%F pmlr-v36-li14
%I PMLR
%J Proceedings of Machine Learning Research
%P 77--87
%U http://proceedings.mlr.press
%V 36
%W PMLR
%X The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.
RIS
TY - CPAPER
TI - A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization
AU - Fanglin Li
AU - Bin Wu
AU - Liutong Xu
AU - Chuan Shi
AU - Jing Shi
BT - Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
PY - 2014/08/13
DA - 2014/08/13
ED - Wei Fan
ED - Albert Bifet
ED - Qiang Yang
ED - Philip S. Yu
ID - pmlr-v36-li14
PB - PMLR
SP - 77
DP - PMLR
EP - 87
L1 - http://proceedings.mlr.press/v36/li14.pdf
UR - http://proceedings.mlr.press/v36/li14.html
AB - The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.
ER -
APA
Li, F., Wu, B., Xu, L., Shi, C., & Shi, J. (2014). A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization. Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, in PMLR 36:77-87.
