A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization

Fanglin Li, Bin Wu, Liutong Xu, Chuan Shi, Jing Shi
Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, PMLR 36:77-87, 2014.

Abstract

The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v36-li14, title = {A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization}, author = {Li, Fanglin and Wu, Bin and Xu, Liutong and Shi, Chuan and Shi, Jing}, booktitle = {Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications}, pages = {77--87}, year = {2014}, editor = {Fan, Wei and Bifet, Albert and Yang, Qiang and Yu, Philip S.}, volume = {36}, series = {Proceedings of Machine Learning Research}, address = {New York, New York, USA}, month = {24 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v36/li14.pdf}, url = {https://proceedings.mlr.press/v36/li14.html}, abstract = {The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.} }
Endnote
%0 Conference Paper %T A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization %A Fanglin Li %A Bin Wu %A Liutong Xu %A Chuan Shi %A Jing Shi %B Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications %C Proceedings of Machine Learning Research %D 2014 %E Wei Fan %E Albert Bifet %E Qiang Yang %E Philip S. Yu %F pmlr-v36-li14 %I PMLR %P 77--87 %U https://proceedings.mlr.press/v36/li14.html %V 36 %X The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment.
RIS
TY - CPAPER TI - A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization AU - Fanglin Li AU - Bin Wu AU - Liutong Xu AU - Chuan Shi AU - Jing Shi BT - Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications DA - 2014/08/13 ED - Wei Fan ED - Albert Bifet ED - Qiang Yang ED - Philip S. Yu ID - pmlr-v36-li14 PB - PMLR DP - Proceedings of Machine Learning Research VL - 36 SP - 77 EP - 87 L1 - http://proceedings.mlr.press/v36/li14.pdf UR - https://proceedings.mlr.press/v36/li14.html AB - The accuracy and effectiveness of matrix factorization technique were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, SGD algorithm cannot directly be used in the Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in DCE. This algorithm solves data sharing problem based on independent storage system to avoid data synchronization which may cause a big influence to algorithm performance, and synchronous operation problem in DCE using a distributed synchronization tool so that distributed cooperation threads can perform in a harmonious environment. ER -
APA
Li, F., Wu, B., Xu, L., Shi, C. & Shi, J.. (2014). A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization. Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, in Proceedings of Machine Learning Research 36:77-87 Available from https://proceedings.mlr.press/v36/li14.html.

Related Material