A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization
Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, PMLR 36:77-87, 2014.
The accuracy and effectiveness of matrix factorization techniques were well demonstrated in the Netflix movie recommendation contest. Among the numerous solutions for matrix factorization, Stochastic Gradient Descent (SGD) is one of the most widely used algorithms. However, as a sequential approach, the SGD algorithm cannot be used directly in a Distributed Cluster Environment (DCE). In this paper, we propose a fast distributed SGD algorithm named FDSGD for matrix factorization, which can run efficiently in a DCE. The algorithm solves the data sharing problem by relying on an independent storage system, which avoids data synchronization that could significantly degrade the algorithm's performance. It solves the synchronous operation problem in a DCE by using a distributed synchronization tool, so that the cooperating distributed threads can work in harmony.
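To make the sequential nature of SGD concrete, the following is a minimal sketch of plain, single-machine SGD for matrix factorization. It is not FDSGD and is not taken from the paper. The function names, the hyperparameters (rank, lr, reg, epochs) and the (user, item, rating) triple format are assumptions made for illustration. Each update reads and writes the latest factor rows, which is the dependency that prevents a naive split of the loop across cluster nodes.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user row P[u] and item row Q[i]."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed (user, item, rating) triples."""
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (sq / len(ratings)) ** 0.5


def sgd_matrix_factorization(ratings, n_users, n_items, rank=2,
                             lr=0.01, reg=0.02, epochs=200, seed=0):
    """Sequential SGD for R ~= P Q^T (illustrative baseline, not FDSGD).

    All parameter names and default values are assumptions for this sketch.
    """
    rng = random.Random(seed)
    # Small random initial factors.
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        order = list(ratings)
        rng.shuffle(order)  # visit the observed ratings in random order
        for u, i, r in order:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Each step uses the factors left by the previous step,
                # so the updates are inherently ordered.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


if __name__ == "__main__":
    # Toy data: (user index, item index, rating).
    data = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
            (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
    P, Q = sgd_matrix_factorization(data, n_users=3, n_items=3)
    print("training RMSE:", rmse(data, P, Q))
```

Two updates conflict only when they touch the same user row or the same item row, so any distributed variant has to decide how those rows are shared and when workers synchronize.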