Improved Communication Cost in Distributed PageRank Computation – A Theoretical Study
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:6459-6467, 2020.
Abstract
PageRank is a widely used approach for measuring the importance of a node in a graph. Due to the rapid growth of graph sizes in the real world, the importance of computing PageRanks in a distributed environment has been increasingly recognized. However, only a few previous works provide provable guarantees on both complexity and accuracy for distributed PageRank computation. Given a constant d ≥ 1 and a graph of n nodes, the state-of-the-art approach, Radar-Push, uses O(log log n + log d) communication rounds to approximate the PageRanks within a relative error of Θ(1/log^d n) under a generalized congested clique distributed computation model. However, Radar-Push entails as much as O(log^{2d+3} n) bits of bandwidth (i.e., the communication cost between a pair of nodes per round). In this paper, we provide a new algorithm that uses asymptotically the same communication round complexity while using only O(d log^3 n) bits of bandwidth.
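As background for readers less familiar with PageRank, the minimal sketch below shows the basic centralized power iteration that distributed approximation schemes such as Radar-Push target; it is not the paper's algorithm. The toy graph, damping parameter alpha, and iteration count are illustrative assumptions only.

```python
# Minimal single-machine PageRank power iteration (background sketch only;
# NOT Radar-Push or the distributed algorithm proposed in the paper).
# `graph` maps each node to its list of out-neighbors; `alpha` is the
# teleport (damping) parameter -- both are illustrative assumptions.

def pagerank(graph, alpha=0.15, num_iters=50):
    nodes = list(graph)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}            # start from the uniform distribution
    for _ in range(num_iters):
        nxt = {v: alpha / n for v in nodes}     # teleport mass added to every node
        for u in nodes:
            out = graph[u]
            if out:
                # spread (1 - alpha) of u's current mass evenly over its out-neighbors
                share = (1 - alpha) * pr[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # dangling node: redistribute its mass uniformly over all nodes
                for v in nodes:
                    nxt[v] += (1 - alpha) * pr[u] / n
        pr = nxt
    return pr

if __name__ == "__main__":
    toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    print(pagerank(toy))
```

In a distributed setting the same mass-spreading step is what must be communicated between machines each round, which is why the round complexity and the per-round bandwidth studied in the paper are the natural cost measures.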