Catalyst Acceleration of Error Compensated Methods Leads to Better Communication Complexity
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:615-649, 2023.
Communication overhead is well known to be a key bottleneck in large scale distributed learning, and a particularly successful class of methods which help to overcome this bottleneck is based on the idea of communication compression. Some of the most practically effective gradient compressors, such as TopK, are biased, which causes convergence issues unless one employs a well designed error compensation/feedback mechanism. Error compensation is therefore a fundamental technique in the distributed learning literature. In a recent development, Qian et al (NeurIPS 2021) showed that the error-compensation mechanism can be combined with acceleration/momentum, which is another key and highly successful optimization technique. In particular, they developed the error-compensated loop-less Katyusha (ECLK) method, and proved an accelerated linear rate in the strongly convex case. However, the dependence of their rate on the compressor parameter does not match the best dependence obtainable in the non-accelerated error-compensated methods. Our work addresses this problem. We propose several new accelerated error-compensated methods using the catalyst acceleration technique, and obtain results that match the best dependence on the compressor parameter in non-accelerated error-compensated methods up to logarithmic terms.