Differentiable Product Quantization for End-to-End Embedding Compression

Ting Chen; Lala Li; Yizhou Sun

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1617-1626, 2020.

Abstract

Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings. Despite their effectiveness, the number of parameters in an embedding layer increases linearly with the number of symbols and poses a critical challenge on memory and storage constraints. In this work, we propose a generic and end-to-end learnable compression framework termed differentiable product quantization (DPQ). We present two instantiations of DPQ that leverage different approximation techniques to enable differentiability in end-to-end learning. Our method can readily serve as a drop-in alternative for any existing embedding layer. Empirically, DPQ offers significant compression ratios (14-238X) at negligible or no performance cost on 10 datasets across three different language tasks.
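The storage argument in the abstract can be made concrete with a little arithmetic. The sketch below assumes a generic product-quantization layout (each symbol stores one small integer code per group, and all symbols share per-group codebooks of float sub-vectors); it is not taken from the paper, and the function names, the 32-bit float size, and the example sizes are all illustrative assumptions.

```python
def embedding_bits(num_symbols: int, dim: int, bits_per_float: int = 32) -> int:
    """Bits needed for a full embedding table: one dim-sized float vector per symbol."""
    return num_symbols * dim * bits_per_float


def pq_bits(num_symbols: int, dim: int, num_groups: int, num_centroids: int,
            bits_per_float: int = 32) -> int:
    """Bits needed under a generic product-quantization layout (an assumption here).

    Each symbol stores num_groups integer codes, and every group has a shared
    codebook of num_centroids sub-vectors of size dim // num_groups.
    """
    if dim % num_groups != 0:
        raise ValueError("dim must be divisible by num_groups")
    # Smallest number of bits that can index num_centroids distinct entries.
    bits_per_code = (num_centroids - 1).bit_length()
    code_bits = num_symbols * num_groups * bits_per_code
    codebook_bits = num_groups * num_centroids * (dim // num_groups) * bits_per_float
    return code_bits + codebook_bits


def compression_ratio(num_symbols: int, dim: int, num_groups: int,
                      num_centroids: int) -> float:
    """Ratio of full-table storage to product-quantized storage."""
    return embedding_bits(num_symbols, dim) / pq_bits(
        num_symbols, dim, num_groups, num_centroids)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    print(compression_ratio(num_symbols=10_000, dim=128,
                            num_groups=8, num_centroids=256))
```

The full table grows in proportion to the number of symbols, while in the quantized layout only the small integer codes do; the codebook cost is fixed, so the ratio improves as the vocabulary gets larger.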

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-chen20l,
  title = 	 {Differentiable Product Quantization for End-to-End Embedding Compression},
  author =       {Chen, Ting and Li, Lala and Sun, Yizhou},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {1617--1626},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/chen20l/chen20l.pdf},
  url = 	 {https://proceedings.mlr.press/v119/chen20l.html},
  abstract = 	 {Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings. Despite their effectiveness, the number of parameters in an embedding layer increases linearly with the number of symbols and poses a critical challenge on memory and storage constraints. In this work, we propose a generic and end-to-end learnable compression framework termed differentiable product quantization (DPQ). We present two instantiations of DPQ that leverage different approximation techniques to enable differentiability in end-to-end learning. Our method can readily serve as a drop-in alternative for any existing embedding layer. Empirically, DPQ offers significant compression ratios (14-238X) at negligible or no performance cost on 10 datasets across three different language tasks.}
}

Endnote

%0 Conference Paper
%T Differentiable Product Quantization for End-to-End Embedding Compression
%A Ting Chen
%A Lala Li
%A Yizhou Sun
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chen20l
%I PMLR
%P 1617--1626
%U https://proceedings.mlr.press/v119/chen20l.html
%V 119
%X Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings. Despite their effectiveness, the number of parameters in an embedding layer increases linearly with the number of symbols and poses a critical challenge on memory and storage constraints. In this work, we propose a generic and end-to-end learnable compression framework termed differentiable product quantization (DPQ). We present two instantiations of DPQ that leverage different approximation techniques to enable differentiability in end-to-end learning. Our method can readily serve as a drop-in alternative for any existing embedding layer. Empirically, DPQ offers significant compression ratios (14-238X) at negligible or no performance cost on 10 datasets across three different language tasks.

APA


Chen, T., Li, L. & Sun, Y. (2020). Differentiable Product Quantization for End-to-End Embedding Compression. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1617-1626. Available from https://proceedings.mlr.press/v119/chen20l.html.
