Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion

Anle Ke, Xu Zhang, Tong Chen, Ming Lu, Chao Zhou, Jiawen Gu, Zhan Ma
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:29626-29650, 2025.

Abstract

Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity and coding efficiency. To address these challenges, we propose a residual-guided ultra lowrate image compression framework named ResULIC, which incorporates residual signals into both semantic retrieval and the diffusion-based generation process. Specifically, we introduce Semantic Residual Coding (SRC) to capture the semantic disparity between the original image and its compressed latent representation. A perceptual fidelity optimizer is further applied for superior reconstruction quality. Additionally, we present the Compression-aware Diffusion Model (CDM), which establishes an optimal alignment between bitrates and diffusion time steps, improving compression-reconstruction synergy. Extensive experiments demonstrate the effectiveness of ResULIC, which achieves superior objective and subjective performance compared to state-of-the-art diffusion-based methods, with BD-rates of -80.7% and -66.3% in terms of LPIPS and FID, respectively.
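For readers unfamiliar with the BD-rate figures quoted in the abstract, the following is a minimal sketch of how a Bjøntegaard delta rate is commonly computed from two rate-quality curves. It is not taken from the paper: the function name `bd_rate`, the helper names, and all numbers are hypothetical, and the authors' exact evaluation procedure is not stated on this page. The sketch interpolates log-bitrate as a polynomial in the quality metric (with four points per curve this coincides with the classic cubic fit) and averages the gap over the shared quality range.

```python
import math

def _lagrange(xs, ys, x):
    """Evaluate the unique polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def _simpson(f, a, b, n=200):
    """Composite Simpson integration of f over [a, b]; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def bd_rate(anchor, test):
    """Average bitrate change (percent) of `test` relative to `anchor`.

    Each argument is a list of (bitrate, quality) pairs with distinct
    quality values. Negative results mean `test` needs fewer bits for the
    same quality. The quality axis can be any monotone metric (for example
    LPIPS or FID); only the overlapping quality range is compared.
    """
    qa = [q for _, q in anchor]
    la = [math.log(r) for r, _ in anchor]
    qt = [q for _, q in test]
    lt = [math.log(r) for r, _ in test]
    lo = max(min(qa), min(qt))
    hi = min(max(qa), max(qt))
    if hi <= lo:
        raise ValueError("rate-quality curves do not overlap in quality")
    # Mean difference of log-rate over the shared quality interval.
    gap = lambda q: _lagrange(qt, lt, q) - _lagrange(qa, la, q)
    avg_diff = _simpson(gap, lo, hi) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0

# Made-up illustration: (bits per pixel, LPIPS) points, NOT results from the paper.
anchor_curve = [(0.01, 0.40), (0.02, 0.30), (0.04, 0.22), (0.08, 0.15)]
# A hypothetical codec that reaches the same LPIPS at half the bitrate.
test_curve = [(r / 2, q) for r, q in anchor_curve]
print(round(bd_rate(anchor_curve, test_curve), 2))
```

Under this convention, a BD-rate of -80.7% would mean that, averaged over the shared quality range, the method needs about 80.7% fewer bits than the anchor to reach the same metric value.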

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-ke25c,
  title = {Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion},
  author = {Ke, Anle and Zhang, Xu and Chen, Tong and Lu, Ming and Zhou, Chao and Gu, Jiawen and Ma, Zhan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {29626--29650},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/ke25c/ke25c.pdf},
  url = {https://proceedings.mlr.press/v267/ke25c.html},
  abstract = {Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity and coding efficiency. To address these challenges, we propose a residual-guided ultra lowrate image compression named ResULIC, which incorporates residual signals into both semantic retrieval and the diffusion-based generation process. Specifically, we introduce Semantic Residual Coding (SRC) to capture the semantic disparity between the original image and its compressed latent representation. A perceptual fidelity optimizer is further applied for superior reconstruction quality. Additionally, we present the Compression-aware Diffusion Model (CDM), which establishes an optimal alignment between bitrates and diffusion time steps, improving compression-reconstruction synergy. Extensive experiments demonstrate the effectiveness of ResULIC, achieving superior objective and subjective performance compared to state-of-the-art diffusion-based methods with -80.7%, -66.3% BD-rate saving in terms of LPIPS and FID.}
}
Endnote
%0 Conference Paper
%T Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion
%A Anle Ke
%A Xu Zhang
%A Tong Chen
%A Ming Lu
%A Chao Zhou
%A Jiawen Gu
%A Zhan Ma
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-ke25c
%I PMLR
%P 29626--29650
%U https://proceedings.mlr.press/v267/ke25c.html
%V 267
%X Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity and coding efficiency. To address these challenges, we propose a residual-guided ultra lowrate image compression named ResULIC, which incorporates residual signals into both semantic retrieval and the diffusion-based generation process. Specifically, we introduce Semantic Residual Coding (SRC) to capture the semantic disparity between the original image and its compressed latent representation. A perceptual fidelity optimizer is further applied for superior reconstruction quality. Additionally, we present the Compression-aware Diffusion Model (CDM), which establishes an optimal alignment between bitrates and diffusion time steps, improving compression-reconstruction synergy. Extensive experiments demonstrate the effectiveness of ResULIC, achieving superior objective and subjective performance compared to state-of-the-art diffusion-based methods with -80.7%, -66.3% BD-rate saving in terms of LPIPS and FID.
APA
Ke, A., Zhang, X., Chen, T., Lu, M., Zhou, C., Gu, J., & Ma, Z. (2025). Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:29626-29650. Available from https://proceedings.mlr.press/v267/ke25c.html.
