Robust Blind Watermarking Framework for Hybrid Networks Combining CNN and Transformer

Baowei Wang, Ziwei Song, Yufeng Wu
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:1417-1432, 2024.

Abstract

As an essential means of copyright protection, deep learning-based robust watermarking is being studied extensively. Its framework consists of three main parts: the encoder, the noise layer, and the decoder. However, practically all existing schemes target the encoder rather than the decoder, and the whole network is built from shallow Convolutional Neural Networks (CNNs) for primary feature extraction; CNNs capture local information well but do not model non-local information in watermarked images effectively. To address this problem, we consider Transformer networks with a spatial self-attention mechanism. We propose a novel decoder network combining Transformers and CNNs, which not only enriches local feature information but also enhances the ability to explore global representations. Meanwhile, to embed secret messages more effectively, we design a multi-scale attentional feature fusion module that efficiently aggregates cover-image features and secret-message features, yielding encoded images with rich hybrid features. In addition, a perceptual loss is introduced to better evaluate the visual quality of the watermarked images. Extensive experiments show that our method achieves better imperceptibility and robustness than existing state-of-the-art (SOTA) methods.
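The pipeline the abstract describes (encoder → noise layer → decoder, with robustness measured by how well the decoder recovers the message after distortion) can be made concrete with a deliberately simple stand-in. The sketch below is not the paper's CNN/Transformer method; it substitutes a toy least-significant-bit embedding and a bit-flip noise layer purely to illustrate the three stages and the bit-accuracy metric.

```python
import numpy as np

def encode(cover, message):
    """Encoder stand-in: embed one message bit per pixel in the LSB."""
    stego = cover.flatten().astype(np.uint8)
    stego[:message.size] = (stego[:message.size] & 0xFE) | message
    return stego.reshape(cover.shape)

def noise_layer(stego, flip_prob, rng):
    """Noise-layer stand-in: randomly flip pixel LSBs to simulate distortion."""
    mask = (rng.random(stego.shape) < flip_prob).astype(np.uint8)
    return stego ^ mask  # XOR with 1 flips a pixel's LSB

def decode(image, n_bits):
    """Decoder stand-in: read the LSBs back out of the (possibly noisy) image."""
    return image.flatten()[:n_bits] & 1

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
message = rng.integers(0, 2, size=64, dtype=np.uint8)

stego = encode(cover, message)
clean = decode(stego, message.size)                       # no distortion
noisy = decode(noise_layer(stego, 0.1, rng), message.size)

bit_acc = lambda rec: (rec == message).mean()
print(bit_acc(clean))  # 1.0: the decoder recovers every bit without noise
print(bit_acc(noisy))  # bit accuracy after distortion (the robustness metric)
```

In the paper's setting the encoder and decoder are learned networks and the noise layer applies realistic image distortions (JPEG, cropping, blur); the structure and the evaluation metric are the same as in this toy version.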

Cite this Paper


BibTeX
@InProceedings{pmlr-v222-wang24a,
  title = {Robust Blind Watermarking Framework for Hybrid Networks Combining {CNN} and Transformer},
  author = {Wang, Baowei and Song, Ziwei and Wu, Yufeng},
  booktitle = {Proceedings of the 15th Asian Conference on Machine Learning},
  pages = {1417--1432},
  year = {2024},
  editor = {Yanıkoğlu, Berrin and Buntine, Wray},
  volume = {222},
  series = {Proceedings of Machine Learning Research},
  month = {11--14 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v222/wang24a/wang24a.pdf},
  url = {https://proceedings.mlr.press/v222/wang24a.html},
  abstract = {As an essential means of copyright protection, deep learning-based robust watermarking is being studied extensively. Its framework consists of three main parts: the encoder, the noise layer, and the decoder. However, practically all existing schemes target the encoder rather than the decoder, and the whole network is built from shallow Convolutional Neural Networks (CNNs) for primary feature extraction; CNNs capture local information well but do not model non-local information in watermarked images effectively. To address this problem, we consider Transformer networks with a spatial self-attention mechanism. We propose a novel decoder network combining Transformers and CNNs, which not only enriches local feature information but also enhances the ability to explore global representations. Meanwhile, to embed secret messages more effectively, we design a multi-scale attentional feature fusion module that efficiently aggregates cover-image features and secret-message features, yielding encoded images with rich hybrid features. In addition, a perceptual loss is introduced to better evaluate the visual quality of the watermarked images. Extensive experiments show that our method achieves better imperceptibility and robustness than existing state-of-the-art (SOTA) methods.}
}
Endnote
%0 Conference Paper
%T Robust Blind Watermarking Framework for Hybrid Networks Combining CNN and Transformer
%A Baowei Wang
%A Ziwei Song
%A Yufeng Wu
%B Proceedings of the 15th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Berrin Yanıkoğlu
%E Wray Buntine
%F pmlr-v222-wang24a
%I PMLR
%P 1417--1432
%U https://proceedings.mlr.press/v222/wang24a.html
%V 222
%X As an essential means of copyright protection, deep learning-based robust watermarking is being studied extensively. Its framework consists of three main parts: the encoder, the noise layer, and the decoder. However, practically all existing schemes target the encoder rather than the decoder, and the whole network is built from shallow Convolutional Neural Networks (CNNs) for primary feature extraction; CNNs capture local information well but do not model non-local information in watermarked images effectively. To address this problem, we consider Transformer networks with a spatial self-attention mechanism. We propose a novel decoder network combining Transformers and CNNs, which not only enriches local feature information but also enhances the ability to explore global representations. Meanwhile, to embed secret messages more effectively, we design a multi-scale attentional feature fusion module that efficiently aggregates cover-image features and secret-message features, yielding encoded images with rich hybrid features. In addition, a perceptual loss is introduced to better evaluate the visual quality of the watermarked images. Extensive experiments show that our method achieves better imperceptibility and robustness than existing state-of-the-art (SOTA) methods.
APA
Wang, B., Song, Z. & Wu, Y. (2024). Robust Blind Watermarking Framework for Hybrid Networks Combining CNN and Transformer. Proceedings of the 15th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 222:1417-1432. Available from https://proceedings.mlr.press/v222/wang24a.html.