Simple Probability Truncation Improves Soft Red List Watermarks

Henrique Da Silva Gameiro, Andrei Kucharavy
Proceedings of the Fourth Swiss AI Days, PMLR 309:82-99, 2026.

Abstract

Watermarking, whereby LLM outputs are steered to encode an easily identifiable digital signature, has recently gained attention as a potential solution for detecting synthetically generated text. However, watermarking schemes require tradeoffs between detectability (i.e., how easily the watermark can be identified by an algorithm) and quality of the generated text (i.e., the stylistic and semantic disruption to the normal generation of the LLM). In this work, we propose a simple extension to the Soft Red List watermark, Softer Red List, which enables higher detectability while maintaining text quality on par with non-watermarked text. Specifically, Softer Red List improves the classical red/green token algorithm by adding a probability truncation filter before boosting the probabilities of tokens in the green list. Despite its simplicity, Softer Red List matches or exceeds the performance of previously published LLM watermarking schemes, notably achieving a better detection rate at a low false positive rate (FPR) than SynthID in the disinformation detection setting, all while maintaining comparable perplexity and better reasoning capacities.
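The idea in the abstract, truncating the next-token distribution and only then boosting green-list tokens, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the truncation filter, so the minimum-probability cutoff `min_prob` used here is an assumption, as are the helper names, the SHA-256 seeding on the previous token, and the default values of `gamma` and `delta`. The z-score detector follows the usual one-proportion test used with red/green list watermarks.

```python
import hashlib
import math
import random


def green_list(prev_token, vocab_size, gamma):
    """Pick a fraction `gamma` of the vocabulary, seeded by the previous token.

    The SHA-256 based seeding is an illustrative choice, not taken from the paper.
    """
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))


def softmax(logits):
    """Numerically stable softmax; a logit of -inf maps to probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def watermarked_probs(logits, prev_token, gamma=0.5, delta=2.0, min_prob=0.05):
    """Truncate low-probability tokens, then add `delta` to surviving green logits.

    ASSUMPTION: the truncation is a simple minimum-probability cutoff.
    The paper's actual filter may differ.
    """
    probs = softmax(logits)
    green = green_list(prev_token, len(logits), gamma)
    # Always keep the most likely token so the distribution never becomes empty.
    best = max(range(len(logits)), key=lambda i: logits[i])
    adjusted = []
    for i, (logit, p) in enumerate(zip(logits, probs)):
        if p < min_prob and i != best:
            adjusted.append(-math.inf)  # truncated: can never be sampled
        elif i in green:
            adjusted.append(logit + delta)  # boost surviving green tokens
        else:
            adjusted.append(logit)
    return softmax(adjusted)


def z_score(tokens, vocab_size, gamma=0.5):
    """One-proportion z-test on the number of green tokens in a sequence."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(tok in green_list(prev, vocab_size, gamma) for prev, tok in pairs)
    n = len(pairs)
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

In this sketch, a token that falls below the cutoff receives zero probability even when it is on the green list, so the boost is only ever spent on tokens the model already considers plausible. That is one plausible reading of how truncation could protect text quality while the watermark is applied.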

Cite this Paper


BibTeX
@InProceedings{pmlr-v309-da-silva-gameiro26a,
  title = {Simple Probability Truncation Improves Soft Red List Watermarks},
  author = {Da Silva Gameiro, Henrique and Kucharavy, Andrei},
  booktitle = {Proceedings of the Fourth Swiss AI Days},
  pages = {82--99},
  year = {2026},
  editor = {Kucharavy, Andrei and Delgado, Pamela and Schürch Todeschini, Valérie and Rumley, Sébastien},
  volume = {309},
  series = {Proceedings of Machine Learning Research},
  month = {23--25 Mar},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v309/main/assets/da-silva-gameiro26a/da-silva-gameiro26a.pdf},
  url = {https://proceedings.mlr.press/v309/da-silva-gameiro26a.html},
  abstract = {Watermarking, whereby LLM outputs are steered to encode an easily identifiable digital signature, has recently gained attention as a potential solution for detecting synthetically generated text. However, watermarking schemes require tradeoffs between detectability (i.e., how easily the watermark can be identified by an algorithm) and quality of the generated text (i.e., the stylistic and semantic disruption to the normal generation of the LLM). In this work, we propose a simple extension to the Soft Red List watermark, Softer Red List, which enables higher detectability while maintaining text quality on par with non-watermarked text. Specifically, Softer Red List improves the classical red/green token algorithm by adding a probability truncation filter before boosting the probabilities of tokens in the green list. Despite its simplicity, Softer Red List matches or exceeds the performance of previously published LLM watermarking schemes, notably achieving a better detection rate at a low false positive rate (FPR) than SynthID in the disinformation detection setting, all while maintaining comparable perplexity and better reasoning capacities.}
}
Endnote
%0 Conference Paper
%T Simple Probability Truncation Improves Soft Red List Watermarks
%A Henrique Da Silva Gameiro
%A Andrei Kucharavy
%B Proceedings of the Fourth Swiss AI Days
%C Proceedings of Machine Learning Research
%D 2026
%E Andrei Kucharavy
%E Pamela Delgado
%E Valérie Schürch Todeschini
%E Sébastien Rumley
%F pmlr-v309-da-silva-gameiro26a
%I PMLR
%P 82--99
%U https://proceedings.mlr.press/v309/da-silva-gameiro26a.html
%V 309
%X Watermarking, whereby LLM outputs are steered to encode an easily identifiable digital signature, has recently gained attention as a potential solution for detecting synthetically generated text. However, watermarking schemes require tradeoffs between detectability (i.e., how easily the watermark can be identified by an algorithm) and quality of the generated text (i.e., the stylistic and semantic disruption to the normal generation of the LLM). In this work, we propose a simple extension to the Soft Red List watermark, Softer Red List, which enables higher detectability while maintaining text quality on par with non-watermarked text. Specifically, Softer Red List improves the classical red/green token algorithm by adding a probability truncation filter before boosting the probabilities of tokens in the green list. Despite its simplicity, Softer Red List matches or exceeds the performance of previously published LLM watermarking schemes, notably achieving a better detection rate at a low false positive rate (FPR) than SynthID in the disinformation detection setting, all while maintaining comparable perplexity and better reasoning capacities.
APA
Da Silva Gameiro, H. & Kucharavy, A. (2026). Simple Probability Truncation Improves Soft Red List Watermarks. Proceedings of the Fourth Swiss AI Days, in Proceedings of Machine Learning Research 309:82-99. Available from https://proceedings.mlr.press/v309/da-silva-gameiro26a.html.
