Simple Probability Truncation Improves Soft Red List Watermarks
Proceedings of the Fourth Swiss AI Days, PMLR 309:82-99, 2026.
Abstract
Watermarking, whereby LLM outputs are steered to encode an easily identifiable digital signature, has recently gained attention as a potential solution for detecting synthetically generated text. However, watermarking schemes require tradeoffs between detectability (i.e., how easily the watermark can be identified by an algorithm) and quality of the generated text (i.e., the stylistic and semantic disruption to the normal generation of the LLM). In this work, we propose a simple extension to the Soft Red List watermark, Softer Red List, which enables higher detectability while maintaining text quality on par with non-watermarked text. Specifically, Softer Red List improves the classical red/green token algorithm by adding a probability truncation filter before boosting the probabilities of tokens in the green list. Despite its simplicity, Softer Red List matches or exceeds the performance of previously published LLM watermarking schemes, notably achieving a better detection rate at a low false positive rate (FPR) than SynthID in the disinformation detection setting, all while maintaining comparable perplexity and better reasoning capacities.
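
To make the order of operations concrete, the following is a minimal sketch of "truncate first, then boost the green list". It is not the paper's implementation: the abstract does not specify the truncation rule, so a simple minimum-probability threshold is assumed here, and the names `truncate_then_boost`, `min_prob`, and `delta` are hypothetical, as is the choice to always keep the most likely token.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats (-inf maps to 0)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def truncate_then_boost(logits, green_ids, delta=2.0, min_prob=0.05):
    """Return a sampling distribution after truncation and green-list boosting.

    Assumptions (not taken from the paper):
      * truncation drops every token whose model probability is below
        `min_prob`, except the most likely token, which is always kept;
      * surviving green-list tokens get `delta` added to their logit,
        as in the soft red/green list scheme.
    """
    probs = softmax(logits)
    best = max(range(len(logits)), key=lambda i: logits[i])
    adjusted = []
    for i, (logit, p) in enumerate(zip(logits, probs)):
        if p < min_prob and i != best:
            adjusted.append(float("-inf"))  # filtered out before any boost
        elif i in green_ids:
            adjusted.append(logit + delta)  # green token that survived the filter
        else:
            adjusted.append(logit)          # red token that survived the filter
    return softmax(adjusted)


# Toy vocabulary of four tokens; tokens 1 and 2 are on the green list.
example_logits = [2.0, 1.0, -3.0, 0.5]
example_green = {1, 2}
example_dist = truncate_then_boost(example_logits, example_green)
```

In this toy case, token 2 is on the green list but has very low model probability, so the filter removes it before the boost can promote it, while token 1 (green and plausible) gains probability mass. That is one way a truncation step could limit the quality cost of boosting unlikely tokens, under the assumptions stated above.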