Paper 2025/256

Inaccessible Entropy for Watermarking Generative Agents

Daniel Alabi, Columbia University
Lav R. Varshney, University of Illinois Urbana-Champaign
Abstract

In this work, we construct distortion-free and unforgeable watermarks for language models and generative agents. The watermarked output cannot be forged by an adversary, nor can the adversary remove the watermark without significantly degrading the quality of the model output. That is, the watermarked output is distortion-free: the watermarking algorithm does not noticeably change the quality of the model output, and without the public detection key, no efficient adversary can distinguish watermarked output from non-watermarked output. The core of the watermarking schemes involves embedding a message and a publicly verifiable digital signature in the generated model output. The message and signature can be extracted during the detection phase and verified by any authorized entity that has the public key. We show that, under the standard cryptographic assumption that one-way functions exist, we can construct distortion-free and unforgeable watermark schemes. Our framework relies on analyzing the inaccessible entropy of the watermarking schemes, based on computational entropy notions derived from the existence of one-way functions.
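The sign-then-verify payload described in the abstract can be illustrated with a minimal sketch. This is not the paper's construction: it uses a Lamport one-time signature over SHA-256 as a stand-in for a signature scheme built from one-way functions, and it omits the step that embeds the payload into model output. All function names (`keygen`, `sign`, `verify`, `make_payload`, `detect`) are hypothetical and chosen for illustration only.

```python
import hashlib
import secrets

HASH_BITS = 256  # SHA-256 digest length in bits


def _h(data: bytes) -> bytes:
    """SHA-256, used here as a heuristic stand-in for a one-way function."""
    return hashlib.sha256(data).digest()


def _digest_bits(message: bytes) -> list:
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(HASH_BITS)]


def keygen():
    """Lamport key pair: the secret key holds 256 pairs of random preimages,
    the public key holds their hashes. A key must sign only one message."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32))
          for _ in range(HASH_BITS)]
    pk = [(_h(a), _h(b)) for a, b in sk]
    return sk, pk


def sign(sk, message: bytes) -> list:
    """Reveal one preimage per digest bit of the message."""
    return [sk[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    """Check that each revealed preimage hashes to the matching public value."""
    if len(signature) != HASH_BITS:
        return False
    bits = _digest_bits(message)
    return all(_h(sig) == pk[i][bit]
               for i, (sig, bit) in enumerate(zip(signature, bits)))


def make_payload(sk, message: bytes):
    """Assumption: the watermark payload is the pair (message, signature).
    Embedding it into generated text is outside this sketch."""
    return message, sign(sk, message)


def detect(pk, payload) -> bool:
    """Assumption: detection has already extracted the payload from the
    output; here it only verifies the signature with the public key."""
    message, signature = payload
    return verify(pk, message, signature)


if __name__ == "__main__":
    sk, pk = keygen()
    payload = make_payload(sk, b"model-id:demo")
    print("genuine payload accepted:", detect(pk, payload))
    forged = (b"model-id:other", payload[1])
    print("forged payload accepted:", detect(pk, forged))
```

Anyone holding `pk` can run `detect`, while producing a payload that verifies for a new message requires inverting the hash. That is the sense in which a signed payload is publicly verifiable yet hard to forge. A real scheme would also need error correction so the payload survives edits to the output, which this sketch does not attempt.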

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Preprint.
Keywords
Watermarking, Error-Correcting Codes, Digital Signatures
Contact author(s)
alabid @ illinois edu
varshney @ illinois edu
History
2025-02-18: approved
2025-02-17: received
Short URL
https://ia.cr/2025/256
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2025/256,
      author = {Daniel Alabi and Lav R. Varshney},
      title = {Inaccessible Entropy for Watermarking Generative Agents},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/256},
      year = {2025},
      url = {https://eprint.iacr.org/2025/256}
}