Comments (4)
Thanks for highlighting this issue!
The reason the entropy threshold wasn't applied during the detection phase is that, in the common scenario for invisible watermarking of general-purpose generation, the prompt is not visible at detection time. That said, we will run SWEET experiments where prompts are given during detection and add them to the next version of our manuscript! Does that approach sound good to you?
from ts_watermark.
Thanks for the kind reply, and for agreeing to include a complete version of SWEET in the next version of the manuscript.
We respectfully argue that applying the entropy threshold only in the generation phase, but not in the detection phase, does not reflect the intuition SWEET is built on. That modified implementation will inevitably show a lower 'green token ratio', since the detector then also scores low-entropy tokens in which the watermark could not have been embedded.
SWEET's central intuition is that reproducing the entropy information in the detection phase, and excluding low-entropy tokens from scoring, enables better watermark detection, since a watermark cannot be embedded in an overly spiky distribution. We therefore believe the current implementation of SWEET in this paper is an intentional modification, and that modification should be mentioned explicitly in the paper; as it stands, it is simply another watermarking method that shares SWEET's idea in the generation phase.
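For concreteness, here is a minimal sketch of the detection-side gating we have in mind (the names are illustrative and the green-list membership test is abstracted away; this is not the code from either paper):

```python
import math
import torch

def next_token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the predicted next-token distribution, per position."""
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + 1e-9)).sum(dim=-1)

def entropy_gated_z_score(logits: torch.Tensor,
                          green_mask: torch.Tensor,
                          entropy_threshold: float,
                          gamma: float = 0.5) -> float:
    """Score only positions whose predictive entropy exceeds the threshold,
    mirroring the gate applied at generation time.

    logits:     (T, V) the LM's logits at each generated position
    green_mask: (T,)   True where the generated token fell in its green list
    gamma:      fraction of the vocabulary colored green
    """
    gate = next_token_entropy(logits) > entropy_threshold
    n = int(gate.sum())              # tokens eligible for scoring
    if n == 0:
        return 0.0
    g = int(green_mask[gate].sum())  # green hits among eligible tokens
    # Standard one-proportion z-test, as in KGW, but over gated tokens only.
    return (g - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

The point is that the same entropy gate must appear on both sides: if generation skips low-entropy positions but detection counts them, the measured green-token ratio is diluted by positions that were never watermarked.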
We acknowledge that the SWEET method needs the exact prompt to reproduce the entropies completely, and that this is a limitation in practical settings. (In the updated version of the SWEET paper, the authors present SWEET results without exact prompts, which still show better performance.) Nevertheless, we request that the full implementation of SWEET be used as the baseline if it is to be compared with TS-watermark and KGW.
We are also very much interested in watermarking in the general text domain and in how SWEET performs there. We appreciate your work and your inclusion of SWEET as a baseline. Thank you.
from ts_watermark.
We just uploaded the revised baselines in this repo, and will update arXiv in a few days. We use SWEET_no_prompt to denote the baseline where the detection algorithm uses only the generated tokens to compute entropy, instead of both the prompt and the generated tokens.
Check result_figures.ipynb for the new results, and inference_sweet.py and inference_sweet_no_prompt.py for the implementations.
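For readers without the repo at hand, here is a rough sketch of how the two variants differ (the function names and indexing conventions are illustrative, not the actual code in inference_sweet.py):

```python
import torch

def per_token_entropy(model, ids: torch.Tensor) -> torch.Tensor:
    """Entropy of the model's next-token distribution at every position.
    `model` is any causal LM that returns `.logits`, e.g. a Hugging Face
    AutoModelForCausalLM (the same scoring LM the detector already uses)."""
    with torch.no_grad():
        logits = model(ids.unsqueeze(0)).logits[0]          # (T, V)
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + 1e-9)).sum(dim=-1)   # (T,)

def detection_entropies(model, prompt_ids: torch.Tensor,
                        gen_ids: torch.Tensor, use_prompt: bool) -> torch.Tensor:
    """Entropies used to gate which generated tokens get scored."""
    if use_prompt:
        # SWEET: condition on prompt + generation, reproducing the
        # exact entropies seen at generation time.
        ent = per_token_entropy(model, torch.cat([prompt_ids, gen_ids]))
        return ent[len(prompt_ids) - 1 : -1]  # one entropy per generated token
    # SWEET_no_prompt: the prompt is unavailable, so entropies are
    # approximated from the generated text alone (the first generated
    # token has no preceding context and is left unscored).
    ent = per_token_entropy(model, gen_ids)
    return ent[:-1]                           # entropies predicting gen_ids[1:]
```

The only difference is the conditioning context of the entropy model: with the prompt, the gate reproduces the generation-time entropies exactly; without it, the gate is an approximation computed from the generated text alone.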
Feel free to point out any further concerns!
from ts_watermark.
Thank you for updating the SWEET baseline!
Best regards,
from ts_watermark.