
ts_watermark's Introduction

Code for the paper "Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models".

The arXiv paper is available at https://arxiv.org/abs/2402.18059.

Environment Setup

  • Our code follows the setup of KGW (Kirchenbauer et al.'s lm-watermarking).
  • Make sure the packages in requirements.txt are installed in your environment.
  • Also, in the training and inference files, change the model paths (facebook/opt-1.3b, princeton-nlp/sup-simcse-roberta-base, meta-llama/Llama-2-7b-hf) to your desired local paths, for example as sketched below.
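
For instance, if the checkpoints are stored locally, the from_pretrained calls can point at those directories instead of the Hub IDs. A minimal sketch; the /data/models path is a placeholder for your own location:

    # Sketch: swap a Hub ID for a local checkpoint directory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "/data/models/opt-1.3b"  # placeholder; was "facebook/opt-1.3b"
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)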

Train

  • bash run_pipeline.sh
  • Choose either MOO or Weighted Sum to train the network (a sketch of the combined objective follows this list)
    • For MOO, set log_z_score=False and z_score_factor=1.0
    • For Weighted Sum, set log_z_score=False and z_score_factor=4e-4
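
The two modes differ in how the detection objective (the z-score) and the semantic-coherence objective (SimCSE similarity) are combined. A minimal sketch of the weighted-sum variant, using illustrative names rather than the repo's actual identifiers:

    import torch

    def combined_loss(z_score, semantic_sim, z_score_factor=4e-4,
                      log_z_score=False):
        """Illustrative weighted-sum objective: maximize both the
        detection z-score and the semantic similarity. MOO instead
        balances the gradients of the two objectives rather than
        summing them with a fixed weight."""
        det = torch.log(z_score) if log_z_score else z_score
        # Negate both terms because the optimizer minimizes the loss.
        return -(semantic_sim + z_score_factor * det)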

Evaluation

Default settings for all evaluations

  • Multinomial sampling with temp=1.0
  • Dataset: the official validation split of C4 realnewslike from Hugging Face, which we further split into our own validation and test sets. The default is the test split.
  • 500 samples
  • generation length = 200 tokens
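
These defaults correspond roughly to the following generation call. A sketch, not the repo's exact code; the dataset name follows the Hugging Face allenai/c4 Hub ID:

    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Official validation split of C4 realnewslike; the repo further
    # splits it into its own validation and test sets.
    data = load_dataset("allenai/c4", "realnewslike",
                        split="validation", streaming=True)

    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

    prompt = next(iter(data))["text"]
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=256)
    # Multinomial sampling at temperature 1.0, 200 new tokens.
    out = model.generate(**inputs, do_sample=True, temperature=1.0,
                         max_new_tokens=200)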

OPT-1.3b evaluation

  • Results stored in eval/opt by default
  • KGW
    • CUDA_VISIBLE_DEVICES=0 python inference_bs.py --split=test --batch_size=20
  • Ours
    • CUDA_VISIBLE_DEVICES=0 python inference.py --split=test --batch_size=20
  • SWEET
    • Evaluation on human-written text: CUDA_VISIBLE_DEVICES=0 python inference_sweet.py --split=test --batch_size=20 --human=True
    • Evaluation on watermarked machine-generated text: CUDA_VISIBLE_DEVICES=0 python inference_sweet.py --split=test --batch_size=20 --human=False
    • SWEET_no_prompt
      • Applies the same generation algorithm as SWEET but does not require the prompt during detection: entropy is computed over the generated text alone rather than over the prompt plus the generated text (see the sketch after this list).
      • Run the experiments by replacing inference_sweet.py in the commands above with inference_sweet_no_prompt.py
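
A minimal sketch of the entropy computation behind the no-prompt variant; function and variable names here are illustrative:

    import torch
    import torch.nn.functional as F

    def token_entropies(model, generated_ids):
        """Shannon entropy of the model's next-token distribution at
        each position, computed over the generated text only, with no
        prompt prepended."""
        with torch.no_grad():
            logits = model(generated_ids).logits        # (1, T, vocab)
        log_probs = F.log_softmax(logits, dim=-1)
        return -(log_probs.exp() * log_probs).sum(dim=-1)   # (1, T)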

Ablation Study: Weighted Sum

  • Ours
    • CUDA_VISIBLE_DEVICES=0 python inference_weighted.py --split=test --batch_size=20

Dipper Attack

  • KGW
    • Get the baseline generated text: CUDA_VISIBLE_DEVICES=0 python inference_bs_dipper_get_text.py --split=test --batch_size=20
    • Then use DIPPER to paraphrase it (see the sketch after this list)
    • Evaluate on the paraphrased text: CUDA_VISIBLE_DEVICES=0 python inference_bs_dipper_text_eval.py --split=test --batch_size=20
  • Ours
    • Generate text with our watermarking method: CUDA_VISIBLE_DEVICES=0 python inference_dipper_get_text.py --split=test --batch_size=20
    • Then use DIPPER to paraphrase it
    • Evaluate on the paraphrased text: CUDA_VISIBLE_DEVICES=0 python inference_dipper_text_eval.py --split=test --batch_size=20
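
The middle step in both pipelines uses the DIPPER paraphraser. A rough usage sketch following the original DIPPER repository (kalpeshk2011/dipper-paraphraser-xxl with the T5 v1.1 XXL tokenizer); the lexical/order diversity codes and the <sent> template come from that repo, and the exact values chosen here are illustrative:

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tok = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
    dipper = T5ForConditionalGeneration.from_pretrained(
        "kalpeshk2011/dipper-paraphraser-xxl")

    text = "..."  # a watermarked generation to paraphrase
    # Diversity codes take values in {0, 20, 40, 60, 80, 100}.
    prompt = f"lexical = 60, order = 60 <sent> {text} </sent>"
    ids = tok(prompt, return_tensors="pt").input_ids
    out = dipper.generate(ids, do_sample=True, top_p=0.75,
                          max_length=512)
    paraphrase = tok.decode(out[0], skip_special_tokens=True)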

Copy-Paste Attack

  • Set num_cp_split=1 or num_cp_split=3 to select between the two attack settings (see the sketch after this list)
  • KGW
    • CUDA_VISIBLE_DEVICES=0 python inference_bs_cp_att.py --split=test --batch_size=20 --num_cp_split=1
  • Ours
    • CUDA_VISIBLE_DEVICES=0 python inference_cp_att.py --split=test --batch_size=20 --num_cp_split=1
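
The copy-paste attack hides watermarked spans inside human-written text, and num_cp_split controls how many pieces the watermarked text is cut into. A minimal sketch of the construction; the repo's token-level details may differ:

    def copy_paste_attack(wm_tokens, human_tokens, num_cp_split=1):
        """Interleave num_cp_split chunks of watermarked tokens with
        chunks of human-written tokens, so only part of the final
        text carries the watermark."""
        k = num_cp_split
        wm_step = len(wm_tokens) // k
        hu_step = len(human_tokens) // k
        mixed = []
        for i in range(k):
            mixed += human_tokens[i * hu_step:(i + 1) * hu_step]
            mixed += wm_tokens[i * wm_step:(i + 1) * wm_step]
        return mixed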

Llama-2 evaluation

  • Results stored in eval/llama by default
  • KGW
    • CUDA_VISIBLE_DEVICES=0 python inference_bs_llama.py --split=test --batch_size=20
  • Ours
    • CUDA_VISIBLE_DEVICES=0 python inference_llama.py --split=test --batch_size=20

Plotting the figures

Citation

If you use this work in your research or applications, please cite it using the following BibTeX:

@article{huo2024token,
  title={Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models},
  author={Huo, Mingjia and Somayajula, Sai Ashish and Liang, Youwei and Zhang, Ruisi and Koushanfar, Farinaz and Xie, Pengtao},
  journal={arXiv preprint arXiv:2402.18059},
  year={2024}
}


ts_watermark's Issues

Should add entropy thresholding in SWEET detection

Hi, thanks for the great repo.

Unfortunately, I found a bug in the SWEET implementation.

According to the original SWEET paper and its repository, entropy thresholding is applied in both the generation and detection phases.

However, in your reproduction of the SWEET method, the lines that apply the entropy threshold are omitted from the detection phase
(cf. the implementation in the original SWEET code: https://github.com/hongcheki/sweet-watermark/blob/master/sweet.py#L100).

Is this the code you used for your experiments?

If this code was used in your paper's experiments, and you also identify it as a bug, please consider fixing it and rerunning the experiments.

Thank you.
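
For context, the thresholding the issue describes amounts to counting green-list hits only at high-entropy positions before computing the usual z-score. A minimal sketch, assuming the per-token entropies and green-list flags are already available; gamma is the green-list fraction and its value here is illustrative:

    import math

    def sweet_z_score(is_green, entropies, threshold, gamma=0.25):
        """Apply the entropy threshold during detection, then compute
        the standard z-score z = (g - gamma*T) / sqrt(gamma*(1-gamma)*T)
        over the surviving positions."""
        idx = [i for i, h in enumerate(entropies) if h > threshold]
        T = len(idx)
        g = sum(is_green[i] for i in idx)
        return (g - gamma * T) / math.sqrt(gamma * (1 - gamma) * T)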
