MMWatermark_Robustness

The official codebase for our paper "Evaluating Durability: Benchmark Insights into Multimodal Watermarking".

Jielin Qiu*, William Jongwon Han*, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li.

More details can be found on the project webpage.

Citation

If you find our code or models helpful in your research, please cite our paper:

@article{Qiu2024EvaluatingDB,
  title={Evaluating Durability: Benchmark Insights into Multimodal Watermarking},
  author={Jielin Qiu and William Han and Xuandong Zhao and Shangbang Long and Christos Faloutsos and Lei Li},
  journal={arXiv preprint arXiv:2406.03728},
  year={2024}
}

Getting Started

We generally recommend the following pipeline:

Generate text and images utilizing multimodal models.
Watermark generated text and images.
Perturb watermarked text and images.
Detect watermarks in the perturbed text and images.

We now describe each step in more detail.

Environments

In our study, we follow the existing codebases for comprehensive benchmarking.

We recommend creating a separate environment for each multimodal model and each watermarking method. All perturbations (text and image) can be run in a single environment.

Note that some of the links below point to Hugging Face tutorials on how to use the models rather than to code repositories. For those models, installing the latest version of transformers worked well in our experience. If you run into errors when sharing one environment across multiple multimodal models, please create another environment.

We provide links to all of the repositories needed for this project. Please follow their environment setup instructions carefully and run generation, watermarking, and perturbation in separate environments. We also thank the authors of these repositories for open-sourcing their code.

Type	Link
Multimodal Model	NExT-GPT
Multimodal Model	RPG
Multimodal Model	LCMs
Multimodal Model	Kandinsky
Multimodal Model	PIXART
Multimodal Model	SDXL-Lightning
Multimodal Model	DALLE3
Multimodal Model	Stable Diffusion
Multimodal Model	Fuyu-8B
Multimodal Model	InternLM-XComposer
Multimodal Model	InstructBLIP
Multimodal Model	LLaVA 1.6
Multimodal Model	MiniGPT-4
Multimodal Model	mPLUG-Owl2
Multimodal Model	Qwen-VL
Watermark	KGW
Watermark	KTH
Watermark	Blackbox
Watermark	Unigram
Watermark	DwtDctSvd
Watermark	RivaGAN
Watermark	SSL
Watermark	Stega Stamp
Image and Text Perturbations	MM_Robustness

COCO Dataset

Please download the COCO validation split from the official website cocodataset. You can download images-val2017 and annotations-val2017.

If for some reason there is a problem with the link, a copy of the data can be found here.

Then move the data into the COCO folder. The coco.py file is the data loader used to iterate through the data.
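
As a rough illustration of what iterating over the validation split involves, the sketch below pairs each caption with its image path using only the standard library. It assumes the standard COCO caption annotation layout (top-level `images` and `annotations` lists) and a hypothetical directory layout of COCO/val2017 and COCO/annotations/captions_val2017.json. The function name `iter_coco_captions` is made up for this example; this is not the repository's coco.py, whose interface may differ.

```python
import json
from pathlib import Path


def iter_coco_captions(annotation_file, image_dir):
    """Yield (image_path, caption) pairs from a COCO captions JSON file."""
    with open(annotation_file, "r", encoding="utf-8") as f:
        data = json.load(f)

    # Map image id -> file name so each caption can be tied to its image.
    id_to_file = {img["id"]: img["file_name"] for img in data["images"]}

    for ann in data["annotations"]:
        file_name = id_to_file.get(ann["image_id"])
        if file_name is None:
            continue  # skip captions whose image entry is missing
        yield Path(image_dir) / file_name, ann["caption"].strip()


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the data was placed.
    ann_path = Path("COCO/annotations/captions_val2017.json")
    if ann_path.exists():
        pairs = iter_coco_captions(ann_path, "COCO/val2017")
        for i, (path, caption) in enumerate(pairs):
            print(path, "|", caption)
            if i == 4:  # show only the first five pairs
                break
```

Writing the loader as a generator avoids building the full list of pairs up front, although the annotation file itself is still read into memory.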

Multimodal Models and Generation

All multimodal models used in this study are available in the mm_model directory. Not all models have a GitHub repository; for those that do not, we still provide an example of how to use the model for text or image generation.

Additionally, some of the models on Hugging Face are fairly large. We recommend setting the model download cache path to a folder on your local machine that has enough free disk space.
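
One way to redirect the download cache is the `HF_HOME` environment variable, which the Hugging Face libraries read to locate their cache. The following is a minimal sketch; the `hf_cache` folder name is a placeholder chosen for this example, not a path used by the repository.

```python
import os
import shutil
from pathlib import Path

# Placeholder location; point this at a disk with plenty of free space.
cache_root = Path("hf_cache").resolve()
cache_root.mkdir(parents=True, exist_ok=True)

# Set this before importing transformers or diffusers, because the cache
# location is read when those libraries are imported.
os.environ["HF_HOME"] = str(cache_root)

# Report how much space is left on the disk that holds the cache.
free_gb = shutil.disk_usage(cache_root).free / 1e9
print(f"Hugging Face cache: {cache_root} ({free_gb:.1f} GB free)")
```

Alternatively, the cache location can be chosen per model by passing `cache_dir` to `from_pretrained`.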

Watermarking

All watermarks are in the watermark directory. After setting up their respective environments and generating the text or images, please proceed to watermark all generated text and images.

Perturbations

All perturbations are in the perturbation directory. After setting up the perturbation environment from the perturbations/MM_Robustness repository, please proceed to perturb all of the watermarked images and text. Inside the perturbation directory, the image_perturb.py and text_perturb.py files contain all of the image and text perturbations needed for this study.
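
To make the idea of a perturbation concrete, here is a toy character-level text perturbation that swaps neighbouring letters. It is only an illustration: `swap_adjacent_chars` and its `rate` and `seed` parameters are hypothetical and are not the functions defined in text_perturb.py.

```python
import random


def swap_adjacent_chars(text, rate=0.1, seed=0):
    """Swap neighbouring letters with probability `rate` per eligible pair."""
    rng = random.Random(seed)  # fixed seed keeps perturbations reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a character moves at most once
        else:
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    sentence = "A watermarked caption describing a dog on a beach."
    print(swap_adjacent_chars(sentence, rate=0.2, seed=1))
```

Seeding the generator means the same watermarked input always yields the same perturbed output, which helps when comparing detection results across watermarking methods.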

Detection

Because each watermarking method has its own detection procedure, we provide example pipelines for detecting watermarks. Please refer to them to see how detection is performed for each method. We also provide code for computing the other metrics (e.g., ROUGE and PSNR).
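
For reference, the two metrics named above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the repository's metric code: `psnr` works on flat sequences of pixel values, and `rouge1_f1` uses lowercased whitespace tokens without stemming, so its scores may differ from those of a full ROUGE implementation.

```python
import math
from collections import Counter


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(distorted) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(psnr([10, 20, 30], [11, 21, 29]))
    print(rouge1_f1("a dog on the beach", "a dog at the beach"))
```

Higher PSNR means the watermarked or perturbed image stays closer to the reference, and a higher ROUGE-1 score means the text shares more words with the reference.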

License

This project is licensed under the CC BY-NC-SA License.

Contact

If you have any questions, please contact [email protected], [email protected].
