Coder Social home page Coder Social logo

safegen_ccs2024's Introduction

Introduction

This is the official code for "SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models"

๐Ÿ”ฅ SafeGen will appear in ACM Conference on Computer and Communications Security (ACM CCS 2024) Core-A*, CCF-A, Big 4. We will put up the camera-ready paper version once the Camera-Ready Instruction of CCS 2024 is provided.

๐Ÿ“ฃ We have released our pretrained model on Hugging Face. Please check out how to use it for inference ๐Ÿค–.

Our release involves adjusting the self-attention layers of Stable Diffusion alone based on image-only triplets.

This implementation can be regarded as an example that can be integrated into the Diffusers library. Thus, you may navigate to the examples/text_to_image/ folder, and get to know how it works.

Citation

If you find our paper/code/benchmark helpful, please kindly consider citing this work with the following reference:

@inproceedings{li2024safegen,
  author       = {Xinfeng Li and Yuchen Yang and Jiangyi Deng and Chen Yan and Yanjiao Chen and Xiaoyu Ji and Wenyuan Xu},
  title        = {SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models},
  booktitle    = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security, {CCS} 2024},
  year         = {2024},
}

or

@article{li2024safegen,
  title={SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models},
  author={Li, Xinfeng and Yang, Yuchen and Deng, Jiangyi and Yan, Chen and Chen, Yanjiao and Ji, Xiaoyu and Xu, Wenyuan},
  journal={arXiv preprint arXiv:2404.06666},
  year={2024}
}

Environments and Installation

You can run this code using a single A100-40GB (NVIDIA), with our default configuration. In particular, set a small training_batch_size to avoid the out-of-memory error.

we recommend you managing two conda environments to avoid dependencies conflict.

  • A Pytorch environment for adjusting the self-attention layers of the Stable Diffusion model, and evaluation-related libraries.

  • A Tensorflow environment required by the anti-deepnude model for the data preparation stage.

Requirement of PyTorch + Diffusers

# You can install the main dependencies by conda/pip
conda create -n text-agnostic-t2i python=3.8.5
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
# Using the official Diffusers package
pip install --upgrade diffuers[torch]
# Or you may use the community maintained version
conda install -c conda-forge diffusers
...
# Or you can create the env via environment.yaml
conda env create -f environment_pytorch.yaml 

Requirement of Image-only Data Preparation

As the anti-deepnude model requires TensorFlow 1.13, you can install Python<=3.7.

# You can install the dependencies individually
conda create -n anti-deepndue python=3.7
pip install tensorflow-gpu==1.13.1 keras==2.2.4
...
# Or you can create the env via environment.yaml
conda env create -f environment_tf.yaml 

Anti-DeepNude for Data Preparation

You can obtain the <nude, mosaic> pair through:

python anti-deepnude.py

Model Governance Usage

export MODEL_NAME="CompVis/stable-diffusion-v1-4" # export the model you want to protect
export TRAIN_DIR="<path to the manifest of your training files>/meta.csv" # export your image triplets folder
accumulation_steps=5
learning_rate=1e-05
steps=1000
train_batch_size=1
warmup_steps=200
weight_tune=0.1
weight_preserve=0.9
defend_type="nudity_advtrain"
current_time=$(date +'%m%d_%H%M')
output_dir="mos_model_sal_"${defend_type}-"step_"${steps}-"lr_"${learning_rate}-"warm_"${warmup_steps}-${weight_tune}+${weight_preserve}"gacc_"${accumulation_steps}-${current_time}

export CUDA_VISIBLE_DEVICES=0
accelerate launch --mixed_precision="fp16" "--num_processes=1" train_text_to_image_adjust_sal.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --center_crop\
  --train_batch_size ${train_batch_size} \
  --gradient_accumulation_steps ${accumulation_steps} \
  --gradient_checkpointing \
  --max_train_steps ${steps} \
  --learning_rate ${learning_rate} \
  --image_column="nude_image_path" \
  --max_grad_norm=1 \
  --lr_scheduler="constant" \
  --lr_warmup_steps ${warmup_steps} \
  --output_dir ${output_dir} \
  --weight_tune ${weight_tune} \
  --weight_preserve ${weight_preserve} \
  --checkpoints_total_limit 0 \
  --validation_prompts "woman, nudity" "a running dog on grass" "a woman standing aside the colorful table" "a nude sexy man, blue eyes"
  
  # In addition, you can set the 'validation_prompts' to visually monitor the training efficacy, such as "a running dog on grass", "a woman standing aside the colorful table", "a nude sexy man, blue eyes", etc.

Simply running the script

run_adjust_SD.sh

How to use the regulated model?

from diffusers import StableDiffusionPipeline
import torch

model_path = ${output_dir} # the save path of your model
pipeline = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt).images[0]
image.save("example.png")

Adversarial Textual Prompt Benchmark

Over 50,000 textual adversarial prompts, including self-optimized prompts that appear innocuous, have been developed to test the potential exploitation of T2I models in generating sexually explicit content. Due to the sensitive nature of these images, access is restricted to ensure ethical compliance. Researchers interested in using these images for scholarly purposes must commit to not distributing them further. Please contact me to request access and discuss the necessary safeguards. My email address is: [email protected].

Acknowledgement

This work is based on the amazing research works and open-source projects, thanks a lot to all the authors for sharing!

@misc{von-platen-etal-2022-diffusers,
    author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
    title = {Diffusers: State-of-the-art diffusion models},
    year = {2022},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/huggingface/diffusers}}
}
@inproceedings{parmar2021cleanfid,
  title={On Aliased Resizing and Surprising Subtleties in GAN Evaluation},
  author={Parmar, Gaurav and Zhang, Richard and Zhu, Jun-Yan},
  booktitle={CVPR},
  year={2022}
}
@inproceedings{zhang2018perceptual,
  title={The Unreasonable Effectiveness of Deep Features as a Perceptual Metric},
  author={Zhang, Richard and Isola, Phillip and Efros, Alexei A and Shechtman, Eli and Wang, Oliver},
  booktitle={CVPR},
  year={2018}
}

safegen_ccs2024's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

safegen_ccs2024's Issues

Plan to share pretrained weight for Nudity?

Thank you for your awesome work. I really liked it.

BTW, is there any plan to release the pretrained model weights? It is quite hard to reproduce the results without the pretrained weights.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.