Coder Social home page Coder Social logo

hcp-diffusion's Introduction

HCP-Diffusion

PyPI GitHub stars GitHub license codecov open issues

πŸ“˜δΈ­ζ–‡θ―΄ζ˜Ž

πŸ“˜English document πŸ“˜δΈ­ζ–‡ζ–‡ζ‘£

Introduction

HCP-Diffusion is a toolbox for Stable Diffusion models based on πŸ€— Diffusers. It facilitates flexiable configurations and component support for training, in comparison with webui and sd-scripts.

This toolbox supports Colossal-AI, which can significantly reduce GPU memory usage.

HCP-Diffusion can unify existing training methods for text-to-image generation (e.g., Prompt-tuning, Textual Inversion, DreamArtist, Fine-tuning, DreamBooth, LoRA, ControlNet, etc) and model structures through a single .yaml configuration file.

The toolbox has also implemented an upgraded version of DreamArtist with LoRA, named DreamArtist++, for one-shot text-to-image generation. Compared to DreamArtist, DreamArtist++ is more stable with higher image quality and generation controllability, and faster training speed.

Features

  • Layer-wise LoRA (with Conv2d)
  • Layer-wise fine-tuning
  • Layer-wise model ensemble
  • Prompt-tuning with multiple words
  • DreamArtist and DreamArtist++
  • Aspect Ratio Bucket (ARB) with automatic clustering
  • Multiple datasets with multiple data sources
  • Image attention mask
  • Word attention multiplier
  • Custom words that occupy multiple words
  • Maximum sentence length expansion
  • πŸ€— Accelerate
  • Colossal-AI
  • xFormers for UNet and text-encoder
  • CLIP skip
  • Tag shuffle and dropout
  • Safetensors support
  • Controlnet (support training)
  • Min-SNR loss
  • Custom optimizer (Lion, DAdaptation, pytorch-optimizer, ...)
  • Custom lr scheduler
  • SDXL support

Install

Install with pip:

pip install hcpdiff
# Start a new project and make initialization
hcpinit

Install from source:

git clone https://github.com/7eu7d7/HCP-Diffusion.git
cd HCP-Diffusion
pip install -e .
# Modified based on this project or start a new project and make initialization
## hcpinit

To use xFormers to reduce VRAM usage and accelerate training:

# use conda
conda install xformers -c xformers

# use pip
pip install xformers>=0.0.17

User guidance

Training

Training scripts based on πŸ€— Accelerate or Colossal-AI are provided.

# with Accelerate
accelerate launch -m hcpdiff.train_ac --cfg cfgs/train/cfg_file.yaml
# with Accelerate and only one GPU
accelerate launch -m hcpdiff.train_ac_single --cfg cfgs/train/cfg_file.yaml
# with Colossal-AI
# pip install colossalai
torchrun --nproc_per_node 1 -m hcpdiff.train_colo --cfg cfgs/train/cfg_file.yaml

Inference

python -m hcpdiff.visualizer --cfg cfgs/infer/cfg.yaml pretrained_model=pretrained_model_path \
        prompt='positive_prompt' \
        neg_prompt='negative_prompt' \
        seed=42

Conversion of Stable Diffusion models

The framework is based on πŸ€— Diffusers. So it needs to convert the original Stable Diffusion model into a supported format using the scripts provided by πŸ€— Diffusers.

  • Download the config file
  • Convert models based on config file
python -m hcpdiff.tools.sd2diffusers \
    --checkpoint_path "path_to_stable_diffusion_model" \
    --original_config_file "path_to_config_file" \
    --dump_path "save_directory" \
    [--extract_ema] # Extract EMA model
    [--from_safetensors] # Whether the original model is in safetensors format
    [--to_safetensors] # Whether to save to safetensors format

Convert VAE:

python -m hcpdiff.tools.sd2diffusers \
    --vae_pt_path "path_to_VAE_model" \
    --original_config_file "path_to_config_file" \
    --dump_path "save_directory"
    [--from_safetensors]

Tutorials

Contributing

You are welcome to contribute more models and features to this toolbox!

Team

This toolbox is maintained by HCP-Lab, SYSU.

Citation

@article{DBLP:journals/corr/abs-2211-11337,
  author    = {Ziyi Dong and
               Pengxu Wei and
               Liang Lin},
  title     = {DreamArtist: Towards Controllable One-Shot Text-to-Image Generation
               via Positive-Negative Prompt-Tuning},
  journal   = {CoRR},
  volume    = {abs/2211.11337},
  year      = {2022},
  doi       = {10.48550/arXiv.2211.11337},
  eprinttype = {arXiv},
  eprint    = {2211.11337},
}

hcp-diffusion's People

Contributors

irisrainbowneko avatar narugo1992 avatar unrealmj avatar aria1th avatar coxy7 avatar iamwangyabin avatar chr1st0ph3rgg avatar kohakublueleaf avatar ritsuyan avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.