Hey, so I managed to run Stable Diffusion DreamBooth training in just 17.7 GB of GPU memory

DreamBooth Stable Diffusion training now possible in 10 GB VRAM, and it runs about 2 times faster. About xavierxiao/dreambooth-stable-diffusion

Comments (52)

ShivamShrirao commented on July 23, 2024 5

https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

New version now trains in 10 GB.

from dreambooth-stable-diffusion.

feffy380 commented on July 23, 2024 3

@Jarfeh This repo seems abandoned. Use ShivamShrirao's diffusers fork instead. It includes all the optimizations discussed here and some new ones

ShivamShrirao commented on July 23, 2024 2

@Blucknote hopefully pretty soon. I have gotten the GPU usage to 11.187 GB, but there are a few bugs due to which the model output quality isn't good right now even for higher precision. Will update once quality gets better.

ShivamShrirao commented on July 23, 2024 1

Wow, using the 8-bit Adam optimizer from bitsandbytes along with xformers reduces the memory usage to 12.5 GB.
Colab: https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb

Code: https://github.com/ShivamShrirao/diffusers/blob/main/examples/dreambooth/
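
A rough sketch of why 8-bit optimizer state helps, assuming a UNet of roughly 860 million trainable parameters (that count is an assumption, not a figure from this thread). Adam keeps two moment estimates per parameter, so storing them in 8 bits instead of 32 shrinks that part of the footprint by about a factor of four. The small per-block quantization constants that bitsandbytes also stores are ignored here.

```python
# Back-of-the-envelope optimizer-state arithmetic.
# UNET_PARAMS is an assumed, approximate parameter count.
UNET_PARAMS = 860_000_000


def adam_state_bytes(num_params: int, bytes_per_value: int) -> int:
    """Adam stores two moment estimates (m and v) per trainable parameter."""
    return 2 * num_params * bytes_per_value


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


fp32_state = adam_state_bytes(UNET_PARAMS, 4)  # 32-bit moments
int8_state = adam_state_bytes(UNET_PARAMS, 1)  # 8-bit moments, quantization constants ignored

print(f"32-bit Adam state: {to_gib(fp32_state):.2f} GiB")
print(f" 8-bit Adam state: {to_gib(int8_state):.2f} GiB")
print(f"difference:        {to_gib(fp32_state - int8_state):.2f} GiB")
```

This only accounts for optimizer state. Weights, gradients and activations are separate, and xformers mainly reduces the activation side.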

ShivamShrirao commented on July 23, 2024 1

@roar-emaus These are the diffusers version of the weights. I have added an inference example in the Colab showing how to use them in diffusers. For other repos you will need to convert them.

Daniel-Kelvich commented on July 23, 2024 1

@pdjohntony try updating the transformers library: `pip install -U transformers`

ClashSAN commented on July 23, 2024 1

@ShivamShrirao I'm assuming you mean that only the items in the imv folder make up the ckpt file. I deleted my Colab and only saved those items to Google Drive.

ShivamShrirao commented on July 23, 2024 1

@JoeMcGuire you will need to compile xformers yourself; the current wheels only support the T4 GPU.
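
A small sketch of the mismatch behind errors such as "no kernel image is available" and "built for sm750". The table lists the CUDA compute capabilities of GPUs mentioned in this thread. The helper names are hypothetical, and the exact-match rule is a simplification that ignores PTX forward compatibility.

```python
# Compute capability (major, minor) of GPUs mentioned in this thread.
COMPUTE_CAPABILITY = {
    "P100": (6, 0),
    "V100": (7, 0),
    "T4": (7, 5),
    "RTX 2060": (7, 5),
    "RTX 3060": (8, 6),
    "RTX 3080": (8, 6),
    "RTX A5000": (8, 6),
    "RTX A6000": (8, 6),
    "A10G": (8, 6),
}


def arch_string(gpu_name: str) -> str:
    """Format a GPU's capability the way TORCH_CUDA_ARCH_LIST expects, e.g. '7.0'."""
    major, minor = COMPUTE_CAPABILITY[gpu_name]
    return f"{major}.{minor}"


def wheel_runs_on(gpu_name: str, built_for=(7, 5)) -> bool:
    """Simplified check: a binary built for one architecture only runs on that one."""
    return COMPUTE_CAPABILITY[gpu_name] == built_for


for gpu in ("T4", "V100", "RTX A6000"):
    print(gpu, arch_string(gpu), "ok" if wheel_runs_on(gpu) else "needs a source build")
```

On a live machine `torch.cuda.get_device_capability()` reports the same pair. When building from source, PyTorch's extension builder reads the `TORCH_CUDA_ARCH_LIST` environment variable, so setting it to the value printed above before `pip install` is one way to target the right architecture.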

TemporalLabsLLC-SOL commented on July 23, 2024

Very cool. Doing what I can for 16gb too.

TemporalLabsLLC-SOL commented on July 23, 2024

I'm running into issues with it finding the gpus I think. 4xA10G. I'll post code tomorrow.

Daniel-Kelvich commented on July 23, 2024

There is no such file.
404 Client Error: Entry Not Found for url: https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/config.json

Edit: Issue resolved.

Mistborn-First-Era commented on July 23, 2024

Do you have a donation link? I don't have much, but you are doing great work.

ShivamShrirao commented on July 23, 2024

> Do you have a donation link? I don't have much, but you are doing great work.

Hey, Thanks. No donation link haha. Good to hear you liked it. It has been quite fun to do for me.

pdjohntony commented on July 23, 2024

@ShivamShrirao I've been trying to run your notebook on Runpod with PyTorch and an A5000, but I'm getting an error during pip install: "Building wheel for xformers (setup.py) ... error".
Training starts with a bitsandbytes bug report but runs, and eventually crashes after 20 minutes of training.

I'd also love to donate if I can get this working.

pdjohntony commented on July 23, 2024

> There is no such file. 404 Client Error: Entry Not Found for url: https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/config.json
>
> Edit: Issue resolved.

@Daniel-Kelvich How did you fix this?

ShivamShrirao commented on July 23, 2024

@pdjohntony What error are you facing? If it is the 404, it may be because you are not authenticated with the Hugging Face CLI.

pdjohntony commented on July 23, 2024

@ShivamShrirao I managed to get your DreamBooth example working, but it's been running for 2 hours now on an A5000.

Since that's taking so long, I spun up another instance on vast with 2 A5000s, but now I'm getting the 404. It shouldn't be an auth issue with Hugging Face, as I logged in on the CLI and it appeared to download the model for a while before hitting this 404 error.

The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `24` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Traceback (most recent call last):
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/configuration_utils.py", line 596, in _get_config_dict
    resolved_config_file = cached_path(
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/utils/hub.py", line 282, in cached_path
    output_path = get_from_cache(
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/utils/hub.py", line 486, in get_from_cache
    _raise_for_status(r)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/utils/hub.py", line 409, in _raise_for_status
    raise EntryNotFoundError(f"404 Client Error: Entry Not Found for url: {request.url}")
transformers.utils.hub.EntryNotFoundError: 404 Client Error: Entry Not Found for url: https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/config.json
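
A sketch of how the failing URL is put together. The model repository keeps each component in its own subfolder, so a `config.json` requested at the repository root does not exist. One plausible reading, which is an assumption (the thread only reports that upgrading transformers fixed it), is that the installed transformers version ignored the subfolder and asked for the root path. The helper name is hypothetical.

```python
from typing import Optional


def hub_file_url(repo_id: str, filename: str,
                 subfolder: Optional[str] = None, revision: str = "main") -> str:
    """Build the download URL the Hub serves a repository file from."""
    path = f"{subfolder}/{filename}" if subfolder else filename
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{path}"


repo = "CompVis/stable-diffusion-v1-4"

# The URL from the traceback: no subfolder, so it points at the repository root.
print(hub_file_url(repo, "config.json"))

# The same file name inside a component subfolder.
print(hub_file_url(repo, "config.json", subfolder="text_encoder"))
```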

roar-emaus commented on July 23, 2024

Great work! I managed to run it in a google colab. I was just wondering, how do I get checkpoint files that I can use later on from the model files that are stored?

I could only find the feature_extractor, logs, model_index.json, safety_checker, scheduler, text_encoder, tokenizer, unet and vae folders/files, which were stored in --output_dir=$OUTPUT_DIR after training finished.
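
Those entries are the model itself, stored in the diffusers folder layout. A minimal sketch for checking that a saved folder is complete before copying it elsewhere; the list of expected entries is taken from the names above, and the function name is hypothetical.

```python
from pathlib import Path

# Entries listed above that a diffusers-format Stable Diffusion folder contains.
EXPECTED = ["model_index.json", "scheduler", "text_encoder", "tokenizer", "unet", "vae"]


def missing_pipeline_parts(output_dir: str) -> list:
    """Return the expected files or folders that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

If the list comes back empty, passing the same folder to `StableDiffusionPipeline.from_pretrained(...)` in diffusers is the usual way to load it for inference.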

roar-emaus commented on July 23, 2024

> @roar-emaus These are the diffuser version of weights. I have added an inference example in colab on how to use them in diffusers. For others you will need to convert them.

Thank you! will test it tomorrow :)

Ai-Artsca commented on July 23, 2024

Finally got it to work. How can we reuse the model in a stable colab, @ShivamShrirao? I have used the inference example, but how do I save my model? I haven't even been able to find what folder it's in, lol. Any info on how to convert it into a ckpt? Great work!!

ShivamShrirao commented on July 23, 2024

> finally got it to work, how can we use the model to reuse in a stable colab @ShivamShrirao ? I have used the inference but how do i save my model, i havent even been able to find what folder its in lol, any info on how to convert it into a ckpt?? great work !!

I haven't yet figured out how to convert to a single ckpt for use in other repos. For now the whole folder is your model, so you can save the whole folder until someone figures it out. This script needs to be reversed: https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py

hopibel commented on July 23, 2024

@ShivamShrirao If I'm reading things right, 8-bit AdamW should be a drop-in replacement, and the modified CrossAttention class looks like it could simply replace the one in ldm/modules/attention.py in this repository. Sadly I can't test it myself, because bitsandbytes has a C extension that uses CUDA and I'm on AMD.
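
A minimal sketch of what "drop-in" could look like in practice: pick the 8-bit class when bitsandbytes is installed and fall back to the regular PyTorch optimizer otherwise. The helper names are hypothetical. Note that finding the package does not guarantee its CUDA extension loads, which is the AMD problem described above.

```python
import importlib
import importlib.util


def choose_optimizer(use_8bit_adam: bool) -> str:
    """Return the dotted path of the optimizer class to use."""
    if use_8bit_adam and importlib.util.find_spec("bitsandbytes") is not None:
        return "bitsandbytes.optim.AdamW8bit"
    return "torch.optim.AdamW"


def load_class(dotted_path: str):
    """Import a class from its dotted path, e.g. 'torch.optim.AdamW'."""
    module_name, _, class_name = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


# Usage (needs torch installed; `unet` is a placeholder for the model being trained):
#   optimizer_cls = load_class(choose_optimizer(use_8bit_adam=True))
#   optimizer = optimizer_cls(unet.parameters(), lr=5e-6)
```

Both classes accept the parameters and `lr` in the same way, which is what makes the swap a one-line change in a training script.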

Ai-Artsca commented on July 23, 2024

Successfully trained one model, but on my second training run I'm getting an error @ShivamShrirao

Steps: 2% 18/1000 [00:56<45:45, 2.80s/it, loss=0.536, lr=5e-6]
Traceback (most recent call last):
  File "train_dreambooth.py", line 606, in <module>
    main()
  File "train_dreambooth.py", line 527, in main
    for step, batch in enumerate(train_dataloader):
  File "/usr/local/lib/python3.7/dist-packages/accelerate/data_loader.py", line 357, in __iter__
    next_batch = next(dataloader_iter)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 721, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "train_dreambooth.py", line 268, in __getitem__
    instance_image = Image.open(self.instance_images_path[index % self.num_instance_images])
  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 2843, in open
    fp = builtins.open(filename, "rb")
IsADirectoryError: [Errno 21] Is a directory: '/content/data/sks/.ipynb_checkpoints'
Steps: 2% 18/1000 [00:56<51:30, 3.15s/it, loss=0.536, lr=5e-6]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--use_auth_token', '--instance_data_dir=/content/data/sks', '--class_data_dir=/content/data/gfx', '--output_dir=/content/models/sks', '--with_prior_preservation', '--instance_prompt=photo of sks gfx', '--class_prompt=photo of a gfx', '--resolution=512', '--use_8bit_adam', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--max_train_steps=1000']' returned non-zero exit status 1.
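
The `IsADirectoryError` points at `.ipynb_checkpoints`, a hidden folder that Jupyter creates inside the instance image directory and that then gets opened as if it were an image. A hedged sketch of a listing step that skips such entries; the function name and the suffix list are assumptions, not the training script's actual code.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually used.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def list_instance_images(data_dir: str) -> list:
    """Return image files directly inside data_dir, skipping folders and hidden entries."""
    return sorted(
        path
        for path in Path(data_dir).iterdir()
        if path.is_file()
        and not path.name.startswith(".")
        and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

Deleting the `.ipynb_checkpoints` folder from the data directory before launching training would avoid the same error without touching any code.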

TemporalLabsLLC-SOL commented on July 23, 2024

Very nice progress! Digging in more now

binarymind commented on July 23, 2024

@ShivamShrirao

In the Colab,

  --instance_prompt="photo of imv{CLASS_NAME}" \
  --class_prompt="photo of a {CLASS_NAME}" \

are not f-strings. They should be, right?

cheers

ShivamShrirao commented on July 23, 2024

@binarymind Not required here, because it executes as a shell command.
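
In a notebook, a line starting with `!` is handed to the shell after IPython has filled in `{name}` placeholders from the notebook's variables, so no `f` prefix is involved. A plain-Python imitation of those two steps, with `str.format` standing in for IPython's expansion and a made-up value for `CLASS_NAME`:

```python
import shlex

CLASS_NAME = "dog"  # hypothetical value set in an earlier notebook cell

# The two arguments from the notebook cell, written as an ordinary string.
template = '--instance_prompt="photo of imv{CLASS_NAME}" --class_prompt="photo of a {CLASS_NAME}"'

# Step 1: placeholder expansion (IPython does this for `!` lines).
expanded = template.format(CLASS_NAME=CLASS_NAME)

# Step 2: the shell splits the line into arguments and strips the quotes.
args = shlex.split(expanded)
print(args)
```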

binarymind commented on July 23, 2024

ok thanks !

during this cell I got the following result

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--num_cpu_threads_per_process` was set to `32` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/opt/conda/lib/python3.7/site-packages/accelerate/accelerator.py:179: UserWarning: `log_with=tensorboard` was passed but no supported trackers are currently installed.
  warnings.warn(f"`log_with={log_with}` was passed but no supported trackers are currently installed.")
Fetching 16 files: 100%|█████████████████████| 16/16 [00:00<00:00, 13678.94it/s]
Generating class images:   0%|                           | 0/25 [00:00<?, ?it/s]FATAL: this function is for sm80, but was built for sm750
FATAL: this function is for sm80, but was built for sm750

my nvidia-smi is the following

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.94       Driver Version: 470.94       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000    On   | 00000000:0F:00.0 Off |                  Off |
| 30%   27C    P8    26W / 300W |      1MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I also tried running

%pip install git+https://github.com/facebookresearch/xformers@1d31a3a#egg=xformers

a few cells above, as it was not working. Currently stuck there.

binarymind commented on July 23, 2024

Lol, I fixed my problem by removing the f-strings I added... sorry.

Edit: ah, no, that was not it. I launched the notebook again on a new repo and the problem appeared again. Looking into it.

TheChapster commented on July 23, 2024

I'm hoping for a (fingers crossed not too distant) future version of this that can run on requirements of a 3080. Will put it into reach of many more people including myself. Keep up the great work!!

JoeMcGuire commented on July 23, 2024

I'm not having any success. Trying to use V100 on colab.

Generating class images:   0% 0/50 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "train_dreambooth.py", line 606, in <module>
    main()
  File "train_dreambooth.py", line 362, in main
    images = pipeline(example["prompt"]).images
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 259, in __call__
    noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/models/unet_2d_condition.py", line 254, in forward
    encoder_hidden_states=encoder_hidden_states,
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/models/unet_blocks.py", line 565, in forward
    hidden_states = attn(hidden_states, context=encoder_hidden_states)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/models/attention.py", line 155, in forward
    hidden_states = block(hidden_states, context=context)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/models/attention.py", line 204, in forward
    hidden_states = self.attn1(self.norm1(hidden_states)) + hidden_states
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/diffusers/models/attention.py", line 288, in forward
    hidden_states = xformers.ops.memory_efficient_attention(query, key, value)
  File "/usr/local/lib/python3.7/dist-packages/xformers/ops.py", line 575, in memory_efficient_attention
    query=query, key=key, value=value, attn_bias=attn_bias, p=p
  File "/usr/local/lib/python3.7/dist-packages/xformers/ops.py", line 196, in forward_no_grad
    causal=isinstance(attn_bias, LowerTriangularMask),
  File "/usr/local/lib/python3.7/dist-packages/torch/_ops.py", line 143, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--use_auth_token', '--instance_data_dir=/content/data/sks', '--class_data_dir=/content/data/dog', '--output_dir=/content/models/sks', '--with_prior_preservation', '--instance_prompt=photo of sks dog', '--class_prompt=photo of a dog', '--resolution=512', '--use_8bit_adam', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--max_train_steps=600']' returned non-zero exit status 1

1blackbar commented on July 23, 2024

There are precompiled xformers for the P100 in this Colab. How can those be incorporated into DreamBooth? That would cover Colab Pro.
https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast_stable_diffusion_AUTOMATIC1111.ipynb#scrollTo=a---cT2rwUQj

See the cell under "installing xformers".
Also, how about an optional Google Drive cell to upload the trained model, plus a prune cell to get it down to 2 GB?
If any of you compile a whl for the P100, please download it and store it in Google Drive to share.

1blackbar commented on July 23, 2024

Yeah, right now it's kind of unusable on web UIs, and most people are on web UIs. Hugging Face love their bins. Also, the default of 600 steps is pretty bad; not sure why it's the default. It should be at least 2000.

Blucknote commented on July 23, 2024

Any chance of running this on a 12 GB RTX 3060?
I'm getting a "Tried to allocate 4.00 GiB (GPU 0; 12.00 GiB total capacity; 4.81 GiB already allocated; 890.00 MiB free; 8.81 GiB reserved in total by PyTorch)" error even with the --use_8bit_adam flag.

TemporalLabsLLC-SOL commented on July 23, 2024

Can we get a link to the json or description on that?

TemporalLabsLLC-SOL commented on July 23, 2024

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_cpu_threads_per_process` was set to `4` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Traceback (most recent call last):
  File "train_dreambooth.py", line 608, in <module>
    main()
  File "train_dreambooth.py", line 394, in main
    tokenizer = CLIPTokenizer.from_pretrained(
  File "c:\users\urban\anaconda3\envs\ldm\lib\site-packages\transformers\tokenization_utils_base.py", line 1764, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for '/CompVis/stable-diffusion-v1-4'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/CompVis/stable-diffusion-v1-4' is the correct path to a directory containing all relevant files for a CLIPTokenizer tokenizer.
Traceback (most recent call last):
  File "c:\users\urban\anaconda3\envs\ldm\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\urban\anaconda3\envs\ldm\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Urban\anaconda3\envs\ldm\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "c:\users\urban\anaconda3\envs\ldm\lib\site-packages\accelerate\commands\accelerate_cli.py", line 43, in main
    args.func(args)
  File "c:\users\urban\anaconda3\envs\ldm\lib\site-packages\accelerate\commands\launch.py", line 837, in launch_command
    simple_launcher(args)
  File "c:\users\urban\anaconda3\envs\ldm\lib\site-packages\accelerate\commands\launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

Solved with: pip install --upgrade transformers

TemporalLabsLLC-SOL commented on July 23, 2024

I've tried both local directories matching and making sure there are zero that match. So close. Would appreciate any help anybody has to offer.

konimaki2022 commented on July 23, 2024

Hello, I have trained on an RTX 2060 with a stable consumption of 10.8 GB of VRAM and at an amazing speed: between 5 and 10 minutes!

These are the details of my configuration:

torch and torchvision compiled with support for CUDA 11.6
accelerate configured to use --mixed_precision with bf16
reduced size of training images with --resolution=256
3-5 instance images, 12-20 class images, and 1000 training steps

I obtain very good results.
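
A rough illustration of why `--resolution=256` lowers memory so much. Stable Diffusion's autoencoder shrinks each image side by a factor of 8, so the UNet sees a 32x32 latent instead of 64x64. The count of attention scores below applies to plain attention at the UNet's highest resolution; with memory-efficient attention that matrix is not fully materialised, so treat the numbers as an upper bound.

```python
VAE_DOWNSCALE = 8  # each image side is reduced by this factor before the UNet


def latent_tokens(resolution: int) -> int:
    """Number of spatial positions in the latent for a square image."""
    side = resolution // VAE_DOWNSCALE
    return side * side


def attention_scores(resolution: int) -> int:
    """Entries in one self-attention score matrix over all latent positions."""
    tokens = latent_tokens(resolution)
    return tokens * tokens


for res in (512, 256):
    print(res, latent_tokens(res), attention_scores(res))

ratio = attention_scores(512) // attention_scores(256)
print("score matrix shrinks by a factor of", ratio)
```

Halving the resolution cuts the latent positions by 4 and the score matrix by 16, at the cost of training on smaller images than the model was originally trained on.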

guumaster commented on July 23, 2024

@konimaki2022 can you share your notebook?

konimaki2022 commented on July 23, 2024

@guumaster sorry I haven't created a notebook in Google Colab yet, I run it on my local computer with Ubuntu 20.04, no cloud.

TemporalLabsLLC-SOL commented on July 23, 2024

> @guumaster sorry I haven't created a notebook in Google Colab yet, I run it on my local computer with Ubuntu 20.04, no cloud.

I think Ubuntu is the key. Because we have to redirect the CUDA drivers to invoke Adam correctly on Windows, it has cost two straight days of work. Hopefully close.

TemporalLabsLLC-SOL commented on July 23, 2024

I've learned a lot, and I think a more stable and universal local solution for Windows is close.

TheChapster commented on July 23, 2024

> https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth
>
> New version now trains in 10 GB.

Awesome!! I assume this still won't work with a 10 GB GPU, because other apps are using part of it. If anyone knows of a way to get it working with that, such as utilising shared memory (not worrying about a decrease in performance), that would be fantastic!! If not, I look forward to future progress!

ShivamShrirao commented on July 23, 2024

@TheChapster It might work on Linux, where you can have no other application running on the GPU, or it might need just a few modifications. I don't have a 10 GB GPU to test it on, so I can't confirm.

hopibel commented on July 23, 2024

> https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth
>
> New version now trains in 10 GB.

Can we get a row or two in the table with all optimizations on except for use_8bit_adam? The bitsandbytes library relies on a C extension to wrap some CUDA functions, so it can't be used on AMD

ShivamShrirao commented on July 23, 2024

@hopibel Check the last row.

hopibel commented on July 23, 2024

Ah, missed it somehow. Dang, looks too close to 16GB to fit

AmericanPresidentJimmyCarter commented on July 23, 2024

With xformers and Triton in my fork at FP16, it trains with slightly less than 14 GB... I haven't pushed the branch, but it seems fine.

This is using stable-diffusion and EMA weights, not diffusers at all.

ShivamShrirao commented on July 23, 2024

Now you can convert diffusers weights to ckpt, thanks to https://gist.github.com/jachiam/8a5c0b607e38fcc585168b90c686eb05

I have updated it in my colab.
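
A heavily simplified sketch of the final packing step of such a conversion, assuming the single-file checkpoint is one dictionary under a `state_dict` key with a fixed prefix per component. The prefixes below are recalled from the original Stable Diffusion checkpoint layout and should be treated as assumptions. The per-layer key renaming that the linked gist performs is not reproduced here, so this alone does not yield a loadable ckpt.

```python
# Assumed prefixes used by the single-file checkpoint for each component.
PREFIXES = {
    "unet": "model.diffusion_model.",
    "vae": "first_stage_model.",
    "text_encoder": "cond_stage_model.transformer.",
}


def merge_components(components: dict) -> dict:
    """Combine per-component state dicts (already renamed) into one checkpoint dict."""
    merged = {}
    for name, state in components.items():
        prefix = PREFIXES[name]
        for key, tensor in state.items():
            merged[prefix + key] = tensor
    return {"state_dict": merged}


# Toy example with placeholder values instead of tensors.
checkpoint = merge_components({"unet": {"a.weight": 1}, "vae": {"b.weight": 2}})
print(sorted(checkpoint["state_dict"]))
```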

andreae293 commented on July 23, 2024

> With xformers and triton in this my fork at FP16 it trains with slightly less than 14 GB... I haven't pushed the branch but it seems fine.
>
> This is using stable-diffusion and EMA weights, not diffusers at all.

can you push it? thanks

Jarfeh commented on July 23, 2024

> With xformers and triton in this my fork at FP16 it trains with slightly less than 14 GB... I haven't pushed the branch but it seems fine.
>
> This is using stable-diffusion and EMA weights, not diffusers at all.

Like andreae293, I too would like to see you push this so it's available :)

titusfx commented on July 23, 2024

@Jarfeh I agree with @feffy380

https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth

Hbhatt-merexgenAI commented on July 23, 2024

I tried to run the Google Colab. I have an RTX 3060 12 GB, but it doesn't work.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 11.75 GiB total capacity; 8.06 GiB already allocated; 1.95 GiB free; 8.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 7772) of binary: /home/merexai-dev/miniconda3/envs/tf/bin/python
Traceback (most recent call last):
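
A small sketch for reading the numbers in these out-of-memory messages. It compares what PyTorch has reserved with what is actually allocated, which is the situation the `max_split_size_mb` hint refers to. The two strings are the RTX 3060 reports from this thread; the function names are hypothetical.

```python
import re

EARLIER_3060_REPORT = (
    "Tried to allocate 4.00 GiB (GPU 0; 12.00 GiB total capacity; 4.81 GiB already "
    "allocated; 890.00 MiB free; 8.81 GiB reserved in total by PyTorch)"
)
THIS_REPORT = (
    "Tried to allocate 2.25 GiB (GPU 0; 11.75 GiB total capacity; 8.06 GiB already "
    "allocated; 1.95 GiB free; 8.12 GiB reserved in total by PyTorch)"
)


def parse_oom(message: str) -> dict:
    """Extract the sizes from a CUDA out-of-memory message, all converted to GiB."""
    def grab(pattern):
        match = re.search(pattern, message)
        if match is None:
            return None
        value, unit = float(match.group(1)), match.group(2)
        return value / 1024 if unit == "MiB" else value

    return {
        "requested": grab(r"Tried to allocate ([\d.]+) (GiB|MiB)"),
        "total": grab(r"([\d.]+) (GiB|MiB) total capacity"),
        "allocated": grab(r"([\d.]+) (GiB|MiB) already allocated"),
        "free": grab(r"([\d.]+) (GiB|MiB) free"),
        "reserved": grab(r"([\d.]+) (GiB|MiB) reserved"),
    }


def cached_but_unused(info: dict) -> float:
    """GiB held by PyTorch's caching allocator but not backing live tensors."""
    return info["reserved"] - info["allocated"]


for label, text in (("earlier", EARLIER_3060_REPORT), ("this", THIS_REPORT)):
    info = parse_oom(text)
    print(label, f"requested={info['requested']:.2f}", f"gap={cached_but_unused(info):.2f}")
```

In the earlier report the gap between reserved and allocated memory is large, so setting `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:<value>` might be worth a try. In this one the gap is tiny and the request simply exceeds the free memory, so the run likely needs a smaller footprint rather than allocator tuning.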
