Comments (12)
@Jonathhhan, I have reproduced the issue and implemented a fix. Please wait for the PR to be merged, or try the branch in the meantime. Thanks for reporting the bug.
from stable-diffusion.cpp.
@bssrdf thanks (I can confirm that it works now).
It looks like the cfg scale was too high for the first image.
@Jonathhhan, could you provide the full command line with SDXL and Photomaker model files? In particular, did you use the file from https://huggingface.co/bssrdf/PhotoMaker?
Here is what I can generate using the Newton example images and your prompt with batch size 2.
bin/sd -m ../models/RealVisXL_V3.0.safetensors --stacked-id-embd-dir ../models/photomaker-v1.safetensors --input-id-images-dir examples/newton_man -p "man img, man with futuristic clothes" --cfg-scale 7 --sampling-method euler -H 1024 -W 1024 -b 2 -o newton_issu01.png
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
[INFO ] stable-diffusion.cpp:165 - loading model from '../models/RealVisXL_V3.0.safetensors'
[INFO ] model.cpp:705 - load ../models/RealVisXL_V3.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:188 - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194 - Stable Diffusion weight type: f16
[WARN ] stable-diffusion.cpp:200 - !!!It looks like you are using SDXL model. If you find that the generated images are completely black, try specifying SDXL VAE FP16 Fix with the --vae parameter. You can find it here: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
[INFO ] model.cpp:705 - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] lora.hpp:38 - loading LoRA from '../models/photomaker-v1.safetensors'
[INFO ] stable-diffusion.cpp:275 - loading stacked ID embedding (PHOTOMAKER) model file from '../models/photomaker-v1.safetensors'
[INFO ] model.cpp:705 - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:400 - total params memory size = 7182.38MB (VRAM 7182.38MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419 - loading model from '../models/RealVisXL_V3.0.safetensors' completed, taking 88.15s
[INFO ] stable-diffusion.cpp:436 - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_3.jpg'
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.09s
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 548 ms
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 20 to 50 for PHOTOMAKER
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 157 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/2 - seed 42
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
|==================================================| 50/50 - 1.84it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.58s
[INFO ] stable-diffusion.cpp:1732 - generating image: 2/2 - seed 43
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
|==================================================| 50/50 - 1.79it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.52s
[INFO ] stable-diffusion.cpp:1777 - generating 2 latent images completed, taking 55.12s
[INFO ] stable-diffusion.cpp:1779 - decoding 2 latents
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.15s
[INFO ] stable-diffusion.cpp:1789 - latent 2 decoded, taking 1.17s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 2.31s
[INFO ] stable-diffusion.cpp:1810 - txt2img completed in 57.60s
save result image to 'newton_issu01.png'
save result image to 'newton_issu01_2.png'
double free or corruption (fasttop)
Aborted
They look fine.
Sorry, I mis-read your first message 😊
Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.
@Green-Sky yes, I used 7 and not the recommended 5.
@bssrdf batch processing works fine. The issue appears if I run txt2img a second time without reloading the sd_ctx. The console output looks exactly the same for both runs:
System Info:
BLAS = 1
SSE3 = 1
AVX = 1
AVX2 = 1
AVX512 = 0
AVX512_VBMI = 0
AVX512_VNNI = 0
FMA = 1
NEON = 0
ARM_FMA = 0
F16C = 1
FP16_VA = 0
WASM_SIMD = 0
VSX = 0
New BaseEngine 00000202288E6220
New GLFWEngine 00000202288E6220
[DEBUG] stable-diffusion.cpp:145 - Using CUDA backend
[notice ] EngineGLFW::setup(): Replaced the openFrameworks' GLFW event listeners by the imgui_impl_glfw ones. You will not have multi-window nor multi-context support. This can be enabled by defining OFXIMGUI_GLFW_FIX_MULTICONTEXT_PRIMARY_VP=1.
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:165 - loading model from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:705 - load data/models/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:771 - init from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:176 - loading vae from 'data/models/vae/vae.safetensors'
[INFO ] model.cpp:705 - load data/models/vae/vae.safetensors using safetensors format
[DEBUG] model.cpp:771 - init from 'data/models/vae/vae.safetensors'
[INFO ] stable-diffusion.cpp:188 - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194 - Stable Diffusion weight type: f16
[DEBUG] stable-diffusion.cpp:195 - ggml tensor size = 432 bytes
[DEBUG] ggml_extend.hpp:884 - clip params backend buffer size = 1564.36 MB(VRAM) (713 tensors)
[DEBUG] ggml_extend.hpp:884 - unet params backend buffer size = 4900.07 MB(VRAM) (1680 tensors)
[DEBUG] ggml_extend.hpp:884 - vae params backend buffer size = 159.68 MB(VRAM) (248 tensors)
[INFO ] model.cpp:705 - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771 - init from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] lora.hpp:38 - loading LoRA from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] ggml_extend.hpp:884 - lora params backend buffer size = 354.38 MB(VRAM) (10240 tensors)
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] lora.hpp:74 - finished loaded lora
[INFO ] stable-diffusion.cpp:275 - loading stacked ID embedding (PHOTOMAKER) model file from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] model.cpp:705 - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771 - init from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] ggml_extend.hpp:884 - pmid params backend buffer size = 623.48 MB(VRAM) (407 tensors)
[DEBUG] stable-diffusion.cpp:296 - loading vocab
[DEBUG] clip.hpp:164 - vocab size: 49408
[DEBUG] clip.hpp:175 - trigger word img already in vocab
[DEBUG] stable-diffusion.cpp:316 - loading weights
[DEBUG] model.cpp:1343 - loading tensors from data/models/sd_xl_base_1.0.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/vae/vae.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[INFO ] stable-diffusion.cpp:415 - total params memory size = 7247.59MB (VRAM 7247.59MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 159.68MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419 - loading model from 'data/models/sd_xl_base_1.0.safetensors' completed, taking 4.77s
[INFO ] stable-diffusion.cpp:436 - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:464 - finished loaded file
[DEBUG] upscaler.cpp:19 - Using CUDA backend
[INFO ] upscaler.cpp:32 - Upscaler weight type: f16
[INFO ] esrgan.hpp:164 - loading esrgan from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] ggml_extend.hpp:884 - esrgan params backend buffer size = 8.53 MB(VRAM) (192 tensors)
[INFO ] model.cpp:708 - load data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth using checkpoint format
[DEBUG] model.cpp:1221 - init from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] model.cpp:1343 - loading tensors from data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth
[INFO ] esrgan.hpp:183 - esrgan model loaded
[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835 - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.28s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 86 ms
[DEBUG] ggml_extend.hpp:835 - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 161 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 61 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 54 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 117 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2058
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835 - unet compute buffer size: 830.86 MB(VRAM)
|==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835 - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.56s
[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835 - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.26s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 53 ms
[DEBUG] ggml_extend.hpp:835 - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 127 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 55 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835 - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673 - computing condition graph completed, taking 53 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 111 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2215
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835 - unet compute buffer size: 830.86 MB(VRAM)
|==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835 - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.02s
Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.
@bssrdf good point. Yes, it works without PhotoMaker (when the path to the PhotoMaker model is empty). It crashes if the model is loaded and I leave "man (something) img, " out of the prompt (an unrelated issue, although the phrase could be a nice way to trigger PhotoMaker).
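The log above shows `pmid_lora apply completed` on both txt2img calls, and the problem only appears with PhotoMaker loaded. The thread does not state the root cause, so the following is only a sketch of one general pattern that can make a second call on a reused context behave differently: an additive weight delta that is merged again on every call. `MockModel`, `apply_delta_every_call`, and `apply_delta_once` are hypothetical names for illustration, not part of stable-diffusion.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reusable context: base weights plus an additive, LoRA-style delta.
// This is an illustration only, not the stable-diffusion.cpp implementation.
struct MockModel {
    std::vector<float> weights;
    std::vector<float> delta;
    bool delta_applied = false;

    MockModel(std::vector<float> w, std::vector<float> d)
        : weights(std::move(w)), delta(std::move(d)) {}

    // Naive version: merges the delta on every call, so the weights of a
    // reused context drift further from the intended values each time.
    void apply_delta_every_call() {
        for (std::size_t i = 0; i < weights.size(); ++i) {
            weights[i] += delta[i];
        }
    }

    // Guarded version: merges the delta at most once per context.
    void apply_delta_once() {
        if (delta_applied) {
            return;
        }
        for (std::size_t i = 0; i < weights.size(); ++i) {
            weights[i] += delta[i];
        }
        delta_applied = true;
    }
};
```

Calling `apply_delta_every_call()` twice on the same object adds the delta twice, while `apply_delta_once()` leaves the weights unchanged on the second call. Whether this is what happened here is an assumption.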
@Jonathhhan, can you provide details on how you run two txt2img calls without reloading sd_ctx? Did you change the code in main.cpp?
@bssrdf of course. I made an addon for openFrameworks and do not use main.cpp at all (which complicates things a little): https://github.com/Jonathhhan/ofxStableDiffusion
Most of the relevant code is in this file: https://github.com/Jonathhhan/ofxStableDiffusion/blob/main/ofxStableDiffusionExample/src/stableDiffusionThread.cpp
@Jonathhhan did you set the "isFreeParamsImmediatly" to false?
did you set the "isFreeParamsImmediatly" to false?
@fszontagh Yes.
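For context on the question above: a flag with this name suggests that the model parameters are released right after a generation, which would leave a reused context without weights for a second call. A minimal sketch of that behaviour, assuming a hypothetical `MockContext` type (not the stable-diffusion.cpp or ofxStableDiffusion API):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a model context; names and behaviour are
// assumptions for illustration only.
struct MockContext {
    bool free_params_immediately;
    std::unique_ptr<std::vector<float>> weights;

    explicit MockContext(bool free_immediately)
        : free_params_immediately(free_immediately),
          weights(new std::vector<float>(4, 1.0f)) {}

    // Returns true if a generation could run, false if the weights were
    // already released by an earlier call.
    bool generate() {
        if (!weights) {
            return false;
        }
        // ... the weights would be used here ...
        if (free_params_immediately) {
            weights.reset();  // release parameters right after use
        }
        return true;
    }
};
```

With the flag set to true, the first `generate()` succeeds and the second reports failure; with the flag set to false, the same context can be used for repeated calls, which is the setup described in this thread.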