Very nice feature, thanks. It seems that only the first generated image works after lo

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

photoMaker issue with two or more generated images / SDXL sample steps,about leejet/stable-diffusion.cpp

Comments (12)

bssrdf commented on June 14, 2024 2

@bssrdf of course. I made an addon for Open Frameworks and do not use main.cpp at all (which complicates it a little): https://github.com/Jonathhhan/ofxStableDiffusion In this file happens most of the relevant stuff: https://github.com/Jonathhhan/ofxStableDiffusion/blob/main/ofxStableDiffusionExample/src/stableDiffusionThread.cpp

@Jonathhhan, I have reproduced the issue and implemented a fix. Please wait for the merged PR or you can try the branch. Thanks for reporting the bug.

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024 2

@bssrdf thanks (I can confirm that it works now).

from stable-diffusion.cpp.

Green-Sky commented on June 14, 2024 1

It looks like the cfg scale was too high for the first image.

from stable-diffusion.cpp.

bssrdf commented on June 14, 2024 1

@Jonathhhan, could you provide the full command line with SDXL and Photomaker model files? In particular, did you use the file from https://huggingface.co/bssrdf/PhotoMaker?

Here are what I can generate using Newton example images and your prompt with batch size 2.

bin/sd -m ../models/RealVisXL_V3.0.safetensors  --stacked-id-embd-dir ../models/photomaker-v1.safetensors --input-id-images-dir examples/newton_man -p "man img, man with futuristic clothes"  --cfg-scale 7 --sampling-method euler -H 1024 -W 1024  -b 2 -o newton_issu01.png
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
[INFO ] stable-diffusion.cpp:165  - loading model from '../models/RealVisXL_V3.0.safetensors'
[INFO ] model.cpp:705  - load ../models/RealVisXL_V3.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:188  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion weight type: f16
[WARN ] stable-diffusion.cpp:200  - !!!It looks like you are using SDXL model. If you find that the generated images are completely black, try specifying SDXL VAE FP16 Fix with the --vae parameter. You can find it here: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
[INFO ] model.cpp:705  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] lora.hpp:38   - loading LoRA from '../models/photomaker-v1.safetensors'
[INFO ] stable-diffusion.cpp:275  - loading stacked ID embedding (PHOTOMAKER) model file from '../models/photomaker-v1.safetensors'
[INFO ] model.cpp:705  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:400  - total params memory size = 7182.38MB (VRAM 7182.38MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419  - loading model from '../models/RealVisXL_V3.0.safetensors' completed, taking 88.15s
[INFO ] stable-diffusion.cpp:436  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'examples/newton_man/newton_3.jpg'
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.09s
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 548 ms
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 20 to 50 for PHOTOMAKER
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 157 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/2 - seed 42
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
  |==================================================| 50/50 - 1.84it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.58s
[INFO ] stable-diffusion.cpp:1732 - generating image: 2/2 - seed 43
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
  |==================================================| 50/50 - 1.79it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 27.52s
[INFO ] stable-diffusion.cpp:1777 - generating 2 latent images completed, taking 55.12s
[INFO ] stable-diffusion.cpp:1779 - decoding 2 latents
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.15s
[INFO ] stable-diffusion.cpp:1789 - latent 2 decoded, taking 1.17s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 2.31s
[INFO ] stable-diffusion.cpp:1810 - txt2img completed in 57.60s
save result image to 'newton_issu01.png'
save result image to 'newton_issu01_2.png'
double free or corruption (fasttop)
Aborted

They look fine.

from stable-diffusion.cpp.

bssrdf commented on June 14, 2024 1

@bssrdf batch processing works fine. The issue appears, if I run txt2img for a second time without reloading the sd_ctx. The console output looks exactly the same for both runs:

System Info:
    BLAS = 1
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
New BaseEngine 00000202288E6220
New GLFWEngine 00000202288E6220
[DEBUG] stable-diffusion.cpp:145  - Using CUDA backend
[notice ] EngineGLFW::setup(): Replaced the openFrameworks' GLFW event listeners by the imgui_impl_glfw ones. You will not have multi-window nor multi-context support. This can be enabled by defining OFXIMGUI_GLFW_FIX_MULTICONTEXT_PRIMARY_VP=1.
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:165  - loading model from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:705  - load data/models/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:176  - loading vae from 'data/models/vae/vae.safetensors'
[INFO ] model.cpp:705  - load data/models/vae/vae.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/vae/vae.safetensors'
[INFO ] stable-diffusion.cpp:188  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion weight type: f16
[DEBUG] stable-diffusion.cpp:195  - ggml tensor size = 432 bytes
[DEBUG] ggml_extend.hpp:884  - clip params backend buffer size =  1564.36 MB(VRAM) (713 tensors)
[DEBUG] ggml_extend.hpp:884  - unet params backend buffer size =  4900.07 MB(VRAM) (1680 tensors)
[DEBUG] ggml_extend.hpp:884  - vae params backend buffer size =  159.68 MB(VRAM) (248 tensors)
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] lora.hpp:38   - loading LoRA from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] ggml_extend.hpp:884  - lora params backend buffer size =  354.38 MB(VRAM) (10240 tensors)
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] lora.hpp:74   - finished loaded lora
[INFO ] stable-diffusion.cpp:275  - loading stacked ID embedding (PHOTOMAKER) model file from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] ggml_extend.hpp:884  - pmid params backend buffer size =  623.48 MB(VRAM) (407 tensors)
[DEBUG] stable-diffusion.cpp:296  - loading vocab
[DEBUG] clip.hpp:164  - vocab size: 49408
[DEBUG] clip.hpp:175  -  trigger word img already in vocab
[DEBUG] stable-diffusion.cpp:316  - loading weights
[DEBUG] model.cpp:1343 - loading tensors from data/models/sd_xl_base_1.0.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/vae/vae.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[INFO ] stable-diffusion.cpp:415  - total params memory size = 7247.59MB (VRAM 7247.59MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 159.68MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419  - loading model from 'data/models/sd_xl_base_1.0.safetensors' completed, taking 4.77s
[INFO ] stable-diffusion.cpp:436  - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:464  - finished loaded file
[DEBUG] upscaler.cpp:19   - Using CUDA backend
[INFO ] upscaler.cpp:32   - Upscaler weight type: f16
[INFO ] esrgan.hpp:164  - loading esrgan from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] ggml_extend.hpp:884  - esrgan params backend buffer size =   8.53 MB(VRAM) (192 tensors)
[INFO ] model.cpp:708  - load data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth using checkpoint format
[DEBUG] model.cpp:1221 - init from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] model.cpp:1343 - loading tensors from data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth
[INFO ] esrgan.hpp:183  - esrgan model loaded
[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.28s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 86 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 161 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 61 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 54 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 117 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2058
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.56s
[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.26s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 127 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 55 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 111 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2215
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.02s

Sorry, I mis-read your first message 😊
Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024

@Green-Sky yes, I used 7 and not the recommended 5.

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024

@bssrdf batch processing works fine. The issue appears, if I run txt2img for a second time without reloading the sd_ctx. The console output looks exactly the same for both runs:

System Info:
    BLAS = 1
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
New BaseEngine 00000202288E6220
New GLFWEngine 00000202288E6220
[DEBUG] stable-diffusion.cpp:145  - Using CUDA backend
[notice ] EngineGLFW::setup(): Replaced the openFrameworks' GLFW event listeners by the imgui_impl_glfw ones. You will not have multi-window nor multi-context support. This can be enabled by defining OFXIMGUI_GLFW_FIX_MULTICONTEXT_PRIMARY_VP=1.
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:165  - loading model from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:705  - load data/models/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:176  - loading vae from 'data/models/vae/vae.safetensors'
[INFO ] model.cpp:705  - load data/models/vae/vae.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/vae/vae.safetensors'
[INFO ] stable-diffusion.cpp:188  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion weight type: f16
[DEBUG] stable-diffusion.cpp:195  - ggml tensor size = 432 bytes
[DEBUG] ggml_extend.hpp:884  - clip params backend buffer size =  1564.36 MB(VRAM) (713 tensors)
[DEBUG] ggml_extend.hpp:884  - unet params backend buffer size =  4900.07 MB(VRAM) (1680 tensors)
[DEBUG] ggml_extend.hpp:884  - vae params backend buffer size =  159.68 MB(VRAM) (248 tensors)
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] lora.hpp:38   - loading LoRA from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] ggml_extend.hpp:884  - lora params backend buffer size =  354.38 MB(VRAM) (10240 tensors)
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[DEBUG] lora.hpp:74   - finished loaded lora
[INFO ] stable-diffusion.cpp:275  - loading stacked ID embedding (PHOTOMAKER) model file from 'data/models/photomaker/photomaker-v1.safetensors'
[INFO ] model.cpp:705  - load data/models/photomaker/photomaker-v1.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from 'data/models/photomaker/photomaker-v1.safetensors'
[DEBUG] ggml_extend.hpp:884  - pmid params backend buffer size =  623.48 MB(VRAM) (407 tensors)
[DEBUG] stable-diffusion.cpp:296  - loading vocab
[DEBUG] clip.hpp:164  - vocab size: 49408
[DEBUG] clip.hpp:175  -  trigger word img already in vocab
[DEBUG] stable-diffusion.cpp:316  - loading weights
[DEBUG] model.cpp:1343 - loading tensors from data/models/sd_xl_base_1.0.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/vae/vae.safetensors
[DEBUG] model.cpp:1343 - loading tensors from data/models/photomaker/photomaker-v1.safetensors
[INFO ] stable-diffusion.cpp:415  - total params memory size = 7247.59MB (VRAM 7247.59MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 159.68MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:419  - loading model from 'data/models/sd_xl_base_1.0.safetensors' completed, taking 4.77s
[INFO ] stable-diffusion.cpp:436  - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:464  - finished loaded file
[DEBUG] upscaler.cpp:19   - Using CUDA backend
[INFO ] upscaler.cpp:32   - Upscaler weight type: f16
[INFO ] esrgan.hpp:164  - loading esrgan from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] ggml_extend.hpp:884  - esrgan params backend buffer size =   8.53 MB(VRAM) (192 tensors)
[INFO ] model.cpp:708  - load data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth using checkpoint format
[DEBUG] model.cpp:1221 - init from 'data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth'
[DEBUG] model.cpp:1343 - loading tensors from data/models/esrgan/RealESRGAN_x4plus_anime_6B.pth
[INFO ] esrgan.hpp:183  - esrgan model loaded


[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.28s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 86 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 161 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 61 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 54 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 117 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2058
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 41.23s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.56s


[DEBUG] stable-diffusion.cpp:1551 - txt2img 1024x1024
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_0.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_1.jpg'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_2.png'
[INFO ] stable-diffusion.cpp:1572 - PhotoMaker loaded image from 'C:\Users\Jonat\Desktop\of_v20240306_vs_release\addons\ofxStableDiffusion\ofxStableDiffusionExample\bin\data/photomaker_images/newton_man\newton_3.jpg'
[DEBUG] stable-diffusion.cpp:1597 - prompt after extract and remove lora: "man img, man with futuristic clothes"
[INFO ] stable-diffusion.cpp:1602 - apply_loras completed, taking 0.00s
[DEBUG] ggml_extend.hpp:835  - lora compute buffer size: 20.50 MB(VRAM)
[INFO ] stable-diffusion.cpp:1608 - pmid_lora apply completed, taking 0.26s
[DEBUG] clip.hpp:1222 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[DEBUG] ggml_extend.hpp:835  - pmid compute buffer size: 40.31 MB(VRAM)
[INFO ] stable-diffusion.cpp:1672 - Photomaker ID Stacking, taking 127 ms
[DEBUG] clip.hpp:1328 - parse 'man img, man with futuristic clothes' to [['man img, man with futuristic clothes', 1], ]
[INFO ] stable-diffusion.cpp:1681 - sampling steps increases from 15 to 50 for PHOTOMAKER
[DEBUG] clip.hpp:1328 - parse 'man , man with futuristic clothes' to [['man , man with futuristic clothes', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 55 ms
[DEBUG] clip.hpp:1328 - parse '' to [['', 1], ]
[DEBUG] clip.hpp:1168 - token length: 77
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 2.56 MB(VRAM)
[DEBUG] ggml_extend.hpp:835  - clip compute buffer size: 8.58 MB(VRAM)
[DEBUG] stable-diffusion.cpp:673  - computing condition graph completed, taking 53 ms
[INFO ] stable-diffusion.cpp:1712 - get_learned_condition completed, taking 111 ms
[INFO ] stable-diffusion.cpp:1728 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1732 - generating image: 1/1 - seed 2215
[INFO ] stable-diffusion.cpp:1745 - PHOTOMAKER: start_merge_step: 10
[DEBUG] ggml_extend.hpp:835  - unet compute buffer size: 830.86 MB(VRAM)
  |==================================================| 50/50 - 1.28it/s
[INFO ] stable-diffusion.cpp:1769 - sampling completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1777 - generating 1 latent images completed, taking 40.68s
[INFO ] stable-diffusion.cpp:1779 - decoding 1 latents
[DEBUG] ggml_extend.hpp:835  - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1447 - computing vae [mode: DECODE] graph completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1789 - latent 1 decoded, taking 1.22s
[INFO ] stable-diffusion.cpp:1793 - decode_first_stage completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1812 - txt2img completed in 42.02s

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024

Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.

@bssrdf good point. Yes, it works without photomaker (if the path to the photomaker model is empty). It crashes, if the model is loaded and I leave "man (something) img, " away (which is a non related issue, but could be a nice way to trigger photomaker).

from stable-diffusion.cpp.

bssrdf commented on June 14, 2024

Can you try running more than one txt2img call but without photomaker? Just to isolate whether this is a photomaker specific issue.

@bssrdf good point. Yes, it works without photomaker (if the path to the photomaker model is empty). It crashes, if the model is loaded and I leave "man (something) img, " away (which is a non related issue, but could be a nice way to trigger photomaker).

@Jonathhhan, can you provide details about how to run 2 txt2img without reloading sd_ctx? Did you change the code in main.cpp?

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024

@bssrdf of course. I made an addon for Open Frameworks and do not use main.cpp at all (which complicates it a little): https://github.com/Jonathhhan/ofxStableDiffusion
In this file happens most of the relevant stuff: https://github.com/Jonathhhan/ofxStableDiffusion/blob/main/ofxStableDiffusionExample/src/stableDiffusionThread.cpp

from stable-diffusion.cpp.

fszontagh commented on June 14, 2024

@Jonathhhan did you set the "isFreeParamsImmediatly" to false?

from stable-diffusion.cpp.

Jonathhhan commented on June 14, 2024

did you set the "isFreeParamsImmediatly" to false?

@fszontagh Yes.

from stable-diffusion.cpp.

photoMaker issue with two or more generated images / SDXL sample steps about stable-diffusion.cpp HOT 12 CLOSED

Comments (12)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent