hkproj / pytorch-stable-diffusion

Stable Diffusion implemented from scratch in PyTorch
Home Page: https://www.youtube.com/watch?v=ZBKpAp_6TGI
License: MIT License
Can you also write a simple script for training on a dataset of images? @hkproj
Hello. Thank you so much for sharing your video and code. I found yours the easiest to follow.
Do you have any plans to write training code? Maybe fine-tuning the pretrained model on some domain-specific data.
Regards
In attention.py, line 39, causal_mask is misspelled as casual_mask.
Will you post a tutorial on how to train/fine-tune the model?
I'm a newbie. Can you make a video on how to run this project?
Thanks
Hi, thank you so much for your work and sharing.
I want to retrain the UNet from scratch. May I ask what the UNet's target output should be under this DDPM scheduler: the random noise itself, or the noise scaled for the specific timestep? In other code I have seen, the random noise is used as the model's target, but my model only works when I use the latter.
Highly appreciate your help!
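For what it's worth, in standard DDPM training the two views coincide: the target is the random noise eps that was sampled, and the timestep only controls how strongly that noise is mixed into the noisy latent the UNet sees. Below is a minimal sketch of one training step under that assumption; the `unet(x_t, context, t)` call signature and the `alphas_cumprod` buffer are illustrative, not this repo's exact API.

```python
import torch
import torch.nn.functional as F

def ddpm_loss(unet, x0, context, alphas_cumprod):
    # Sample one timestep and one Gaussian noise tensor per example.
    b = x0.size(0)
    t = torch.randint(0, alphas_cumprod.size(0), (b,))
    eps = torch.randn_like(x0)
    # Forward-diffuse x0 to x_t: sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps.
    abar = alphas_cumprod[t].view(b, 1, 1, 1)
    x_t = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps
    # The regression target is the very eps that was sampled; the scheduler
    # only decides how much of it appears in x_t at timestep t.
    eps_pred = unet(x_t, context, t)
    return F.mse_loss(eps_pred, eps)
```

So "random noise as the output" and "noise at the specific timestep" describe the same quantity; the timestep scaling lives in the construction of x_t, not in the target.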
How can I use the code on multiple GPUs using torch.nn.DataParallel? It runs out of memory on a single GPU, and the kernel dies when inferring on CPU.
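A minimal sketch of wrapping a module in `torch.nn.DataParallel`, assuming the module to parallelize is the UNet (the `nn.Linear` below is just a stand-in). Note that `DataParallel` replicates the model on every GPU, so it helps with batch throughput but not with a model that doesn't fit on one device; lower resolution or half precision may be needed for the memory issue itself.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for e.g. the UNet module
if torch.cuda.device_count() > 1:
    # Replicates the module on each visible GPU and splits the
    # batch dimension of the input across them.
    model = nn.DataParallel(model)
    model.to("cuda")

x = torch.randn(4, 8)
y = model(x)
```

One caveat: saving a `DataParallel`-wrapped model prefixes every state_dict key with `module.`, so strip that prefix (or save `model.module.state_dict()`) before loading into an unwrapped model.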
Hi!
Can you upload the PDF slides?
I watched all your videos and followed along; it took about 5 days, it was very fun, and I appreciate you!
Now I wonder how to train this model.
I also watched another of your videos, "How diffusion models work - explanation and code!".
That is also a very useful and great video, thank you again!!
That video was about how to train the UNet (diffusion model) for latent denoising.
But we have four major models here:
the VAE encoder, the VAE decoder, the UNet, and CLIP.
If we want to train the UNet (diffusion model) as in the diffusion-training video,
do we freeze the other models and train only the UNet?
However, I don't fully understand how the training setup is defined.
For example, if we want to create image B in a specific style from image A (image A -> styled image B),
where should I feed image A (or random noise) as the input and styled image B as the target, respectively?
Inference looks like this, but I don't know how to set it up in the training phase:
A (or random) -> VAE-encode -> [ z, clip-emb, time-emb -> unet -> z ] * loop -> VAE-decode -> B
It is also unclear whether the CLIP embedding should be left blank, random, or a specific text prompt,
or whether I should feed image A into the CLIP embedding.
I have searched YouTube for how people train Stable Diffusion models, and most videos use DreamBooth.
That again looks very high-level, like Hugging Face.
I would like to know the exact concepts and what happens under the hood.
Thanks to your video and code I could understand the Stable Diffusion DDPM model, but I want to expand into the training side.
Thank you for the amazing work!
Happy new year!
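On the freezing question: yes, the usual Stable Diffusion training/fine-tuning recipe keeps the VAE and CLIP fixed and optimizes only the UNet. A minimal sketch, with `nn.Linear` stand-ins for the four components (the real modules have different constructors):

```python
import torch.nn as nn

def freeze(module: nn.Module) -> None:
    # Disable gradients and switch to eval mode so the module acts
    # as a fixed feature extractor during UNet training.
    for p in module.parameters():
        p.requires_grad = False
    module.eval()

# Illustrative stand-ins for the four components.
vae_encoder, vae_decoder, clip, unet = (nn.Linear(4, 4) for _ in range(4))
for frozen in (vae_encoder, vae_decoder, clip):
    freeze(frozen)

# Only the UNet's parameters go to the optimizer.
trainable = [p for p in unet.parameters() if p.requires_grad]
```

For the style question: in the standard text-to-image objective, the "input" is a noised VAE latent of the target image (styled B) and the conditioning is the CLIP embedding of its text caption; image A only enters at inference, via img2img-style initialization. Conditioning directly on image A would require an image-conditioned variant, which is a different setup.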
The prompt is a sentence, and we don't need to predict the next token in the prompt, so is there a reason to prevent each token from seeing the tokens to its right?
x = self.attention(x, causal_mask=True)
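For context: CLIP's text transformer was pretrained with causal (left-to-right) masking, so inference must apply the same mask to reproduce the embeddings the pretrained weights expect, even though no next-token prediction happens here. A sketch of what `causal_mask=True` typically does inside attention (function and shapes are illustrative):

```python
import torch

def apply_causal_mask(scores: torch.Tensor) -> torch.Tensor:
    # scores: (batch, heads, seq_len, seq_len) attention logits.
    # Mask the strictly upper-triangular positions (future tokens)
    # with -inf so softmax gives them zero attention weight.
    seq_len = scores.size(-1)
    mask = torch.ones(seq_len, seq_len, dtype=torch.bool).triu(diagonal=1)
    return scores.masked_fill(mask, float("-inf"))

scores = torch.zeros(1, 1, 4, 4)
weights = apply_causal_mask(scores).softmax(dim=-1)
```

So the mask is about matching the pretraining setup, not about hiding anything the model "shouldn't" know at inference.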
In fact, I faced this problem when I ran the demo; it seems the converted keys cannot be found. What should I do?
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VAE_Encoder:
Unexpected key(s) in state_dict: "1.groupnorm_1.weight", "1.groupnorm_1.bias", "1.conv_1.weight", "1.conv_1.bias", "1.groupnorm_2.weight", "1.groupnorm_2.bias", "1.conv_2.weight", "1.conv_2.bias", "2.groupnorm_1.weight", "2.groupnorm_1.bias", "2.conv_1.weight", "2.conv_1.bias", "2.groupnorm_2.weight", "2.groupnorm_2.bias", "2.conv_2.weight", "2.conv_2.bias", "4.groupnorm_1.weight", "4.groupnorm_1.bias", "4.conv_1.weight", "4.conv_1.bias", "4.groupnorm_2.weight", "4.groupnorm_2.bias", "4.conv_2.weight", "4.conv_2.bias", "5.groupnorm_1.weight", "5.groupnorm_1.bias", "5.conv_1.weight", "5.conv_1.bias", "5.groupnorm_2.weight", "5.groupnorm_2.bias", "5.conv_2.weight", "5.conv_2.bias", "7.groupnorm_1.weight", "7.groupnorm_1.bias", "7.conv_1.weight", "7.conv_1.bias", "7.groupnorm_2.weight", "7.groupnorm_2.bias", "7.conv_2.weight", "7.conv_2.bias", "8.groupnorm_1.weight", "8.groupnorm_1.bias", "8.conv_1.weight", "8.conv_1.bias", "8.groupnorm_2.weight", "8.groupnorm_2.bias", "8.conv_2.weight", "8.conv_2.bias", "10.groupnorm_1.weight", "10.groupnorm_1.bias", "10.conv_1.weight", "10.conv_1.bias", "10.groupnorm_2.weight", "10.groupnorm_2.bias", "10.conv_2.weight", "10.conv_2.bias", "11.groupnorm_1.weight", "11.groupnorm_1.bias", "11.conv_1.weight", "11.conv_1.bias", "11.groupnorm_2.weight", "11.groupnorm_2.bias", "11.conv_2.weight", "11.conv_2.bias", "12.groupnorm_1.weight", "12.groupnorm_1.bias", "12.conv_1.weight", "12.conv_1.bias", "12.groupnorm_2.weight", "12.groupnorm_2.bias", "12.conv_2.weight", "12.conv_2.bias", "14.groupnorm_1.weight", "14.groupnorm_1.bias", "14.conv_1.weight", "14.conv_1.bias", "14.groupnorm_2.weight", "14.groupnorm_2.bias", "14.conv_2.weight", "14.conv_2.bias".
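An "Unexpected key(s) in state_dict" error usually means the checkpoint's key names were produced for a different module layout than the model being loaded, e.g. raw Stable Diffusion weights that were not passed through the repo's converter script, or keys carrying a `module.` prefix from DataParallel. A small diagnostic sketch, assuming you want to see both sides of the mismatch before deciding how to remap:

```python
import torch

def report_key_mismatch(model, state_dict):
    # Compare the checkpoint's keys with the model's own keys instead
    # of letting load_state_dict raise, so the naming difference
    # between the two layouts becomes visible.
    model_keys = set(model.state_dict())
    ckpt_keys = set(state_dict)
    missing = sorted(model_keys - ckpt_keys)      # model wants, ckpt lacks
    unexpected = sorted(ckpt_keys - model_keys)   # ckpt has, model lacks
    return missing, unexpected

# Toy example: checkpoint keys use a different layer index than the model.
model = torch.nn.Sequential(torch.nn.Linear(2, 2))
ckpt = {"1.weight": torch.zeros(2, 2)}
missing, unexpected = report_key_mismatch(model, ckpt)
```

Once the pattern is clear (an offset index, a prefix, a renamed submodule), the keys can be renamed into a new dict before calling `load_state_dict`.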
Hey Umar, I'm really grateful for your YouTube video and GitHub repo for this Stable Diffusion model. If it's not too much to ask, is there any way I can talk to you on Discord about some errors I'm getting?
Thanks a lot for your YouTube channel and the code!!! But I have a question about in-painting:
you said there would be a combination in latent space, but actually there isn't.
In https://github.com/CompVis/latent-diffusion, they just show a pre-trained model for in-painting... and I have checked the code: there is no combining in latent space. It seems they just train a model specifically for in-painting and that's it.
Any response will be appreciated!!!!
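Both approaches exist: latent-diffusion ships a model trained specifically for inpainting, while the "combination in latent space" refers to a training-free trick (used e.g. in RePaint-style sampling) where, at each denoising step, the known region of the latent is overwritten with a re-noised copy of the original latent. A minimal sketch of that blend, with illustrative names:

```python
import torch

def blend_known_region(x_t, x0_latent, mask, abar_t):
    # mask: 1 where the region should be regenerated,
    #       0 where the original image must be kept.
    # Re-noise the original latent to the current noise level so it is
    # statistically consistent with the partially denoised x_t.
    eps = torch.randn_like(x0_latent)
    known_t = abar_t.sqrt() * x0_latent + (1.0 - abar_t).sqrt() * eps
    return mask * x_t + (1.0 - mask) * known_t
```

With this blend applied after every sampler step, even a plain text-to-image checkpoint can inpaint, at some quality cost compared to a model trained for inpainting.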