
yuanzhi-zhu / diffpir


"Denoising Diffusion Models for Plug-and-Play Image Restoration", Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool.

Home Page: https://yuanzhi-zhu.github.io/DiffPIR/

License: MIT License

Python 99.89% Shell 0.11%

diffpir's People

Contributors

yuanzhi-zhu


diffpir's Issues

How to run the image restoration code?

I am confused: when I run the main_ddpir_sisr.py script, the generated results are noisy images. How should the restoration code be run?

Advice on inference for different resolutions?

Good day,

Thank you very much for your work. Could you provide advice on applying the models to resolutions other than 256x256, for example to larger images? Are we limited by the pretrained diffusion models?

Thanks in advance!

How about unknown complex degradation type?

Hi, thanks for the great work first :)

There are three types of degradation mentioned in the paper, each with an explicit expression as given in Eq. 25, 27, and 29. In the real world, the degradation might be complex and unknown; I wonder how to deal with such degradation. Could you please give some insights? Thanks in advance!

How to calculate FID?

Thanks for your contribution, it is wonderful. Your article reports the average PSNR (dB), FID, and LPIPS of different methods on Gaussian deblurring, motion deblurring, and 4× SR, but I can't find the FID calculation in your code. I would appreciate it if you could point me to it.
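
Assuming, as the question suggests, that FID is not computed in this repository, one common way to compute it between two folders of images (for example, ground-truth images and restored outputs) is the pytorch-fid package; the paths below are placeholders:

pip install pytorch-fid
python -m pytorch_fid /path/to/ground_truth_images /path/to/restored_images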

some strange things when running main_ddpir_inpainting.py

I'm sorry to trouble you; I'm a student and have some questions to ask. How long does it take to run main_ddpir_inpainting.py? I wonder why, when I follow the instructions to run this program, the terminal shows that there are no running processes, yet there are also no errors during debugging.

Hi, I tried to run inference on my own photo and it showed this.

E:\DiffPIR>python main_ddpir_sisr.py
LogHandlers setup!
23-07-30 10:47:38.869 : model_name:diffusion_ffhq_10m, sr_mode:blur, image sigma:0.050, model sigma:0.050
23-07-30 10:47:38.871 : eta:0.000, zeta:0.100, lambda:1.000, guidance_scale:1.00
23-07-30 10:47:38.871 : start step:999, skip_type:quad, skip interval:10, skipstep analytic steps:0
23-07-30 10:47:38.871 : analytic iter num:1, gamma:0.01
23-07-30 10:47:38.871 : Model path: model_zoo\diffusion_ffhq_10m.pt
23-07-30 10:47:38.871 : C:\Users\weber\Desktop\my_lq
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
C:\Users\weber\AppData\Local\Programs\Python\Python310\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
C:\Users\weber\AppData\Local\Programs\Python\Python310\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: C:\Users\weber\AppData\Local\Programs\Python\Python310\lib\site-packages\lpips\weights\v0.1\vgg.pth
23-07-30 10:47:39.953 : --------- sf:4 --k: 0 ---------
23-07-30 10:47:39.955 : eta:0.000, zeta:0.250, lambda:2.000, inIter:1.000, gamma:0.010, guidance_scale:1.00
Traceback (most recent call last):
File "E:\DiffPIR\main_ddpir_sisr.py", line 502, in
main()
File "E:\DiffPIR\main_ddpir_sisr.py", line 485, in main
test_results_ave = test_rho(lambda_, zeta=zeta_i, model_output_type=model_output_type)
File "E:\DiffPIR\main_ddpir_sisr.py", line 298, in test_rho
x0 = utils_model.model_fn(x, noise_level=curr_sigma*255, model_out_type=model_out_type,
File "E:\DiffPIR\utils\utils_model.py", line 221, in model_fn
out = diffusion.p_sample(
File "E:\DiffPIR\guided_diffusion\gaussian_diffusion.py", line 422, in p_sample
out = self.p_mean_variance(
File "E:\DiffPIR\guided_diffusion\respace.py", line 91, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "E:\DiffPIR\guided_diffusion\gaussian_diffusion.py", line 260, in p_mean_variance
model_output = model(x, self._scale_timesteps(t), **model_kwargs)
File "E:\DiffPIR\guided_diffusion\respace.py", line 128, in call
return self.model(x, new_ts, **kwargs)
File "C:\Users\weber\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "E:\DiffPIR\guided_diffusion\unet.py", line 660, in forward
h = th.cat([h, hs.pop()], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 24 but got size 25 for tensor number 1 in the list.
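
This size mismatch inside the UNet skip connections usually means the input image's height or width is not divisible by the network's total downsampling factor, so the upsampled feature maps no longer line up with the stored skip features. A minimal workaround sketch, assuming padding to a multiple of 64 is enough for this UNet (the factor and the helper function are assumptions, not part of the repo):

import torch.nn.functional as F

def pad_to_multiple(x, multiple=64):
    # x: (B, C, H, W) tensor; pad H and W up to the next multiple of `multiple`
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    x = F.pad(x, (0, pad_w, 0, pad_h), mode='reflect')
    return x, (h, w)

# run the sampler on the padded tensor, then crop the result back:
# out = out[..., :h, :w]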

Questions on switching other models with your sampling method

Dear Authors,
Thank you so much for your great work. As far as I can tell, you only tried two types of pretrained diffusion models in your work. Could you please shed some light on how to use other pretrained diffusion models (SD-v2.1, etc.) with your sampling method?
Thank you again!

Batch-wise inference (multiple images in parallel) support

Hi authors, thank you for this amazing work!
I notice that in main_ddpir_inpainting.py there is a for loop that processes one image at a time. If we have a lot of images to process, this may be relatively slow. I wonder if it's possible to process multiple images as a mini-batch in parallel? Thank you!
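
The diffusion UNet itself accepts batched tensors, so in principle the per-image loop could be replaced by stacking preprocessed images into mini-batches; a minimal sketch, assuming all images share the same size and degradation (the helper below is illustrative, not from the repo):

import torch

def iter_batches(imgs, batch_size=8):
    # imgs: list of (C, H, W) tensors already preprocessed to the same shape
    for i in range(0, len(imgs), batch_size):
        yield torch.stack(imgs[i:i + batch_size], dim=0)  # (B, C, H, W)

# each batch can then be passed through the sampler in one forward pass,
# provided the data-consistency step is also written to operate batch-wise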

FFHQ 256×256 and ImageNet 256×256 validation dataset

Dear Authors,

Thanks for your great work and for sharing your code!

I am currently working on reproducing the results, and I wanted to ask how you sampled the 100 images for evaluation (if possible, which specific samples were used) and how they were pre-processed to obtain the final 256x256 HR images for each dataset.

Thank you.

from motionblur.motionblur import Kernel

Hello, when I run this project in PyCharm, the line from motionblur.motionblur import Kernel in utils/utils_deblur.py raises an error saying the module does not exist, and I could not find the module on any of the major package mirrors. Could you provide a download link for it, or let me know whether I am doing something wrong? Thank you.

A few questions

Congrats on the fantastic work. I was looking into your code to understand it and had a few questions.

  1. In some of the steps I noticed that the data is divided by 2 and then 0.5 is added. What is the reason for that? (See the sketch after this list.)
  2. I was wondering exactly in which step the data consistency is performed. Is it in the sr.data_solution function? Also, how is it different from the model_fn function in utils_model?
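
For context on the first question: dividing by 2 and adding 0.5 is the usual conversion from the diffusion model's [-1, 1] value range to the [0, 1] image range (a general observation about the quoted operation, not a confirmed statement of the authors' intent):

# diffusion models here work in [-1, 1]; images are stored in [0, 1]
x_img = x_model / 2 + 0.5      # [-1, 1] -> [0, 1]
x_model = 2 * x_img - 1        # [0, 1]  -> [-1, 1], the inverse mapping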

Bad results on SISR

Thank you very much for the work. The generated SR images don't match the LR images when I run main_ddpir_sisr.py. Is there anything else I should be aware of? Can you help me?
[attached result image: 00000_x4_k2_LEH]

Comparisons with DDRM - Operators

Hello,

That is a very interesting work, and a very tidy code, well done!

I would like to ask what changes need to be made in the DDRM code so that DiffPIR's operators and DDRM's operators match properly. What I find is that if I create a noisy degraded image with the DiffPIR code (say, a Gaussian blur of size 9x9 and standard deviation 3) and a noisy image with the DDRM code (the same operator, but with their zero-padded strategy), DDRM's noisy image is the more challenging one.

I am trying to compare the two algorithms as fairly as possible, so I am not sure whether the operators are the same. Could you help me with that, please? How did you make the comparisons in the paper in this respect?

Some questions about the implemented equations in the codes

Thanks for sharing the great work!
I can run the code successfully, but I'm confused by some of the implementations in main_ddpir_deblur.py and respectfully request your help.


noise_model_t = utils_model.find_nearest(reduced_alpha_cumprod, 2 * noise_level_model)
noise_model_t = 0
noise_inti_img = 50 / 255
t_start = utils_model.find_nearest(reduced_alpha_cumprod, 2 * noise_inti_img) # start timestep of the diffusion process
t_start = num_train_timesteps - 1

In this part, the variables noise_model_t and t_start are overridden. The initial settings seem to determine the start point of the reverse diffusion (instead of starting directly from random noise), so why override them? (I tried the original settings and the performance got worse.)


t_im1 = utils_model.find_nearest(reduced_alpha_cumprod,sigmas[seq[i+1]].cpu().numpy())
# calculate \hat{\epsilon}
eps = (x - sqrt_alphas_cumprod[t_i] * x0) / sqrt_1m_alphas_cumprod[t_i]
eta_sigma = eta * sqrt_1m_alphas_cumprod[t_im1] / sqrt_1m_alphas_cumprod[t_i] * torch.sqrt(betas[t_i])
x = sqrt_alphas_cumprod[t_im1] * x0 + np.sqrt(1-zeta) * (torch.sqrt(sqrt_1m_alphas_cumprod[t_im1]**2 - eta_sigma**2) * eps \
+ eta_sigma * torch.randn_like(x)) + np.sqrt(zeta) * sqrt_1m_alphas_cumprod[t_im1] * torch.randn_like(x)

These lines correspond to Eq. 14-15 in the paper, i.e.
[image: Eq. 14 and Eq. 15 from the paper]

  1. I couldn't figure out which equation eta_sigma on line 345 corresponds to in the paper. It looks like $\sigma_t$ in Eq. 12 (the DDPM case) of the DDIM paper, but it is not quite the same. (See the note after this list.)

  2. In lines 346-347, the equation seems to be a combination of Eq. 14 and Eq. 15, i.e., it contains both $\sigma_{\eta_t}$ and $\zeta$. I know $\eta$ is set to 0 in this code (thus $\sigma_{\eta_t}=0$, so the line corresponds to Eq. 15), but I'm confused about the equation's original meaning.
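
One way to read the quoted eta_sigma (an observation about the code above, not the authors' explanation): for consecutive timesteps, $1 - \bar\alpha_t/\bar\alpha_{t-1} = 1 - \alpha_t = \beta_t$, so the line is algebraically the DDIM noise scale

$$
\sigma_{\eta_t} = \eta \sqrt{\frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}} \sqrt{1-\frac{\bar\alpha_t}{\bar\alpha_{t-1}}} = \eta \, \frac{\sqrt{1-\bar\alpha_{t-1}}}{\sqrt{1-\bar\alpha_t}} \sqrt{\beta_t},
$$

which matches eta * sqrt_1m_alphas_cumprod[t_im1] / sqrt_1m_alphas_cumprod[t_i] * torch.sqrt(betas[t_i]). When timesteps are skipped, $\beta_{t_i}$ only approximates $1 - \bar\alpha_{t_i}/\bar\alpha_{t_{i-1}}$ (with $t_{i-1}$ the next sampled timestep), which may be why it does not look identical to the DDIM expression.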

dehazing

Hello, have you tried dehazing? How did it work?

Can this method achieve realistic face-mask inpainting?

Hello, how are you, Mr. Author! Thank you for being active and answering all the questions!

DiffPIR is very good at face depixelization (also referred to as restoration), and it is also really good at simple inpainting tasks with animal pictures.
I was wondering whether it can be trained, or solely fine-tuned, to do face-mask inpainting like RePaint (https://github.com/andreas128/RePaint), and whether it can get better results than RePaint after such training or fine-tuning.
And if so, how can I do it?

For reference, so it can do tasks like this and replace the mask with a very accurate face:
[attached example image]

How to determine the value of σn in Algorithm 1

In the code, σn is generated by sigmas.append(reduced_alpha_cumprod[num_train_timesteps-1-i]), but I can't find the relevant description in the paper. Can you tell me the reason? Thank you very much.
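
For reference, the main scripts appear to define reduced_alpha_cumprod as the ratio $\sqrt{1-\bar\alpha_t}/\sqrt{\bar\alpha_t}$ (an assumption about the codebase, not a statement from the paper). Under that assumption it is the effective noise level of the rescaled sample,

$$
\frac{x_t}{\sqrt{\bar\alpha_t}} = x_0 + \sigma_t \, \epsilon, \qquad \sigma_t = \frac{\sqrt{1-\bar\alpha_t}}{\sqrt{\bar\alpha_t}},
$$

which would correspond to the $\sigma_n$ schedule in Algorithm 1.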

Inquiries from a Beginner

Hello, I am a beginner and English is not my native language, so please pardon any incorrect descriptions. I have a few questions. First, as far as I know, Google's SR3 super-resolution model also applies diffusion models to image restoration. I would like to know the differences between this model and SR3, as well as the overall architecture: what are the core innovations? Second, what would I need to do to input 512x512 images? Thank you very much for your answers!

The main_ddpir_sisr.py script encounters an error when attempting to load the ffhq_10m model.

RuntimeError: Error(s) in loading state_dict for UNetModel:
Missing key(s) in state_dict: "input_blocks.7.0.skip_connection.weight", "input_blocks.7.0.skip_connection.bias", "input_blocks.10.1.norm.weight", "input_blocks.10.1.norm.bias", "input_blocks.10.1.qkv.weight", "input_blocks.10.1.qkv.bias", "input_blocks.10.1.proj_out.weight", "input_blocks.10.1.proj_out.bias", "input_blocks.11.1.norm.weight", "input_blocks.11.1.norm.bias", "input_blocks.11.1.qkv.weight", "input_blocks.11.1.qkv.bias", "input_blocks.11.1.proj_out.weight", "input_blocks.11.1.proj_out.bias", "input_blocks.12.0.in_layers.0.weight", "input_blocks.12.0.in_layers.0.bias", "input_blocks.12.0.in_layers.2.weight", "input_blocks.12.0.in_layers.2.bias", "input_blocks.12.0.emb_layers.1.weight", "input_blocks.12.0.emb_layers.1.bias", "input_blocks.12.0.out_layers.0.weight", "input_blocks.12.0.out_layers.0.bias", "input_blocks.12.0.out_layers.3.weight", "input_blocks.12.0.out_layers.3.bias", "input_blocks.13.0.in_layers.0.weight", "input_blocks.13.0.in_layers.0.bias", "input_blocks.13.0.in_layers.2.weight", "input_blocks.13.0.in_layers.2.bias", "input_blocks.13.0.emb_layers.1.weight", "input_blocks.13.0.emb_layers.1.bias", "input_blocks.13.0.out_layers.0.weight", "input_blocks.13.0.out_layers.0.bias", "input_blocks.13.0.out_layers.3.weight", "input_blocks.13.0.out_layers.3.bias", "input_blocks.13.0.skip_connection.weight", "input_blocks.13.0.skip_connection.bias", "input_blocks.13.1.norm.weight", "input_blocks.13.1.norm.bias", "input_blocks.13.1.qkv.weight", "input_blocks.13.1.qkv.bias", "input_blocks.13.1.proj_out.weight", "input_blocks.13.1.proj_out.bias", "input_blocks.14.0.in_layers.0.weight", "input_blocks.14.0.in_layers.0.bias", "input_blocks.14.0.in_layers.2.weight", "input_blocks.14.0.in_layers.2.bias", "input_blocks.14.0.emb_layers.1.weight", "input_blocks.14.0.emb_layers.1.bias", "input_blocks.14.0.out_layers.0.weight", "input_blocks.14.0.out_layers.0.bias", "input_blocks.14.0.out_layers.3.weight", "input_blocks.14.0.out_layers.3.bias", "input_blocks.14.1.norm.weight", "input_blocks.14.1.norm.bias", "input_blocks.14.1.qkv.weight", "input_blocks.14.1.qkv.bias", "input_blocks.14.1.proj_out.weight", "input_blocks.14.1.proj_out.bias", "input_blocks.15.0.in_layers.0.weight", "input_blocks.15.0.in_layers.0.bias", "input_blocks.15.0.in_layers.2.weight", "input_blocks.15.0.in_layers.2.bias", "input_blocks.15.0.emb_layers.1.weight", "input_blocks.15.0.emb_layers.1.bias", "input_blocks.15.0.out_layers.0.weight", "input_blocks.15.0.out_layers.0.bias", "input_blocks.15.0.out_layers.3.weight", "input_blocks.15.0.out_layers.3.bias", "input_blocks.16.0.in_layers.0.weight", "input_blocks.16.0.in_layers.0.bias", "input_blocks.16.0.in_layers.2.weight", "input_blocks.16.0.in_layers.2.bias", "input_blocks.16.0.emb_layers.1.weight", "input_blocks.16.0.emb_layers.1.bias", "input_blocks.16.0.out_layers.0.weight", "input_blocks.16.0.out_layers.0.bias", "input_blocks.16.0.out_layers.3.weight", "input_blocks.16.0.out_layers.3.bias", "input_blocks.16.1.norm.weight", "input_blocks.16.1.norm.bias", "input_blocks.16.1.qkv.weight", "input_blocks.16.1.qkv.bias", "input_blocks.16.1.proj_out.weight", "input_blocks.16.1.proj_out.bias", "input_blocks.17.0.in_layers.0.weight", "input_blocks.17.0.in_layers.0.bias", "input_blocks.17.0.in_layers.2.weight", "input_blocks.17.0.in_layers.2.bias", "input_blocks.17.0.emb_layers.1.weight", "input_blocks.17.0.emb_layers.1.bias", "input_blocks.17.0.out_layers.0.weight", "input_blocks.17.0.out_layers.0.bias", "input_blocks.17.0.out_layers.3.weight", 
"input_blocks.17.0.out_layers.3.bias", "input_blocks.17.1.norm.weight", "input_blocks.17.1.norm.bias", "input_blocks.17.1.qkv.weight", "input_blocks.17.1.qkv.bias", "input_blocks.17.1.proj_out.weight", "input_blocks.17.1.proj_out.bias", "output_blocks.0.1.norm.weight", "output_blocks.0.1.norm.bias", "output_blocks.0.1.qkv.weight", "output_blocks.0.1.qkv.bias", "output_blocks.0.1.proj_out.weight", "output_blocks.0.1.proj_out.bias", "output_blocks.1.1.norm.weight", "output_blocks.1.1.norm.bias", "output_blocks.1.1.qkv.weight", "output_blocks.1.1.qkv.bias", "output_blocks.1.1.proj_out.weight", "output_blocks.1.1.proj_out.bias", "output_blocks.2.2.in_layers.0.weight", "output_blocks.2.2.in_layers.0.bias", "output_blocks.2.2.in_layers.2.weight", "output_blocks.2.2.in_layers.2.bias", "output_blocks.2.2.emb_layers.1.weight", "output_blocks.2.2.emb_layers.1.bias", "output_blocks.2.2.out_layers.0.weight", "output_blocks.2.2.out_layers.0.bias", "output_blocks.2.2.out_layers.3.weight", "output_blocks.2.2.out_layers.3.bias", "output_blocks.4.1.norm.weight", "output_blocks.4.1.norm.bias", "output_blocks.4.1.qkv.weight", "output_blocks.4.1.qkv.bias", "output_blocks.4.1.proj_out.weight", "output_blocks.4.1.proj_out.bias", "output_blocks.5.1.norm.weight", "output_blocks.5.1.norm.bias", "output_blocks.5.1.qkv.weight", "output_blocks.5.1.qkv.bias", "output_blocks.5.1.proj_out.weight", "output_blocks.5.1.proj_out.bias", "output_blocks.5.2.in_layers.0.weight", "output_blocks.5.2.in_layers.0.bias", "output_blocks.5.2.in_layers.2.weight", "output_blocks.5.2.in_layers.2.bias", "output_blocks.5.2.emb_layers.1.weight", "output_blocks.5.2.emb_layers.1.bias", "output_blocks.5.2.out_layers.0.weight", "output_blocks.5.2.out_layers.0.bias", "output_blocks.5.2.out_layers.3.weight", "output_blocks.5.2.out_layers.3.bias", "output_blocks.6.1.norm.weight", "output_blocks.6.1.norm.bias", "output_blocks.6.1.qkv.weight", "output_blocks.6.1.qkv.bias", "output_blocks.6.1.proj_out.weight", "output_blocks.6.1.proj_out.bias", "output_blocks.7.1.norm.weight", "output_blocks.7.1.norm.bias", "output_blocks.7.1.qkv.weight", "output_blocks.7.1.qkv.bias", "output_blocks.7.1.proj_out.weight", "output_blocks.7.1.proj_out.bias", "output_blocks.8.1.norm.weight", "output_blocks.8.1.norm.bias", "output_blocks.8.1.qkv.weight", "output_blocks.8.1.qkv.bias", "output_blocks.8.1.proj_out.weight", "output_blocks.8.1.proj_out.bias", "output_blocks.8.2.in_layers.0.weight", "output_blocks.8.2.in_layers.0.bias", "output_blocks.8.2.in_layers.2.weight", "output_blocks.8.2.in_layers.2.bias", "output_blocks.8.2.emb_layers.1.weight", "output_blocks.8.2.emb_layers.1.bias", "output_blocks.8.2.out_layers.0.weight", "output_blocks.8.2.out_layers.0.bias", "output_blocks.8.2.out_layers.3.weight", "output_blocks.8.2.out_layers.3.bias", "output_blocks.11.1.in_layers.0.weight", "output_blocks.11.1.in_layers.0.bias", "output_blocks.11.1.in_layers.2.weight", "output_blocks.11.1.in_layers.2.bias", "output_blocks.11.1.emb_layers.1.weight", "output_blocks.11.1.emb_layers.1.bias", "output_blocks.11.1.out_layers.0.weight", "output_blocks.11.1.out_layers.0.bias", "output_blocks.11.1.out_layers.3.weight", "output_blocks.11.1.out_layers.3.bias", "output_blocks.12.0.in_layers.0.weight", "output_blocks.12.0.in_layers.0.bias", "output_blocks.12.0.in_layers.2.weight", "output_blocks.12.0.in_layers.2.bias", "output_blocks.12.0.emb_layers.1.weight", "output_blocks.12.0.emb_layers.1.bias", "output_blocks.12.0.out_layers.0.weight", "output_blocks.12.0.out_layers.0.bias", 
"output_blocks.12.0.out_layers.3.weight", "output_blocks.12.0.out_layers.3.bias", "output_blocks.12.0.skip_connection.weight", "output_blocks.12.0.skip_connection.bias", "output_blocks.13.0.in_layers.0.weight", "output_blocks.13.0.in_layers.0.bias", "output_blocks.13.0.in_layers.2.weight", "output_blocks.13.0.in_layers.2.bias", "output_blocks.13.0.emb_layers.1.weight", "output_blocks.13.0.emb_layers.1.bias", "output_blocks.13.0.out_layers.0.weight", "output_blocks.13.0.out_layers.0.bias", "output_blocks.13.0.out_layers.3.weight", "output_blocks.13.0.out_layers.3.bias", "output_blocks.13.0.skip_connection.weight", "output_blocks.13.0.skip_connection.bias", "output_blocks.14.0.in_layers.0.weight", "output_blocks.14.0.in_layers.0.bias", "output_blocks.14.0.in_layers.2.weight", "output_blocks.14.0.in_layers.2.bias", "output_blocks.14.0.emb_layers.1.weight", "output_blocks.14.0.emb_layers.1.bias", "output_blocks.14.0.out_layers.0.weight", "output_blocks.14.0.out_layers.0.bias", "output_blocks.14.0.out_layers.3.weight", "output_blocks.14.0.out_layers.3.bias", "output_blocks.14.0.skip_connection.weight", "output_blocks.14.0.skip_connection.bias", "output_blocks.14.1.in_layers.0.weight", "output_blocks.14.1.in_layers.0.bias", "output_blocks.14.1.in_layers.2.weight", "output_blocks.14.1.in_layers.2.bias", "output_blocks.14.1.emb_layers.1.weight", "output_blocks.14.1.emb_layers.1.bias", "output_blocks.14.1.out_layers.0.weight", "output_blocks.14.1.out_layers.0.bias", "output_blocks.14.1.out_layers.3.weight", "output_blocks.14.1.out_layers.3.bias", "output_blocks.15.0.in_layers.0.weight", "output_blocks.15.0.in_layers.0.bias", "output_blocks.15.0.in_layers.2.weight", "output_blocks.15.0.in_layers.2.bias", "output_blocks.15.0.emb_layers.1.weight", "output_blocks.15.0.emb_layers.1.bias", "output_blocks.15.0.out_layers.0.weight", "output_blocks.15.0.out_layers.0.bias", "output_blocks.15.0.out_layers.3.weight", "output_blocks.15.0.out_layers.3.bias", "output_blocks.15.0.skip_connection.weight", "output_blocks.15.0.skip_connection.bias", "output_blocks.16.0.in_layers.0.weight", "output_blocks.16.0.in_layers.0.bias", "output_blocks.16.0.in_layers.2.weight", "output_blocks.16.0.in_layers.2.bias", "output_blocks.16.0.emb_layers.1.weight", "output_blocks.16.0.emb_layers.1.bias", "output_blocks.16.0.out_layers.0.weight", "output_blocks.16.0.out_layers.0.bias", "output_blocks.16.0.out_layers.3.weight", "output_blocks.16.0.out_layers.3.bias", "output_blocks.16.0.skip_connection.weight", "output_blocks.16.0.skip_connection.bias", "output_blocks.17.0.in_layers.0.weight", "output_blocks.17.0.in_layers.0.bias", "output_blocks.17.0.in_layers.2.weight", "output_blocks.17.0.in_layers.2.bias", "output_blocks.17.0.emb_layers.1.weight", "output_blocks.17.0.emb_layers.1.bias", "output_blocks.17.0.out_layers.0.weight", "output_blocks.17.0.out_layers.0.bias", "output_blocks.17.0.out_layers.3.weight", "output_blocks.17.0.out_layers.3.bias", "output_blocks.17.0.skip_connection.weight", "output_blocks.17.0.skip_connection.bias".
Unexpected key(s) in state_dict: "input_blocks.5.0.skip_connection.weight", "input_blocks.5.0.skip_connection.bias", "input_blocks.9.1.norm.weight", "input_blocks.9.1.norm.bias", "input_blocks.9.1.qkv.weight", "input_blocks.9.1.qkv.bias", "input_blocks.9.1.proj_out.weight", "input_blocks.9.1.proj_out.bias", "input_blocks.9.0.skip_connection.weight", "input_blocks.9.0.skip_connection.bias", "output_blocks.1.1.in_layers.0.weight", "output_blocks.1.1.in_layers.0.bias", "output_blocks.1.1.in_layers.2.weight", "output_blocks.1.1.in_layers.2.bias", "output_blocks.1.1.emb_layers.1.weight", "output_blocks.1.1.emb_layers.1.bias", "output_blocks.1.1.out_layers.0.weight", "output_blocks.1.1.out_layers.0.bias", "output_blocks.1.1.out_layers.3.weight", "output_blocks.1.1.out_layers.3.bias", "output_blocks.3.2.in_layers.0.weight", "output_blocks.3.2.in_layers.0.bias", "output_blocks.3.2.in_layers.2.weight", "output_blocks.3.2.in_layers.2.bias", "output_blocks.3.2.emb_layers.1.weight", "output_blocks.3.2.emb_layers.1.bias", "output_blocks.3.2.out_layers.0.weight", "output_blocks.3.2.out_layers.0.bias", "output_blocks.3.2.out_layers.3.weight", "output_blocks.3.2.out_layers.3.bias", "output_blocks.5.1.in_layers.0.weight", "output_blocks.5.1.in_layers.0.bias", "output_blocks.5.1.in_layers.2.weight", "output_blocks.5.1.in_layers.2.bias", "output_blocks.5.1.emb_layers.1.weight", "output_blocks.5.1.emb_layers.1.bias", "output_blocks.5.1.out_layers.0.weight", "output_blocks.5.1.out_layers.0.bias", "output_blocks.5.1.out_layers.3.weight", "output_blocks.5.1.out_layers.3.bias", "output_blocks.7.1.in_layers.0.weight", "output_blocks.7.1.in_layers.0.bias", "output_blocks.7.1.in_layers.2.weight", "output_blocks.7.1.in_layers.2.bias", "output_blocks.7.1.emb_layers.1.weight", "output_blocks.7.1.emb_layers.1.bias", "output_blocks.7.1.out_layers.0.weight", "output_blocks.7.1.out_layers.0.bias", "output_blocks.7.1.out_layers.3.weight", "output_blocks.7.1.out_layers.3.bias", "output_blocks.9.1.in_layers.0.weight", "output_blocks.9.1.in_layers.0.bias", "output_blocks.9.1.in_layers.2.weight", "output_blocks.9.1.in_layers.2.bias", "output_blocks.9.1.emb_layers.1.weight", "output_blocks.9.1.emb_layers.1.bias", "output_blocks.9.1.out_layers.0.weight", "output_blocks.9.1.out_layers.0.bias", "output_blocks.9.1.out_layers.3.weight", "output_blocks.9.1.out_layers.3.bias".
size mismatch for time_embed.0.weight: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([1024, 256]).

Formula realization

x = sqrt_alphas_cumprod[t_start] * (2*x-1) + sqrt_1m_alphas_cumprod[t_start] * torch.randn_like(x)

Does the code here correspond to that formula? Looking forward to your reply.
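
For reference, the quoted line reads as the standard forward-diffusion (noising) step applied to the observation after rescaling it from [0, 1] to [-1, 1] (an observation about the code itself, not a confirmed mapping to a particular equation in the paper):

$$
x_{t_{\mathrm{start}}} = \sqrt{\bar\alpha_{t_{\mathrm{start}}}} \, (2x - 1) + \sqrt{1-\bar\alpha_{t_{\mathrm{start}}}} \, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I).
$$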

for RuntimeError

Hello, thank you for taking the time to read this.
I changed the test set when running the code and got the following error:
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 66 but got size 67 for tensor number 1 in the list.
Could you tell me what the cause might be?

model name error

rename "ffhq_10m" to "diffusion_ffhq_m" in order to test images.
