
layout_diffuse's People

Contributors

cplusx

layout_diffuse's Issues

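Can the UNet weights be frozen during training?

While training the model, I noticed that requires_grad is True for all of the UNet weights. This uses a lot of GPU memory and makes fine-tuning slow, so I tried setting requires_grad to False for every entry in pretrained_param. However, I then hit RuntimeError: One of the differentiated Tensors does not require grad. Has the author run into this problem, or do you have any suggestions?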
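A minimal sketch of one way this error can arise, assuming it comes from a call to torch.autograd.grad (for example inside a gradient-checkpointing helper) that receives the frozen parameters as inputs. Whether this repository does so is an assumption, and the two-layer toy model below is hypothetical. The sketch shows the failure and one possible workaround: pass only parameters with requires_grad=True to autograd and to the optimizer.

```python
import torch
from torch import nn

# Hypothetical stand-in for a network with a pretrained part and a new part.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Freeze the "pretrained" first layer.
for param in model[0].parameters():
    param.requires_grad_(False)

loss = model(torch.randn(3, 4)).sum()

# Asking autograd to differentiate w.r.t. a frozen tensor raises the error.
raised, message = False, ""
try:
    torch.autograd.grad(loss, list(model.parameters()),
                        allow_unused=True, retain_graph=True)
except RuntimeError as err:
    raised, message = True, str(err)
print("error reproduced:", raised)

# Workaround sketch: only hand trainable tensors to autograd and the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
grads = torch.autograd.grad(loss, trainable)
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

If the failing call sits inside a checkpointing wrapper, the same filter would need to be applied where that wrapper collects its parameters, or checkpointing would need to be disabled for the frozen blocks.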

Unable to reproduce the paper reported results

Thank you for your great work! However, I am having trouble reproducing the results reported in your paper (i.e., Tables 2 and 4). I have a few questions and hope you can help me resolve them.

  • According to the README file, two checkpoints are provided, one fine-tuned from SD2-1 and one from SD1-5. Which one did you use for the results reported in your paper?

  • According to configs/cocostuff_SD2_1.json and configs/cocostuff_SD1_5.json, it seems that you are actually fine-tuning and generating images at 512x512 resolution instead of 256x256, which differs from the settings in your paper. Moreover, even when I use the 512x512 images generated with the SD2-1 checkpoint, I cannot reach the FID value reported in your paper (I get an FID above 22 using the fid_eval.py script). Would you mind providing the exact code settings needed to reproduce the results in your paper?

"data": {
    "dataset": "coco_stuff_layout_caption_label",
    "root": "/home/ubuntu/disk2/data/COCO",
    "image_size": 512,
    "dataset_args": {
        "train_empty_string": 0,
        "val_empty_string": 0
    },
    "train_args": {
        "split": "train",
        "data_len": -1
    },
    "val_args": {
        "split": "val",
        "data_len": 1
    },
    "batch_size": 1,
    "val_batch_size": 1
},
"sampling_args": {
    "sampling_w_noise": false,
    "image_size": 64,
    "in_channel": 4,
    "num_samples": -1,
    "callbacks": [
        "callbacks.coco_layout.sampling_save_fig.COCOLayoutImageSavingCallback"
    ]
}

Thank you!
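For readers comparing the two image_size values in the config excerpt above, here is a small sketch of how they might relate. It assumes that sampling_args.image_size is the latent resolution and that the Stable Diffusion autoencoder downsamples each side by a factor of 8; both are assumptions about this repository, and the helper latent_size is hypothetical.

```python
import json

# Assumption: the Stable Diffusion autoencoder shrinks each side by 8x.
VAE_DOWNSAMPLE = 8

# Only the two fields under discussion, copied from the excerpt above.
config = json.loads("""
{
    "data": {"image_size": 512},
    "sampling_args": {"image_size": 64, "in_channel": 4}
}
""")


def latent_size(pixel_size, factor=VAE_DOWNSAMPLE):
    """Return the latent side length for a given pixel side length."""
    if pixel_size % factor != 0:
        raise ValueError(f"{pixel_size} is not divisible by {factor}")
    return pixel_size // factor


pixel = config["data"]["image_size"]
latent = config["sampling_args"]["image_size"]
consistent = latent_size(pixel) == latent
latent_for_256 = latent_size(256)

print("512x512 config is self-consistent:", consistent)
print("latent size a 256x256 run would need:", latent_for_256)
```

Under these assumptions, a 256x256 experiment would change both values together (256 in data and 32 in sampling_args); whether that matches the paper's setup is for the author to confirm.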

original diffusion vs layout diffuse

Hi author,
Thanks for your excellent work.
I noticed that the UNet code in this repo differs from UNet2DConditionModel in the diffusers library, yet it can still load the original SD model. I have the following questions.

  1. How did you adapt your code so that it can load the original SD model weights?
  2. I found several zero_module calls in your code. Doesn't that make the output always zero during training? I don't understand this part.

Thank you.
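On the second question, a minimal sketch of the general idea behind zero initialization, assuming zero_module only sets a layer's parameters to zero once at construction time (as in the latent diffusion codebase; this repo's version is assumed to behave the same). The scalar "layer" below is a hypothetical stand-in: its output is zero before training, but the gradient with respect to the weight is not, so the weight moves away from zero after the first update.

```python
def forward(weight, x):
    """A one-parameter 'layer': output = weight * x."""
    return weight * x


def grad_weight(weight, x, target):
    """Gradient of 0.5 * (weight * x - target) ** 2 with respect to weight."""
    return (forward(weight, x) - target) * x


weight = 0.0                      # zero-initialized, like a zeroed module
x, target, lr = 2.0, 1.0, 0.1

initial_output = forward(weight, x)
for _ in range(3):                # three plain SGD steps
    weight -= lr * grad_weight(weight, x, target)
final_output = forward(weight, x)

print("output before training:", initial_output)
print("output after training: ", final_output)
```

So a zeroed layer only guarantees that the new branch contributes nothing at the start of fine-tuning, which leaves the pretrained behaviour intact at step 0; it does not stay zero once training begins.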

Got KeyError in sampling

I split the models and ran the sampling command, but got a KeyError as shown below:

python sampling.py -c configs/cocostuff_SD2_1.json --model_path /path/cocostuff_ldm.ckpt

Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
INFO: initialize denoising UNet from pretrained_models/SD2_1/unet.ckpt, NOTE: without partial attention layers
Traceback (most recent call last):
  File "sampling.py", line 43, in <module>
    ddpm_model = get_DDPM(
  File "/space0/zhujingyuan/layout_diffuse-main/train_utils.py", line 44, in get_DDPM
    ddpm_model = DDPM_model(
  File "/space0/zhujingyuan/layout_diffuse-main/DDIM_ldm/DDIM_ldm.py", line 555, in __init__
    super().__init__(
  File "/space0/zhujingyuan/layout_diffuse-main/DDIM_ldm/DDIM_ldm.py", line 430, in __init__
    self.initialize_unet(unet_init_weights)
  File "/space0/zhujingyuan/layout_diffuse-main/DDIM_ldm/DDIM_ldm_coco.py", line 103, in initialize_unet
    self_model_sd[this_k] = model_sd[key_in_foundational_model]
KeyError: 'output_blocks.5.2.conv.weight'

How to solve this problem?
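One way to narrow this down is to list every key the loader expects but the checkpoint lacks, instead of stopping at the first one. The sketch below is hypothetical: remap_state_dict, key_map, and the toy dictionaries are illustrative names and data, not code from this repository, and plain lists stand in for tensors.

```python
def remap_state_dict(source_sd, key_map):
    """Copy entries from source_sd under new names.

    key_map maps target key -> source key. Returns (remapped, missing),
    where missing lists the source keys that were not found.
    """
    remapped, missing = {}, []
    for target_key, source_key in key_map.items():
        if source_key in source_sd:
            remapped[target_key] = source_sd[source_key]
        else:
            missing.append(source_key)
    return remapped, missing


# Toy checkpoint that lacks the key named in the traceback above.
foundation_sd = {"input_blocks.0.0.weight": [1.0], "out.2.weight": [2.0]}
key_map = {
    "unet.input_blocks.0.0.weight": "input_blocks.0.0.weight",
    "unet.output_blocks.5.2.conv.weight": "output_blocks.5.2.conv.weight",
}

remapped, missing = remap_state_dict(foundation_sd, key_map)
print("loaded:", sorted(remapped))
print("missing from checkpoint:", missing)
```

A long list of missing keys would suggest that the checkpoint on disk does not match the architecture in the config (for instance, a file from a different model version or an incomplete split), while a single missing key points at a naming difference.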

AssertionError: failed to inspect the obj init

python main.py -c configs/cocostuff_SD1_5.json

I got the following error:

Traceback (most recent call last):
  File "main.py", line 38, in <module>
    ddpm_model = get_DDPM(
  File "/opt/data/private/layout_diffuse/train_utils.py", line 44, in get_DDPM
    ddpm_model = DDPM_model(
  File "/opt/data/private/layout_diffuse/DDIM_ldm/DDIM_ldm.py", line 555, in __init__
    super().__init__(
  File "/opt/data/private/layout_diffuse/DDIM_ldm/DDIM_ldm.py", line 415, in __init__
    super().__init__(
  File "/opt/data/private/layout_diffuse/DDIM_ldm/DDIM_ldm.py", line 234, in __init__
    self.call_save_hyperparameters()
  File "/opt/data/private/layout_diffuse/DDIM_ldm/DDIM_ldm.py", line 586, in call_save_hyperparameters
    self.save_hyperparameters(ignore=['denoise_fn', 'vqvae_fn', 'text_fn'])
  File "/root/anaconda3/lib/python3.8/site-packages/pytorch_lightning/core/mixins/hparams_mixin.py", line 105, in save_hyperparameters
    save_hyperparameters(self, *args, ignore=ignore, frame=frame)
  File "/root/anaconda3/lib/python3.8/site-packages/pytorch_lightning/utilities/parsing.py", line 224, in save_hyperparameters
    assert init_args, "failed to inspect the obj init"
AssertionError: failed to inspect the obj init
