zhaobozb / layout2im

Official PyTorch Implementation of Image Generation from Layout - CVPR 2019

Home Page: https://layout2im.cs.ubc.ca

License: Apache License 2.0

Python 97.77% Shell 2.23%

layout2im's Issues

Really Brilliant Work

The code writing is SO COMFORTABLE for me, very brilliant work!

Shape mismatch while trying to use model to generate image.

I got a shape mismatch trying to use the model to generate an image from a layout.

Model input:

# Build up a model

# Images 256 x 256
h, w = 256, 256
# Number of objects
o = 2

objects = torch.randint(179, [1]).repeat(o)
boxes = torch.rand([1, 4]).repeat(o, 1)
masks = torch.randint(1, [1, 1, h, w]).repeat(o, 1, 1, 1)
obj_to_img = torch.tensor([0]).repeat(o)

z_rand = torch.randn([1, 64]).repeat(o, 1)

model(objects, boxes, masks, obj_to_img, z_rand)

The model is a slightly modified version of your Generator. This one, ImageGenerator, does not require ground-truth images as input.

class ImageGenerator(nn.Module):
    def __init__(self, num_embeddings, embedding_dim=64, z_dim=8, obj_size=32, clstm_layers=3):
        super(ImageGenerator, self).__init__()

        self.obj_size = obj_size

        # (3, 32, 32) -> (256, 4, 4) -> 8
        self.crop_encoder = CropEncoder(z_dim=z_dim, class_num=num_embeddings)
        self.layout_encoder = LayoutEncoder(
            z_dim=z_dim,
            embedding_dim=embedding_dim,
            class_num=num_embeddings,
            clstm_layers=clstm_layers
        )
        self.decoder = Decoder()

    def forward(self, objs, boxes, masks, obj_to_img, z_rand):
        # (n, clstm_dim * 2, 8, 8)
        h_rand = self.layout_encoder(objs, masks, obj_to_img, z_rand)

        img_rand = self.decoder(h_rand)

        crops_rand = crop_bbox_batch(
            img_rand, boxes, obj_to_img, self.obj_size)
        _, z_rand_rec, _ = self.crop_encoder(crops_rand, objs)

        return crops_rand, img_rand, z_rand_rec

It says:

RuntimeError: Sizes of tensors must match except in dimension 1. Got 32 and 8 in dimension 2 (The offending index is 1)

On the line:

layout2im/models/generator.py

Line 98 in 031f65b

    combined = torch.cat([input_tensor, h_cur], dim=1)  # concatenate along channel axis
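
One possible reading of the error, sketched under assumptions rather than confirmed from the repository: if LayoutEncoder downsamples the masks with three stride-2 convolutions and the ConvLSTM hidden state is fixed at 8 x 8 (as the shape comment in forward suggests), then only 64 x 64 masks line up with the hidden state. The kernel size 4, padding 1, and layer count 3 below are assumed values, and conv_out and encoder_out are hypothetical helper names.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a 2D convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_out(size, num_layers=3):
    """Apply the assumed stack of stride-2 convolutions to a mask side length."""
    for _ in range(num_layers):
        size = conv_out(size)
    return size


# Compare the mask size used in the issue (256) with a 64 x 64 mask.
for side in (64, 256):
    print(side, "->", encoder_out(side))
```

Under these assumptions a 256 x 256 mask reaches the ConvLSTM as 32 x 32, which would match the "Got 32 and 8" in the message, while a 64 x 64 mask arrives as 8 x 8.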

Questions regarding LPIPS

Hi Bo,

Thank you for sharing your nice work.

Q1. Could you explain what ± means in LPIPS?

Is it the same as for Inception Score (stddev across multiple batches)?
Is it the stddev of distances between multiple samples?
Or something else?

Q2. Could you share your code for computing LPIPS?

I modified

layout2im/test.py

Lines 38 to 44 in 031f65b

    imgs, objs, boxes, masks, obj_to_img = batch
    z = torch.randn(objs.size(0), config.z_dim)
    imgs, objs, boxes, masks, obj_to_img, z = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img, z.to(device)
    # Generate fake image
    output = netG(imgs, objs, boxes, masks, obj_to_img, z)
    crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = output

as below

imgs, objs, boxes, masks, obj_to_img = batch
imgs, objs, boxes, masks, obj_to_img = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img
for zi in range(config.num_multimodal):
    z = torch.randn(objs.size(0), config.z_dim).to(device)

    # Generate fake image
    outputs = netG(imgs, objs, boxes, masks, obj_to_img, z)
    crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = outputs

and computed LPIPS with the official implementation, but I could hardly reproduce the numbers in the paper.
I got 0.057.

Any help would be highly appreciated.

Best,
Youngjung
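
To make the two candidate readings of ± in Q1 concrete, here is a minimal sketch of one of them: average the pairwise distances among the samples generated for each layout, then report the mean and standard deviation of those per-layout averages. Whether the paper uses this protocol is exactly what the question leaves open. diversity_score and toy_distance are hypothetical names, and toy_distance is only a stand-in for a real LPIPS model.

```python
from itertools import combinations
from statistics import mean, pstdev


def diversity_score(samples_per_layout, distance):
    """Return (mean, std) of the average pairwise distance per layout."""
    per_layout = []
    for samples in samples_per_layout:
        # All unordered pairs of samples generated from the same layout.
        pair_dists = [distance(a, b) for a, b in combinations(samples, 2)]
        per_layout.append(mean(pair_dists))
    return mean(per_layout), pstdev(per_layout)


def toy_distance(a, b):
    """Stand-in metric (mean absolute difference); NOT LPIPS."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Two layouts with two "images" each, represented as short vectors.
layouts = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.5]],
]
score, spread = diversity_score(layouts, toy_distance)
print(f"{score:.3f} ± {spread:.3f}")
```

With a real perceptual metric, the inner loop would iterate over the config.num_multimodal outputs collected for each layout in the modified test loop.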

How did you get your IS and FID?

Hi zhaobozb, thanks for sharing the code and the awesome work. When I ran the code, I had some problems with the metrics. I followed most of the instructions in the GitHub repository, except for the batch size and the number of iterations: I set batch_size to 16 and trained for 150000 iterations on a 1080Ti. Unfortunately, my result does not match the one in the paper: you report an Inception Score (IS) of 9.1, but I only get 1.2. Did I do something wrong? Could you share the metrics code, or do you have any advice on how I can get a higher score?
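
For context on the 1.2 figure: under the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]), the score is bounded below by 1 and above by the number of classes, so 1.2 is close to the minimum. The sketch below implements that definition on precomputed class probabilities. It is not the author's metric code, inception_score is a hypothetical helper, and in practice the probabilities would come from an Inception classifier applied to the generated images.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probability vectors."""
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    kl_sum = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute nothing.
        kl_sum += sum(
            pc * math.log(pc / marginal[c]) for c, pc in enumerate(p) if pc > 0
        )
    return math.exp(kl_sum / n)


# Uninformative predictions sit at the lower bound of 1.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(8)]
# Confident predictions spread evenly over 4 classes reach the upper bound of 4.
confident = [[1.0 if c == i % 4 else 0.0 for c in range(4)] for i in range(8)]
print(inception_score(uniform), inception_score(confident))
```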

How to split COCO-Stuff val into 1024 val and 2048 test?

Thanks for your impressive work! I have the following questions:

How do you split COCO-Stuff val into 1024 val and 2048 test images?
When calculating FID, do you generate 2048x5 fake images and compare them with 2048x1 real images?
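
The split the authors actually used is not stated in this thread. As an illustration only, a reproducible 1024/2048 split could be drawn with a fixed seed as sketched below; split_val_test, the seed value, and the use of a shuffled split at all are assumptions.

```python
import random


def split_val_test(image_ids, num_val=1024, num_test=2048, seed=0):
    """Deterministically split image ids into disjoint val and test lists."""
    ids = sorted(image_ids)  # fixed starting order, independent of input order
    if len(ids) < num_val + num_test:
        raise ValueError("not enough images for the requested split")
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducibility
    return ids[:num_val], ids[num_val:num_val + num_test]


# Example with placeholder ids; real ids would come from the filtered val set.
val_ids, test_ids = split_val_test(range(3072))
print(len(val_ids), len(test_ids))
```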

How to generate outputs for the various layout structures?

Hi Bo,
Could you share the code for generating outputs for the various layout structures, as described in your paper?
Best regards

padding in self.c1 of LayoutEncoder should be 0

Hi Bo,

First, thanks for your brilliant work. In my opinion, the padding in self.c1 of LayoutEncoder should be 0, although padding=1 may have no effect.

How did you calculate the FID?

Hi, thanks for the fantastic work and for sharing the code. I ran the pretrained model and calculated the FID, but did not get the FID reported in your CVPR version (31 for VG; I got 39). Did I do something wrong? How many images did you use for calculating FID? What about the image size?

Webpage not working

Hi!

I'm getting a 502 Bad Gateway when trying to access the online demo.

Could you please fix it?

Kind regards,

Carlos

How to get diverse result

Problem with the reconstruction loss

Hi Bo,

First of all, thank you for providing the code for the paper. I found it really good to work with. However, when running your training script on the COCO dataset (without changing the default values), I run into a problem: the reconstructed real images look just like the fake generated ones, but not like the original real images.
I uploaded the tensorboard output here:
Unbenanntes Dokument(1).pdf

I also tried to increase the reconstruction loss - which helped in the sense that real_rec and fake (random) do not look so similar anymore, but the reconstruction still does not work very well.
Do you have an idea what could be the problem here?

Best,
Katja

How did you get your IS and FID

Hi zhaobozb, thanks for sharing the code and the awesome work. When I ran the code, I had some problems with the metrics. I followed your description of the hyperparameters exactly, and I tested your trained models on a 2080Ti and a 1080Ti. Unfortunately, my results do not match those in the paper. You report an IS of 9.1 and an FID of 38.14 on COCO-Stuff, whereas I get 9.1 and 42.8. On the VG dataset you report an IS of 8.1, whereas I get 7.8. Did I do something wrong? Do you have any advice on how I can get better results?
Thanks, best wishes!

	imgs, objs, boxes, masks, obj_to_img = batch
	z = torch.randn(objs.size(0), config.z_dim)
	imgs, objs, boxes, masks, obj_to_img, z = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img, z.to(device)

	# Generate fake image
	output = netG(imgs, objs, boxes, masks, obj_to_img, z)
	crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = output
