
zhaobozb / layout2im

Official PyTorch Implementation of Image Generation from Layout - CVPR 2019

Home Page: https://layout2im.cs.ubc.ca

License: Apache License 2.0

Python 97.77% Shell 2.23%

layout2im's Issues

Shape mismatch while trying to use model to generate image.

I got a shape mismatch trying to use the model to generate an image from a layout.

Model input:

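import torch

# `model` is an ImageGenerator instance (class shown below), built beforehand

# Images are 256 x 256
h, w = 256, 256
# Number of objects
o = 2

# One random class id, box, and mask, each repeated for all `o` objects
objects = torch.randint(179, [1]).repeat(o)
boxes = torch.rand([1, 4]).repeat(o, 1)
# Note: randint(1, ...) can only return 0, so these masks are all zeros
masks = torch.randint(1, [1, 1, h, w]).repeat(o, 1, 1, 1)
# Every object belongs to image 0
obj_to_img = torch.tensor([0]).repeat(o)

# One random latent vector, repeated for all `o` objects
z_rand = torch.randn([1, 64]).repeat(o, 1)

model(objects, boxes, masks, obj_to_img, z_rand)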

The model is a slightly modified version of your Generator. This one, ImageGenerator, does not require ground-truth images as input.

class ImageGenerator(nn.Module):
    def __init__(self, num_embeddings, embedding_dim=64, z_dim=8, obj_size=32, clstm_layers=3):
        super(ImageGenerator, self).__init__()

        self.obj_size = obj_size

        # (3, 32, 32) -> (256, 4, 4) -> 8
        self.crop_encoder = CropEncoder(z_dim=z_dim, class_num=num_embeddings)
        self.layout_encoder = LayoutEncoder(
            z_dim=z_dim,
            embedding_dim=embedding_dim,
            class_num=num_embeddings,
            clstm_layers=clstm_layers
        )
        self.decoder = Decoder()

    def forward(self, objs, boxes, masks, obj_to_img, z_rand):
        # (n, clstm_dim * 2, 8, 8)
        h_rand = self.layout_encoder(objs, masks, obj_to_img, z_rand)

        img_rand = self.decoder(h_rand)

        crops_rand = crop_bbox_batch(
            img_rand, boxes, obj_to_img, self.obj_size)
        _, z_rand_rec, _ = self.crop_encoder(crops_rand, objs)

        return crops_rand, img_rand, z_rand_rec

It says:

RuntimeError: Sizes of tensors must match except in dimension 1. Got 32 and 8 in dimension 2 (The offending index is 1)

On the line:

combined = torch.cat([input_tensor, h_cur], dim=1) # concatenate along channel axis
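The error above comes from the rule that concatenated tensors must agree in every dimension except the one being joined. The sketch below re-implements that rule in plain Python on shape tuples, so it can be run without the model. The function name `check_cat_shapes` and the channel counts (72 and 64) are hypothetical; only the spatial sizes 32 and 8 come from the error message. The 8 x 8 hidden state next to a 32 x 32 input is consistent with, but does not confirm, the masks being larger than the resolution the ConvLSTM was sized for.

```python
def check_cat_shapes(shapes, dim):
    """Return the shape produced by concatenating tensors along `dim`.

    Mirrors the rule behind the reported RuntimeError: all sizes must
    match except in the concatenation dimension.
    """
    first = shapes[0]
    out = list(first)
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError(f"Tensor {idx} has {len(shape)} dims, expected {len(first)}")
        for d, (a, b) in enumerate(zip(first, shape)):
            if d != dim and a != b:
                raise ValueError(
                    f"Sizes must match except in dimension {dim}. "
                    f"Got {a} and {b} in dimension {d} (offending index {idx})"
                )
        out[dim] += shape[dim]
    return tuple(out)


# Hypothetical channel counts; spatial sizes taken from the error message.
input_shape = (2, 72, 32, 32)   # stands in for input_tensor
hidden_shape = (2, 64, 8, 8)    # stands in for h_cur

try:
    check_cat_shapes([input_shape, hidden_shape], dim=1)
except ValueError as err:
    print("mismatch:", err)

# With matching spatial sizes the concatenation is valid.
print(check_cat_shapes([(2, 72, 8, 8), hidden_shape], dim=1))
```

Printing `input_tensor.shape` and `h_cur.shape` just before the failing `torch.cat` call would show which of the two departs from the expected spatial size.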

Questions regarding LPIPS

Hi Bo,

Thank you for sharing your nice work.

Q1. Could you explain what the ± value means in the LPIPS results?

  • Is it the same as for the Inception Score (stddev across multiple batches)?
  • Is it the stddev of the distances between multiple samples?
  • Something else?

Q2. Could you share your code for computing LPIPS?

I modified

layout2im/test.py

Lines 38 to 44 in 031f65b

imgs, objs, boxes, masks, obj_to_img = batch
z = torch.randn(objs.size(0), config.z_dim)
imgs, objs, boxes, masks, obj_to_img, z = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img, z.to(device)
# Generate fake image
output = netG(imgs, objs, boxes, masks, obj_to_img, z)
crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = output

as below

imgs, objs, boxes, masks, obj_to_img = batch
imgs, objs, boxes, masks, obj_to_img = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img
for zi in range(config.num_multimodal):
    z = torch.randn(objs.size(0), config.z_dim).to(device)

    # Generate fake image
    outputs = netG(imgs, objs, boxes, masks, obj_to_img, z)
    crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = outputs

and computed LPIPS with the official implementation, but I could not reproduce the numbers in the paper.
I got 0.057.

Any help would be highly appreciated.

Best,
Youngjung
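For the LPIPS question above, one possible reading of the reported spread is sketched here: for each layout, average the pairwise distance between images generated with different z, then report the mean and standard deviation of those averages across layouts. This is only one interpretation and is not confirmed as the paper's protocol. The function name `diversity_score` and the `distance` argument are hypothetical; in practice `distance` would wrap a call to an LPIPS model, while the toy usage substitutes an absolute difference on floats.

```python
from itertools import combinations
from statistics import mean, stdev


def diversity_score(samples_per_layout, distance):
    """Mean and stddev of per-layout average pairwise distance.

    samples_per_layout: list of lists; each inner list holds the samples
        generated from one layout with different latent codes.
    distance: callable taking two samples and returning a float
        (placeholder for a perceptual metric such as LPIPS).
    """
    per_layout = []
    for samples in samples_per_layout:
        pairs = list(combinations(samples, 2))
        if not pairs:
            raise ValueError("each layout needs at least two samples")
        per_layout.append(mean(distance(a, b) for a, b in pairs))
    spread = stdev(per_layout) if len(per_layout) > 1 else 0.0
    return mean(per_layout), spread


# Toy usage: floats stand in for images, abs difference for the metric.
toy_samples = [[0.0, 1.0, 2.0], [0.0, 0.0, 3.0]]
score, spread = diversity_score(toy_samples, lambda a, b: abs(a - b))
print(f"{score:.4f} +/- {spread:.4f}")
```

Other choices, such as the standard deviation over all individual pairs or over repeated evaluation runs, would give different spreads, so the protocol needs to match the paper's before the numbers can be compared.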

How did you get your IS and FID scores?

Hi zhaobozb, thanks for sharing the code and the awesome work. When I ran the code, I had some problems with the metrics. I followed most of your description on GitHub, except for the batch size and the number of iterations: I set the batch size to 16 and trained for 150000 iterations on a 1080Ti GPU. Unfortunately, my result does not match the one in the paper: you report an IS of 9.1, but I only get 1.2. Did I do something wrong? Could you share the metrics code, or do you have any advice on how I can get a higher score?
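As background for the IS question above, the Inception Score is defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the classifier's predicted class distribution for one image and p(y) is the average of those distributions over all images. The sketch below computes it from a list of probability vectors in plain Python. It is not the repository's evaluation code, and the function name `inception_score` and the toy inputs are hypothetical; real use would feed softmax outputs of an Inception network.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probabilities.

    probs: list of equal-length lists, each summing to 1.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        for j in range(k):
            if p[j] > 0:  # 0 * log(0) is treated as 0
                kl_sum += p[j] * math.log(p[j] / marginal[j])
    return math.exp(kl_sum / n)


# Identical predictions for every image give the minimum score of 1.
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
# Confident predictions spread over two classes approach a score of 2.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
```

Because the minimum is 1, a score of 1.2 means the predicted distributions are nearly the same for every image. That can point to collapsed outputs or to an input preprocessing mismatch (for example value range or resizing) rather than to a small tuning difference, though the issue does not say which applies here.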

How did you calculate the FID

Hi, thanks for the fantastic work and for sharing the code. I ran the pretrained model and calculated the FID, but did not get the FID reported in your CVPR version (31 for VG; I got 39). Did I do something wrong? How many images did you use for calculating the FID? What image size did you use?
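For reference, the standard definition of FID compares Gaussian fits to Inception features of real and generated images. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the feature mean and covariance of the real and generated sets (the symbol names are chosen for this note):

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Since the means and covariances are estimated from finite samples, the value depends on how many images are used, and it also depends on how images are resized before feature extraction. A gap such as 31 versus 39 may therefore come from a different image count or preprocessing, but the settings used for the paper are not stated in this thread.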

Webpage not working

Hi!

I'm getting a 502 Bad Gateway when trying to access the online demo.

Could you please fix it?

Kind regards,

Carlos

Problem with the reconstruction loss

Hi Bo,

First of all, thank you for providing the code for the paper. I found it really good to work with. However, when running your training script on the COCO dataset (without changing the default values), I ran into a problem: the reconstructed real images look just like the fake generated ones, not like the original real images.
I uploaded the tensorboard output here:
Unbenanntes Dokument(1).pdf

I also tried to increase the reconstruction loss - which helped in the sense that real_rec and fake (random) do not look so similar anymore, but the reconstruction still does not work very well.
Do you have an idea what could be the problem here?

Best,
Katja

How did you get your IS and FID

Hi zhaobozb, thanks for sharing the code and the awesome work. When I ran the code, I had some problems with the metrics. I followed your description on GitHub, including all the hyperparameters, and tested your trained models on a 2080Ti and a 1080Ti. Unfortunately, my results do not match the ones in the paper. On COCO-Stuff you report an IS of 9.1 and an FID of 38.14, while I get 9.1 and 42.8. On the VG dataset you report an IS of 8.1, while I get 7.8. Did I do something wrong? Do you have any advice on how I can get better results?
Thanks, best wishes!
