zhaobozb / layout2im
Official PyTorch Implementation of Image Generation from Layout - CVPR 2019
Home Page: https://layout2im.cs.ubc.ca
License: Apache License 2.0
The coding style is SO COMFORTABLE for me, very brilliant work!
I got a shape mismatch trying to use the model to generate an image from a layout.
Model input:
# Build up a model
# Images 256 x 256
h, w = 256, 256
# Number of objects
o = 2
objects = torch.randint(179, [1]).repeat(o)
boxes = torch.rand([1, 4]).repeat(o, 1)
masks = torch.randint(1, [1, 1, h, w]).repeat(o, 1, 1, 1)
obj_to_img = torch.tensor([0]).repeat(o)
z_rand = torch.randn([1, 64]).repeat(o, 1)
model(objects, boxes, masks, obj_to_img, z_rand)
Below is a slightly modified version of your Generator model. This one, the ImageGenerator, does not require ground-truth images as input.
class ImageGenerator(nn.Module):
    def __init__(self, num_embeddings, embedding_dim=64, z_dim=8, obj_size=32, clstm_layers=3):
        super(ImageGenerator, self).__init__()
        self.obj_size = obj_size
        # (3, 32, 32) -> (256, 4, 4) -> 8
        self.crop_encoder = CropEncoder(z_dim=z_dim, class_num=num_embeddings)
        self.layout_encoder = LayoutEncoder(
            z_dim=z_dim,
            embedding_dim=embedding_dim,
            class_num=num_embeddings,
            clstm_layers=clstm_layers
        )
        self.decoder = Decoder()

    def forward(self, objs, boxes, masks, obj_to_img, z_rand):
        # (n, clstm_dim * 2, 8, 8)
        h_rand = self.layout_encoder(objs, masks, obj_to_img, z_rand)
        img_rand = self.decoder(h_rand)
        crops_rand = crop_bbox_batch(
            img_rand, boxes, obj_to_img, self.obj_size)
        _, z_rand_rec, _ = self.crop_encoder(crops_rand, objs)
        return crops_rand, img_rand, z_rand_rec
It says:
RuntimeError: Sizes of tensors must match except in dimension 1. Got 32 and 8 in dimension 2 (The offending index is 1)
On the line:
Line 98 in 031f65b
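A possible reading of the error, offered as a sketch and not a confirmed diagnosis: the message reports 32 against 8 in a spatial dimension, which is what you would see if the mask path shrinks its input by a factor of 8 and the result is then concatenated with a tensor fixed at 8 x 8. Under that assumption, 256 x 256 masks come out at 32, while 64 x 64 masks come out at 8. The layer count, kernel size, stride, and padding below are illustrative assumptions, not values read from the repository; the helper names `conv_out` and `downsample` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample(size, num_layers=3, kernel=4, stride=2, padding=1):
    # ASSUMPTION: three stride-2 convolutions with kernel 4 and padding 1.
    # These hyperparameters are illustrative, not taken from the repository.
    for _ in range(num_layers):
        size = conv_out(size, kernel, stride, padding)
    return size


# Compare a 64 x 64 mask with the 256 x 256 mask used in the report above.
for mask_size in (64, 256):
    print(mask_size, "->", downsample(mask_size))
```

If this reading holds, feeding masks at the resolution the network was built for, instead of 256 x 256, would be the first thing to try.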
Hi Bo,
Thank you for sharing your nice work.
I modified
Lines 38 to 44 in 031f65b
imgs, objs, boxes, masks, obj_to_img = batch
imgs, objs, boxes, masks, obj_to_img = imgs.to(device), objs.to(device), boxes.to(device), masks.to(device), obj_to_img
for zi in range(config.num_multimodal):
    z = torch.randn(objs.size(0), config.z_dim).to(device)
    # Generate fake image
    outputs = netG(imgs, objs, boxes, masks, obj_to_img, z)
    crops_input, crops_input_rec, crops_rand, img_rec, img_rand, mu, logvar, z_rand_rec = outputs
and computed LPIPS with the official implementation, but I could hardly reproduce the numbers in the paper.
I got 0.057.
Any help would be highly appreciated.
Best,
Youngjung
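One thing worth checking here is which image pairs go into the LPIPS average. As a rough sketch, assuming the diversity score is the mean LPIPS distance between images generated from the same layout with different z (and not between a generated image and the real one), the aggregation might look like the following. The function name `diversity_score` and the `distance` callable are hypothetical; in practice `distance` would wrap whichever LPIPS implementation is used, and the toy call substitutes scalars for images.

```python
from itertools import combinations
from statistics import mean


def diversity_score(samples_per_layout, distance):
    """Mean pairwise distance within each layout, averaged over layouts.

    samples_per_layout: list of lists; each inner list holds the images
        generated from one layout with different z.
    distance: callable(img_a, img_b) -> float, e.g. a wrapper around LPIPS.
    """
    per_layout = []
    for samples in samples_per_layout:
        pairs = list(combinations(samples, 2))
        if not pairs:
            continue  # a single sample has no pair to compare
        per_layout.append(mean(distance(a, b) for a, b in pairs))
    return mean(per_layout)


# Toy check: scalars stand in for images and |a - b| stands in for LPIPS.
toy = [[0.0, 1.0], [0.0, 0.5, 1.0]]
print(diversity_score(toy, lambda a, b: abs(a - b)))
```

Averaging over different pairs (for example generated versus reconstructed images) would give a different number, so the pairing is worth confirming before comparing against the paper.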
Hi zhaobozb, thanks for sharing the code and the awesome work. I ran into a problem with the metrics when running the code as described on GitHub. I followed most of your description except for the batch size and the number of iterations: I set batch_size to 16 and trained for 150000 iterations on a 1080Ti. Unfortunately, my result does not match the one in the paper: you report an IS of 9.1, while I only get 1.2. Did I get something wrong? Could you share the metrics code, or do you have any advice on how I can get a better result?
Thanks for your impressive work! I have the following questions:
Hi Bo,
Could you share the code for generating outputs from the various layout structures described in your paper?
Best regards
Hi Bo,
First, thanks for your brilliant work. In my opinion, the padding in self.c1 of LayoutEncoder should be 0, although padding=1 may have no effect on the result.
Hi, thanks for the fantastic work and for sharing the code. I ran the pretrained model and calculated the FID, but did not get the FID reported in your CVPR version (31 for VG; I got 39). Did I do something wrong? How many images did you use for calculating FID? What about the image size?
Hi!
I'm getting a 502 Bad Gateway when trying to access the online demo.
Could you please fix it?
Kind regards,
Carlos
Hi Bo,
First of all, thank you for providing the code for the paper. I found it really good to work with. However, when running your training script for the COCO dataset (without changing the default values), I ran into a problem: the reconstructed real images look just like the fake generated ones, not like the real original images.
I uploaded the tensorboard output here:
Unbenanntes Dokument(1).pdf
I also tried to increase the reconstruction loss - which helped in the sense that real_rec and fake (random) do not look so similar anymore, but the reconstruction still does not work very well.
Do you have an idea what could be the problem here?
Best,
Katja
Hi zhaobozb, thanks for sharing the code and the awesome work. I ran into a problem with the metrics when running the code as described on GitHub. I followed your full description of the hyperparameters and tested your trained models on a 2080Ti and a 1080Ti, but unfortunately my results do not match the ones in the paper. You report an IS of 9.1 and an FID of 38.14 on COCO-Stuff, while I get 9.1 and 42.8. On the VG dataset you report an IS of 8.1, while I get 7.8. Did I get something wrong? Do you have any advice on how I can get better results?
Thanks, best wishes!