sidward14 / style-attngan

Improves text-to-image synthesis from AttnGAN by integrating the scale-specific control from StyleGAN; can optionally use GPT-2 as the text encoder

License: Other

Python 100.00%

style-attngan's People

Contributors

sidward14

style-attngan's Issues

flowers dataset

Can I use the Oxford 102 flowers dataset instead of the birds dataset with this code?
If yes, can you explain how to do this?

Word features and sentence feature

If GPT-2 is used instead of the LSTM as the text encoder, how are the word features and the sentence feature computed? Could you explain in more detail?
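
A hedged sketch of one plausible way to get both feature types out of GPT-2 (an assumption for illustration, not necessarily how this repo computes them): use the per-token hidden states as the word features and pool them, e.g. by averaging, for the sentence feature. The "gpt2" checkpoint name and the mean pooling are assumptions.

import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # assumed small GPT-2 checkpoint
model = GPT2Model.from_pretrained("gpt2")

tokens = tokenizer("this bird has a red crown and a white belly", return_tensors="pt")
with torch.no_grad():
    hidden = model(**tokens).last_hidden_state      # (1, seq_len, 768) per-token states

words_emb = hidden.transpose(1, 2)                  # (1, 768, seq_len): word features in AttnGAN layout
sent_emb = hidden.mean(dim=1)                       # (1, 768): sentence feature via mean pooling (assumption)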

load model

Hello, how can I use a saved model from output/model/?
I'm running the code on Google Colab, so I have to save the model after some epochs.
Please guide me on how to do this.
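
In case it helps, a minimal sketch of reloading a saved generator with plain PyTorch, assuming the generator class in code/model.py can be constructed without arguments and using an example checkpoint name (adjust both to whatever trainer.py actually saves under output/model/):

import torch
from model import G_NET   # generator class from code/model.py

checkpoint_path = "output/model/netG_epoch_600.pth"   # example name; use whichever epoch you saved
netG = G_NET()
netG.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
netG.eval()   # switch to inference mode before sampling images

Alternatively, pointing the eval config's NET_G-style path at the saved file (as in the upstream AttnGAN setup) may accomplish the same thing without extra code.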

Custom Dataset

Hi, I have seen your instructions for using a custom dataset. However, when I tried to execute pretrain_DAMSM.py, the program required a captions.pickle file in the dataset path. It seems I need to do some data preprocessing before training. Do you have any ideas? Thanks!
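
In the meantime, a hedged sketch of the layout AttnGAN-derived data loaders usually expect in captions.pickle (a list of [train_captions, test_captions, ixtoword, wordtoix], where each caption is a list of word indices); this is based on the upstream AttnGAN loader and is only an assumption about this repo, and the dataset path below is hypothetical:

import pickle

train_captions = [[1, 2, 3, 4], [1, 5, 3, 6]]   # toy example: each entry is one tokenized caption
test_captions  = [[1, 2, 3, 6]]
ixtoword = {0: "<end>", 1: "this", 2: "bird", 3: "is", 4: "red", 5: "flower", 6: "yellow"}
wordtoix = {w: i for i, w in ixtoword.items()}

with open("data/custom/captions.pickle", "wb") as f:   # hypothetical dataset path
    pickle.dump([train_captions, test_captions, ixtoword, wordtoix], f, protocol=2)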

About a dimension problem when training the GAN

In D_GET_LOGITS.forward,
h_c_code = torch.cat((h_code, c_code), 1)
raises an error:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 8 but got size 4 for tensor number 1 in the list.

I don't know the reason.
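
For context, a small self-contained reproduction of that message (the shapes below are made up for illustration): torch.cat requires every dimension except the concatenation axis to match, so an 8x8 feature map cannot be concatenated channel-wise with a 4x4 conditioning code.

import torch

h_code = torch.randn(2, 512, 8, 8)   # discriminator feature map at one scale (example shape)
c_code = torch.randn(2, 256, 4, 4)   # conditioning code tiled to the wrong spatial size

# RuntimeError: Sizes of tensors must match except in dimension 1.
# Expected size 8 but got size 4 for tensor number 1 in the list.
h_c_code = torch.cat((h_code, c_code), 1)

In practice a mismatch like this often comes from feeding a discriminator an image scale it was not built for, so checking the image-size/branch settings in the config is a reasonable first step.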

Which epoch of the text encoder and image encoder to use?

Hello! I pre-trained the DAMSM + GPT-2 model for 600 epochs, and I have saved checkpoints for the text encoder and image encoder from text_encoder50.pth / image_encoder50.pth up to text_encoder550.pth / image_encoder550.pth. Which epoch should I use for both the text encoder and the image encoder when training the GAN?


Working of GPT-2 not clear

Whenever I try to train or evaluate with the transformer flag on the command line, I get an error. It would be great if an example were provided of how to train with the GPT-2 transformer, along with the expected results.
Error:
Traceback (most recent call last):
File "main.py", line 158, in <module>
algo.train()
File "/content/drive/My Drive/pj/Style-AttnGAN-master/code/trainer.py", line 311, in train
text_encoder, image_encoder, netG, netsD, start_epoch = self.build_models()
File "/content/drive/My Drive/pj/Style-AttnGAN-master/code/trainer.py", line 79, in build_models
text_encoder.load_state_dict(state_dict)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GPT2Model:
Missing key(s) in state_dict: "wte.weight", "wpe.weight", "h.0.ln_1.weight", "h.0.ln_1.bias", "h.0.attn.bias", "h.0.attn.masked_bias", "h.0.attn.c_attn.weight", "h.0.attn.c_attn.bias", "h.0.attn.c_proj.weight", "h.0.attn.c_proj.bias", "h.0.ln_2.weight", "h.0.ln_2.bias", "h.0.mlp.c_fc.weight", "h.0.mlp.c_fc.bias", "h.0.mlp.c_proj.weight", "h.0.mlp.c_proj.bias", "h.1.ln_1.weight", "h.1.ln_1.bias", "h.1.attn.bias", "h.1.attn.masked_bias", "h.1.attn.c_attn.weight", "h.1.attn.c_attn.bias", "h.1.attn.c_proj.weight", "h.1.attn.c_proj.bias", "h.1.ln_2.weight", "h.1.ln_2.bias", "h.1.mlp.c_fc.weight", "h.1.mlp.c_fc.bias", "h.1.mlp.c_proj.weight", "h.1.mlp.c_proj.bias", "h.2.ln_1.weight", "h.2.ln_1.bias", "h.2.attn.bias", "h.2.attn.masked_bias", "h.2.attn.c_attn.weight", "h.2.attn.c_attn.bias", "h.2.attn.c_proj.weight", "h.2.attn.c_proj.bias", "h.2.ln_2.weight", "h.2.ln_2.bias", "h.2.mlp.c_fc.weight", "h.2.mlp.c_fc.bias", "h.2.mlp.c_proj.weight", "h.2.mlp.c_proj.bias", "h.3.ln_1.weight", "h.3.ln_1.bias", "h.3.attn.bias", "h.3.attn.masked_bias", "h.3.attn.c_attn.weight", "h.3.attn.c_attn.bias", "h.3.attn.c_proj.weight", "h.3.attn.c_proj.bias", "h.3.ln_2.weight", "h.3.ln_2.bias", "h.3.mlp.c_fc.weight", "h.3.mlp.c_fc.bias", "h.3.mlp.c_proj.weight", "h.3.mlp.c_proj.bias", "h.4.ln_1.weight", "h.4.ln_1.bias", "h.4.attn.bias", "h.4.attn.masked_bias", "h.4.attn.c_attn.weight", "h.4.attn.c_attn.bias", "h.4.attn.c_proj.weight", "h.4.attn.c_proj.bias", "h.4.ln_2.weight", "h.4.ln_2.bias", "h.4.mlp.c_fc.weight", "h.4.mlp.c_fc.bias", "h.4.mlp.c_proj.weight", "h.4.mlp.c_proj.bias", "h.5.ln_1.weight", "h.5.ln_1.bias", "h.5.attn.bias", "h.5.attn.masked_bias", "h.5.attn.c_attn.weight", "h.5.attn.c_attn.bias", "h.5.attn.c_proj.weight", "h.5.attn.c_proj.bias", "h.5.ln_2.weight", "h.5.ln_2.bias", "h.5.mlp.c_fc.weight", "h.5.mlp.c_fc.bias", "h.5.mlp.c_proj.weight", "h.5.mlp.c_proj.bias", "h.6.ln_1.weight", "h.6.ln_1.bias", "h.6.attn.bias", "h.6.attn.masked_bias", "h.6.attn.c_attn.weight", "h.6.attn.c_attn.bias", "h.6.attn.c_proj.weight", "h.6.attn.c_proj.bias", "h.6.ln_2.weight", "h.6.ln_2.bias", "h.6.mlp.c_fc.weight", "h.6.mlp.c_fc.bias", "h.6.mlp.c_proj.weight", "h.6.mlp.c_proj.bias", "h.7.ln_1.weight", "h.7.ln_1.bias", "h.7.attn.bias", "h.7.attn.masked_bias", "h.7.attn.c_attn.weight", "h.7.attn.c_attn.bias", "h.7.attn.c_proj.weight", "h.7.attn.c_proj.bias", "h.7.ln_2.weight", "h.7.ln_2.bias", "h.7.mlp.c_fc.weight", "h.7.mlp.c_fc.bias", "h.7.mlp.c_proj.weight", "h.7.mlp.c_proj.bias", "h.8.ln_1.weight", "h.8.ln_1.bias", "h.8.attn.bias", "h.8.attn.masked_bias", "h.8.attn.c_attn.weight", "h.8.attn.c_attn.bias", "h.8.attn.c_proj.weight", "h.8.attn.c_proj.bias", "h.8.ln_2.weight", "h.8.ln_2.bias", "h.8.mlp.c_fc.weight", "h.8.mlp.c_fc.bias", "h.8.mlp.c_proj.weight", "h.8.mlp.c_proj.bias", "h.9.ln_1.weight", "h.9.ln_1.bias", "h.9.attn.bias", "h.9.attn.masked_bias", "h.9.attn.c_attn.weight", "h.9.attn.c_attn.bias", "h.9.attn.c_proj.weight", "h.9.attn.c_proj.bias", "h.9.ln_2.weight", "h.9.ln_2.bias", "h.9.mlp.c_fc.weight", "h.9.mlp.c_fc.bias", "h.9.mlp.c_proj.weight", "h.9.mlp.c_proj.bias", "h.10.ln_1.weight", "h.10.ln_1.bias", "h.10.attn.bias", "h.10.attn.masked_bias", "h.10.attn.c_attn.weight", "h.10.attn.c_attn.bias", "h.10.attn.c_proj.weight", "h.10.attn.c_proj.bias", "h.10.ln_2.weight", "h.10.ln_2.bias", "h.10.mlp.c_fc.weight", "h.10.mlp.c_fc.bias", "h.10.mlp.c_proj.weight", "h.10.mlp.c_proj.bias", "h.11.ln_1.weight", "h.11.ln_1.bias", "h.11.attn.bias", "h.11.attn.masked_bias", "h.11.attn.c_attn.weight", 
"h.11.attn.c_attn.bias", "h.11.attn.c_proj.weight", "h.11.attn.c_proj.bias", "h.11.ln_2.weight", "h.11.ln_2.bias", "h.11.mlp.c_fc.weight", "h.11.mlp.c_fc.bias", "h.11.mlp.c_proj.weight", "h.11.mlp.c_proj.bias", "ln_f.weight", "ln_f.bias".
Unexpected key(s) in state_dict: "encoder.weight", "rnn.weight_ih_l0", "rnn.weight_hh_l0", "rnn.bias_ih_l0", "rnn.bias_hh_l0", "rnn.weight_ih_l0_reverse", "rnn.weight_hh_l0_reverse", "rnn.bias_ih_l0_reverse", "rnn.bias_hh_l0_reverse".
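
The unexpected keys ("encoder.weight", "rnn.*") look like they come from the LSTM-based RNN_ENCODER rather than GPT-2, i.e. the text-encoder checkpoint referenced by the config was pretrained without the transformer option. A hedged sketch of how one might verify this before training (the checkpoint path is an example and the key prefixes are inferred from the error message above):

import torch

state_dict = torch.load("DAMSMencoders/bird/text_encoder200.pth", map_location="cpu")   # example path

if any(k.startswith(("rnn.", "encoder.")) for k in state_dict):
    print("LSTM text-encoder weights: pretrain DAMSM with the GPT-2 option enabled, "
          "or drop the transformer flag, so the architecture matches the checkpoint.")
else:
    print("Keys look like a GPT-2 state_dict; loading into GPT2Model should work.")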

The COCO dataset

Have you trained this repo on the COCO dataset? Can this repo be trained on COCO? Thanks.

Styled netG, Unstyled Images?

Hi Sidhartha --

We're doing a final project for a class at NCSU. We used your adaptation of the model to generate images from the Oxford 102 flowers dataset. It's great, thanks a lot!

We've trained the model using the style netG.

Is there any way now to get un-styled images without retraining? I.e., we have a netG.pth that produces un-styled images at 64 and 128 resolution and styled images at 256. I'd also like to generate the un-styled 256 images. Is there a way to do this without re-training with GAN.B_STYLEGEN = False?

(We all have weak computers and we couldn't get the code to run on the school servers, so we had to rent a server to run it on, which is why we don't want to re-train.)

(I think I saw un-styled 256 images among the attention-map images in the training output, though I suspect those are actually styled; they certainly look different from the 64/128 images. Maybe this question is way out in left field.)

Thanks!
Dennis

Error when pretraining the DAMSM model

Hello @sidward14, thanks for your work. I followed the same steps for preparing the bird dataset (downloaded and unzipped the file) and used the command to pretrain the DAMSM model for the bird dataset, but I still got this error:

File "Style-AttnGAN\code\datasets.py", line 87, in get_imgs
re_img = transforms.Resize(imsize[i])(img)
IndexError: list index out of range

Do you know how to fix this error? Thanks for your help!
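
For what it's worth, a hedged sketch of how AttnGAN-style datasets typically build their imsize list (assuming the upstream cfg.TREE.BASE_SIZE / cfg.TREE.BRANCH_NUM convention; this repo may differ): the IndexError usually means get_imgs asks for more scales than BRANCH_NUM provides, which points at a DAMSM config whose tree settings don't match what the data loader expects.

# Minimal sketch under the assumption of AttnGAN-style TREE.BASE_SIZE / TREE.BRANCH_NUM settings.
base_size = 299    # typical DAMSM pretraining base size (assumption)
branch_num = 1     # DAMSM pretraining usually uses a single scale (assumption)

imsize = [base_size * (2 ** i) for i in range(branch_num)]
print(imsize)      # [299]; accessing imsize[1] here raises "list index out of range"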
