
vlcstorygan's People

Contributors

adymaharana

vlcstorygan's Issues

The error “Unexpected key(s) in state_dict: ‘epoch’, ‘netG_state_dict’, ‘optimizer_state_dict’” when resuming training

I get the error “Unexpected key(s) in state_dict: ‘epoch’, ‘netG_state_dict’, ‘optimizer_state_dict’” when resuming training. (The full error is below, and I have added my trainer_vlc.py code at the bottom.)

Could you let me know how to load the model correctly?

File "/project/6057220/xianzhen/storygan/vlcgan/trainer_vlc.py", line 110, in load_network_stageI
netG.load_state_dict(state_dict)
File "/home/xianzhen/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for StoryMarttGAN:
Missing key(s) in state_dict: "recurrent.weight_ih", "recurrent.weight_hh", "recurrent.bias_ih", "recurrent.bias_hh", "moconn.layer.0.attention.self.query.weight", "moconn.layer.0.attention.self.query.bias", "moconn.layer.0.attention.self.key.weight", "moconn.layer.0.attention.self.key.bias", "moconn.layer.0.attention.self.value.weight", "moconn.layer.0.attention.self.value.bias", "moconn.layer.0.attention.output.dense.weight", "moconn.layer.0.attention.output.dense.bias", "moconn.layer.0.attention.output.LayerNorm.weight", "moconn.layer.0.attention.output.LayerNorm.bias", "moconn.layer.0.memory_initilizer.init_memory_bias", "moconn.layer.0.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.0.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.0.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.0.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.0.memory_updater.memory_update_attention.query.weight", "moconn.layer.0.memory_updater.memory_update_attention.query.bias", "moconn.layer.0.memory_updater.memory_update_attention.key.weight", "moconn.layer.0.memory_updater.memory_update_attention.key.bias", "moconn.layer.0.memory_updater.memory_update_attention.value.weight", "moconn.layer.0.memory_updater.memory_update_attention.value.bias", "moconn.layer.0.memory_updater.mc.weight", "moconn.layer.0.memory_updater.sc.weight", "moconn.layer.0.memory_updater.sc.bias", "moconn.layer.0.memory_updater.mz.weight", "moconn.layer.0.memory_updater.sz.weight", "moconn.layer.0.memory_updater.sz.bias", "moconn.layer.0.memory_augmented_attention.query.weight", "moconn.layer.0.memory_augmented_attention.query.bias", "moconn.layer.0.memory_augmented_attention.key.weight", "moconn.layer.0.memory_augmented_attention.key.bias", "moconn.layer.0.memory_augmented_attention.value.weight", "moconn.layer.0.memory_augmented_attention.value.bias", "moconn.layer.0.hidden_intermediate.dense.weight", "moconn.layer.0.hidden_intermediate.dense.bias", "moconn.layer.0.memory_projection.weight", "moconn.layer.0.memory_projection.bias", "moconn.layer.0.output.dense.weight", "moconn.layer.0.output.dense.bias", "moconn.layer.0.output.LayerNorm.weight", "moconn.layer.0.output.LayerNorm.bias", "moconn.layer.1.attention.self.query.weight", "moconn.layer.1.attention.self.query.bias", "moconn.layer.1.attention.self.key.weight", "moconn.layer.1.attention.self.key.bias", "moconn.layer.1.attention.self.value.weight", "moconn.layer.1.attention.self.value.bias", "moconn.layer.1.attention.output.dense.weight", "moconn.layer.1.attention.output.dense.bias", "moconn.layer.1.attention.output.LayerNorm.weight", "moconn.layer.1.attention.output.LayerNorm.bias", "moconn.layer.1.memory_initilizer.init_memory_bias", "moconn.layer.1.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.1.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.1.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.1.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.1.memory_updater.memory_update_attention.query.weight", "moconn.layer.1.memory_updater.memory_update_attention.query.bias", "moconn.layer.1.memory_updater.memory_update_attention.key.weight", "moconn.layer.1.memory_updater.memory_update_attention.key.bias", "moconn.layer.1.memory_updater.memory_update_attention.value.weight", "moconn.layer.1.memory_updater.memory_update_attention.value.bias", "moconn.layer.1.memory_updater.mc.weight", "moconn.layer.1.memory_updater.sc.weight", "moconn.layer.1.memory_updater.sc.bias", "moconn.layer.1.memory_updater.mz.weight", 
"moconn.layer.1.memory_updater.sz.weight", "moconn.layer.1.memory_updater.sz.bias", "moconn.layer.1.memory_augmented_attention.query.weight", "moconn.layer.1.memory_augmented_attention.query.bias", "moconn.layer.1.memory_augmented_attention.key.weight", "moconn.layer.1.memory_augmented_attention.key.bias", "moconn.layer.1.memory_augmented_attention.value.weight", "moconn.layer.1.memory_augmented_attention.value.bias", "moconn.layer.1.hidden_intermediate.dense.weight", "moconn.layer.1.hidden_intermediate.dense.bias", "moconn.layer.1.memory_projection.weight", "moconn.layer.1.memory_projection.bias", "moconn.layer.1.output.dense.weight", "moconn.layer.1.output.dense.bias", "moconn.layer.1.output.LayerNorm.weight", "moconn.layer.1.output.LayerNorm.bias", "moconn.layer.2.attention.self.query.weight", "moconn.layer.2.attention.self.query.bias", "moconn.layer.2.attention.self.key.weight", "moconn.layer.2.attention.self.key.bias", "moconn.layer.2.attention.self.value.weight", "moconn.layer.2.attention.self.value.bias", "moconn.layer.2.attention.output.dense.weight", "moconn.layer.2.attention.output.dense.bias", "moconn.layer.2.attention.output.LayerNorm.weight", "moconn.layer.2.attention.output.LayerNorm.bias", "moconn.layer.2.memory_initilizer.init_memory_bias", "moconn.layer.2.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.2.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.2.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.2.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.2.memory_updater.memory_update_attention.query.weight", "moconn.layer.2.memory_updater.memory_update_attention.query.bias", "moconn.layer.2.memory_updater.memory_update_attention.key.weight", "moconn.layer.2.memory_updater.memory_update_attention.key.bias", "moconn.layer.2.memory_updater.memory_update_attention.value.weight", "moconn.layer.2.memory_updater.memory_update_attention.value.bias", "moconn.layer.2.memory_updater.mc.weight", "moconn.layer.2.memory_updater.sc.weight", "moconn.layer.2.memory_updater.sc.bias", "moconn.layer.2.memory_updater.mz.weight", "moconn.layer.2.memory_updater.sz.weight", "moconn.layer.2.memory_updater.sz.bias", "moconn.layer.2.memory_augmented_attention.query.weight", "moconn.layer.2.memory_augmented_attention.query.bias", "moconn.layer.2.memory_augmented_attention.key.weight", "moconn.layer.2.memory_augmented_attention.key.bias", "moconn.layer.2.memory_augmented_attention.value.weight", "moconn.layer.2.memory_augmented_attention.value.bias", "moconn.layer.2.hidden_intermediate.dense.weight", "moconn.layer.2.hidden_intermediate.dense.bias", "moconn.layer.2.memory_projection.weight", "moconn.layer.2.memory_projection.bias", "moconn.layer.2.output.dense.weight", "moconn.layer.2.output.dense.bias", "moconn.layer.2.output.LayerNorm.weight", "moconn.layer.2.output.LayerNorm.bias", "moconn.layer.3.attention.self.query.weight", "moconn.layer.3.attention.self.query.bias", "moconn.layer.3.attention.self.key.weight", "moconn.layer.3.attention.self.key.bias", "moconn.layer.3.attention.self.value.weight", "moconn.layer.3.attention.self.value.bias", "moconn.layer.3.attention.output.dense.weight", "moconn.layer.3.attention.output.dense.bias", "moconn.layer.3.attention.output.LayerNorm.weight", "moconn.layer.3.attention.output.LayerNorm.bias", "moconn.layer.3.memory_initilizer.init_memory_bias", "moconn.layer.3.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.3.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.3.memory_initilizer.init_memory_fc.1.weight", 
"moconn.layer.3.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.3.memory_updater.memory_update_attention.query.weight", "moconn.layer.3.memory_updater.memory_update_attention.query.bias", "moconn.layer.3.memory_updater.memory_update_attention.key.weight", "moconn.layer.3.memory_updater.memory_update_attention.key.bias", "moconn.layer.3.memory_updater.memory_update_attention.value.weight", "moconn.layer.3.memory_updater.memory_update_attention.value.bias", "moconn.layer.3.memory_updater.mc.weight", "moconn.layer.3.memory_updater.sc.weight", "moconn.layer.3.memory_updater.sc.bias", "moconn.layer.3.memory_updater.mz.weight", "moconn.layer.3.memory_updater.sz.weight", "moconn.layer.3.memory_updater.sz.bias", "moconn.layer.3.memory_augmented_attention.query.weight", "moconn.layer.3.memory_augmented_attention.query.bias", "moconn.layer.3.memory_augmented_attention.key.weight", "moconn.layer.3.memory_augmented_attention.key.bias", "moconn.layer.3.memory_augmented_attention.value.weight", "moconn.layer.3.memory_augmented_attention.value.bias", "moconn.layer.3.hidden_intermediate.dense.weight", "moconn.layer.3.hidden_intermediate.dense.bias", "moconn.layer.3.memory_projection.weight", "moconn.layer.3.memory_projection.bias", "moconn.layer.3.output.dense.weight", "moconn.layer.3.output.dense.bias", "moconn.layer.3.output.LayerNorm.weight", "moconn.layer.3.output.LayerNorm.bias", "pooler.context_vector", "pooler.fc.0.weight", "pooler.fc.0.bias", "pooler.fc.1.weight", "pooler.fc.1.bias", "pooler.fc.1.running_mean", "pooler.fc.1.running_var", "embeddings.word_embeddings.weight", "embeddings.word_fc.0.weight", "embeddings.word_fc.0.bias", "embeddings.word_fc.2.weight", "embeddings.word_fc.2.bias", "embeddings.word_fc.4.weight", "embeddings.word_fc.4.bias", "embeddings.position_embeddings.pe", "embeddings.LayerNorm.weight", "embeddings.LayerNorm.bias", "tag_embeddings.weight", "map_embed.weight", "map_embed.bias", "ca_net.fc.weight", "ca_net.fc.bias", "fc.0.weight", "fc.1.weight", "fc.1.bias", "fc.1.running_mean", "fc.1.running_var", "filter_net.0.weight", "filter_net.0.bias", "filter_net.1.weight", "filter_net.1.bias", "filter_net.1.running_mean", "filter_net.1.running_var", "image_net.0.weight", "image_net.0.bias", "image_net.1.weight", "image_net.1.bias", "image_net.1.running_mean", "image_net.1.running_var", "mart_fc.0.weight", "mart_fc.0.bias", "mart_fc.1.weight", "mart_fc.1.bias", "mart_fc.1.running_mean", "mart_fc.1.running_var", "upsample1.1.weight", "upsample1.2.weight", "upsample1.2.bias", "upsample1.2.running_mean", "upsample1.2.running_var", "upsample2.1.weight", "upsample2.2.weight", "upsample2.2.bias", "upsample2.2.running_mean", "upsample2.2.running_var", "upsample3.1.weight", "upsample3.2.weight", "upsample3.2.bias", "upsample3.2.running_mean", "upsample3.2.running_var", "next_g.att.conv_context.weight", "next_g.att.conv_sentence_vis.weight", "next_g.att.linear.weight", "next_g.att.linear.bias", "next_g.residual.0.block.0.weight", "next_g.residual.0.block.1.weight", "next_g.residual.0.block.1.bias", "next_g.residual.0.block.1.running_mean", "next_g.residual.0.block.1.running_var", "next_g.residual.0.block.3.weight", "next_g.residual.0.block.4.weight", "next_g.residual.0.block.4.bias", "next_g.residual.0.block.4.running_mean", "next_g.residual.0.block.4.running_var", "next_g.residual.1.block.0.weight", "next_g.residual.1.block.1.weight", "next_g.residual.1.block.1.bias", "next_g.residual.1.block.1.running_mean", "next_g.residual.1.block.1.running_var", 
"next_g.residual.1.block.3.weight", "next_g.residual.1.block.4.weight", "next_g.residual.1.block.4.bias", "next_g.residual.1.block.4.running_mean", "next_g.residual.1.block.4.running_var", "next_g.residual.2.block.0.weight", "next_g.residual.2.block.1.weight", "next_g.residual.2.block.1.bias", "next_g.residual.2.block.1.running_mean", "next_g.residual.2.block.1.running_var", "next_g.residual.2.block.3.weight", "next_g.residual.2.block.4.weight", "next_g.residual.2.block.4.bias", "next_g.residual.2.block.4.running_mean", "next_g.residual.2.block.4.running_var", "next_g.residual.3.block.0.weight", "next_g.residual.3.block.1.weight", "next_g.residual.3.block.1.bias", "next_g.residual.3.block.1.running_mean", "next_g.residual.3.block.1.running_var", "next_g.residual.3.block.3.weight", "next_g.residual.3.block.4.weight", "next_g.residual.3.block.4.bias", "next_g.residual.3.block.4.running_mean", "next_g.residual.3.block.4.running_var", "next_g.upsample.1.weight", "next_g.upsample.2.weight", "next_g.upsample.2.bias", "next_g.upsample.2.running_mean", "next_g.upsample.2.running_var", "next_g.conv.weight", "next_img.0.weight", "next_img_.0.weight", "m_net.0.weight", "m_net.0.bias", "m_net.1.weight", "m_net.1.bias", "m_net.1.running_mean", "m_net.1.running_var", "c_net.0.weight", "c_net.0.bias", "c_net.1.weight", "c_net.1.bias", "c_net.1.running_mean", "c_net.1.running_var".
Unexpected key(s) in state_dict: "epoch", "netG_state_dict", "optimizer_state_dict".
def load_network_stageI(self):
        from .model import StoryGAN, STAGE1_D_IMG, STAGE1_D_STY_V2, StoryMarttGAN

        if self.use_martt:
            netG = StoryMarttGAN(self.cfg, self.video_len)
        else:
            netG = StoryGAN(self.cfg, self.video_len)
        netG.apply(weights_init)
        print(netG)

        if self.cfg.NET_G != '':
            state_dict = \
                torch.load(self.cfg.NET_G,
                           map_location=lambda storage, loc: storage)
            netG.load_state_dict(state_dict)
            print('Load from: ', self.cfg.NET_G)

        if self.use_image_disc:
            if self.cfg.DATASET_NAME == 'youcook2':
                use_categories = False
            else:
                use_categories = True

            netD_im = STAGE1_D_IMG(self.cfg, use_categories=use_categories)
            netD_im.apply(weights_init)
            print(netD_im)

            if self.cfg.NET_D_IM != '':
                state_dict = \
                    torch.load(self.cfg.NET_D_IM,
                               map_location=lambda storage, loc: storage)
                netD_im.load_state_dict(state_dict)
                print('Load from: ', self.cfg.NET_D_IM)
        else:
            netD_im = None

        if self.use_story_disc:
            netD_st = STAGE1_D_STY_V2(self.cfg)
            netD_st.apply(weights_init)
            # for m in netD_st.modules():
            #     print(m.__class__.__name__)
            print(netD_st)

            if self.cfg.NET_D_ST != '':
                state_dict = \
                    torch.load(self.cfg.NET_D_ST,
                               map_location=lambda storage, loc: storage)
                netD_st.load_state_dict(state_dict)
                print('Load from: ', self.cfg.NET_D_ST)
        else:
            netD_st = None
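
The traceback shows that the file at cfg.NET_G is a resume-style checkpoint dict (holding 'epoch', 'netG_state_dict' and 'optimizer_state_dict') rather than a bare state_dict, so load_state_dict is handed the wrapper instead of the weights. A minimal sketch of a fix for the loading branch above (the key names come from the error message itself; the start_epoch handling is an assumption):

    if self.cfg.NET_G != '':
        checkpoint = torch.load(self.cfg.NET_G,
                                map_location=lambda storage, loc: storage)
        if 'netG_state_dict' in checkpoint:
            # Resume-style checkpoint: unwrap the generator weights.
            netG.load_state_dict(checkpoint['netG_state_dict'])
            start_epoch = checkpoint.get('epoch', 0)
        else:
            # Bare state_dict saved directly from the model.
            netG.load_state_dict(checkpoint)
        print('Load from: ', self.cfg.NET_G)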

missing const_tag2idx.json

Hi, very nice work. Thank you for sharing your code.
When I run the code, it reports an error that const_tag2idx.json is missing.
How can I get this file?
Thank you

Question about FID score

Hi, thank you for your great work!

I have a question about how to evaluate the FID score on your generated images.
I tried to reproduce the FID score using your pre-trained weights for DuCo-StoryGAN, but I couldn't reproduce the results shown in Table 1 of your paper.

Could you elaborate on how to reproduce your FID score?

Thanks!
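
For reference, a generic way to compute FID between folders of real and generated frames is the pytorch-fid package; this is only a baseline sketch, and the paper's exact protocol (image size, sample count, Inception feature layer) may differ, so scores are not guaranteed to match Table 1:

    # pip install pytorch-fid
    from pytorch_fid.fid_score import calculate_fid_given_paths

    # The two folder paths are placeholders.
    fid = calculate_fid_given_paths(
        ['real_frames/', 'generated_frames/'],
        batch_size=50,
        device='cuda',
        dims=2048,  # standard InceptionV3 pool3 features
    )
    print('FID:', fid)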

description file

I made some changes to the description file because I need to enter only 5 statements to make one story.
Could you explain how you generated the description_vec, description_attr and description.npy files?

How to train with multiple GPUs?

Hello,

I'd like to train with multiple GPUs, so I changed the GPU id from 0 to 0,1 (GPU_ID: '0,1' in pororo_s1_vlc.yml) and used nn.parallel.data_parallel as well:

    # nn.parallel.data_parallel(netG.sample_videos, st_inputs, self.gpus)
But I got an error: AttributeError: 'DataParallel' object has no attribute 'sample_videos'.

Traceback (most recent call last):
  File "train_vlcgan.py", line 225, in <module>
    PIL.Image.fromarray,
  File "/project/6057220/xianzhen/storygan/vlcgan/trainer_vlc.py", line 353, in train
    lr_st_fake, st_fake, m_mu, m_logvar, c_mu, c_logvar, s_word = netG.sample_videos(*st_inputs)
  File "/home/xianzhen/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1177, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DataParallel' object has no attribute 'sample_videos'
  1. How do I solve this?
  2. Is there any reason you didn't use DistributedDataParallel?
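
On question 1: DataParallel only replicates a module's forward(), so custom methods such as sample_videos are not forwarded; they can be reached via netG.module.sample_videos (which runs on a single GPU) or exposed through a thin wrapper so data_parallel can scatter the inputs. A sketch of the wrapper approach (the wrapper class is hypothetical; the names mirror the snippet above):

    import torch.nn as nn

    class SampleVideosWrapper(nn.Module):
        # Expose sample_videos through forward() so
        # nn.parallel.data_parallel can replicate it across GPUs.
        def __init__(self, netG):
            super().__init__()
            self.netG = netG

        def forward(self, *st_inputs):
            return self.netG.sample_videos(*st_inputs)

    outputs = nn.parallel.data_parallel(
        SampleVideosWrapper(netG), st_inputs, self.gpus)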

Getting a KeyError while training VLCGan

I am getting a KeyError for ":" (the missing key is ':') in the story loader.

File "train_vlcgan.py", line 206, in
algo.train(imageloader, storyloader, testloader, cfg.STAGE)
File "/home/dwivedi7/VLCStoryGan/vlcgan/trainer_vlc.py", line 246, in train
for i, data in tqdm(enumerate(storyloader, 0)):
File "/opt/conda/lib/python3.7/site-packages/tqdm/std.py", line 1195, in iter
for obj in iterable:
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 681, in next
data = self._next_data()
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
raise exception
KeyError: Caught KeyError in DataLoader worker process 0..

Please let me know if this issue can be resolved.
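
Exceptions inside DataLoader workers are re-raised through torch's reraise machinery, which hides where the lookup actually failed. A standard first step is to rebuild the loader with num_workers=0 so the KeyError surfaces at the exact line in the dataset's __getitem__ (the dataset and batch-size variables here are placeholders):

    from torch.utils.data import DataLoader

    # With num_workers=0 the KeyError is raised in the main process,
    # pointing at the failing lookup (likely a token such as ':'
    # missing from a vocabulary dict).
    storyloader = DataLoader(story_dataset, batch_size=batch_size,
                             shuffle=True, num_workers=0)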

mat1 and mat2 shapes cannot be multiplied error.

Hi,

When I run training as specified in the README, I encounter the issue shown in the screenshot below: mat1 and mat2 shapes cannot be multiplied. While debugging, I traced it to algo.train() -> netG.sample_videos -> self.ca_net: the content_input shape is [12 x 1780], while the in/output size of the fc in self.ca_net is [640 x 248].

[screenshot: "mat1 and mat2 shapes cannot be multiplied" traceback]

I'm not sure which dimension is incorrect, but I modified the fc layer dimensions so the code would run to the end. Unfortunately, the same error occurred several more times after that, and since the same class is also used in other functions, I can't keep modifying the layers.

How can I solve this?
Or is there a problem in my steps prior to training (preparing the repository, extracting constituency parses or dense captions), even though no error was reported there?
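
One way to pin down such a mismatch without editing the model is a forward pre-hook on the suspect layer, printing the shape that actually arrives against the layer's expected in_features. A sketch, assuming the layer is an nn.Linear reachable as netG.ca_net.fc (the attribute path follows the description above and is an assumption):

    def report_shape(module, inputs):
        # inputs is the tuple of positional args passed to forward()
        print(module.__class__.__name__,
              'expects in_features =', module.in_features,
              'but received shape', tuple(inputs[0].shape))

    netG.ca_net.fc.register_forward_pre_hook(report_shape)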

Parser error

Hello.

I am getting the error below:

NameError: name 'parser' is not defined

Why is there logits_per_image.t() in the contrastive loss?

Hello, there!

In compute_contrastive_loss(netD, img_features, text_features, gpus), there are two loss values (loss_i and loss_t) on lines 184 and 185, and the function takes their average:

loss = (loss_i + loss_t)/2

Why is it calculated like this? I couldn't figure out the meaning. And what is the difference from the loss below?
loss = loss_fct(logits_per_image, labels)
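
For context, this is the standard symmetric (CLIP-style) contrastive loss: logits_per_image[i][j] is the similarity between image i and text j, so cross-entropy over the rows supervises image-to-text matching, while the transpose logits_per_image.t() (i.e. logits_per_text) supervises text-to-image matching; averaging the two makes the objective symmetric, whereas loss_fct(logits_per_image, labels) alone would only train the image-to-text direction. A minimal sketch (the repo's exact implementation may differ):

    import torch
    import torch.nn.functional as F

    def symmetric_contrastive_loss(logits_per_image):
        # The i-th image matches the i-th text, so targets are the diagonal.
        labels = torch.arange(logits_per_image.size(0),
                              device=logits_per_image.device)
        loss_i = F.cross_entropy(logits_per_image, labels)      # image -> text
        loss_t = F.cross_entropy(logits_per_image.t(), labels)  # text -> image
        return (loss_i + loss_t) / 2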

Loading parser: `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <benepar.integrations.spacy_plugin.BeneparComponent object at 0x0000026D190BAB50> (name: 'None').

I am not able to load the parser when running the parse.py code; I am using benepar_en3 (integrated with spaCy 3.2.0).
Error:
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <benepar.integrations.spacy_plugin.BeneparComponent object at 0x0000026D190BAB50> (name: 'None').

  • If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.

  • If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').

  • If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.
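
Concretely, benepar's README handles this with a version check: under spaCy v3 the pipe is added by its registered factory name plus a config, instead of passing a BeneparComponent instance as in spaCy v2. A sketch of v3-compatible loading (the en_core_web_md pipeline name is an assumption; any English pipeline should work):

    import benepar, spacy

    nlp = spacy.load('en_core_web_md')
    if spacy.__version__.startswith('2'):
        # spaCy v2 style: pass the component object itself
        nlp.add_pipe(benepar.BeneparComponent('benepar_en3'))
    else:
        # spaCy v3 style: pass the factory name and configure the model
        nlp.add_pipe('benepar', config={'model': 'benepar_en3'})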

Loading Parser Problem

Hello.

Thanks for sharing this superb and promising work!

I am facing some problems when loading the parser.
I tried to keep parse.py as it is, but I got the following:

Loading parser
usage: [-h] [--sum] N [N ...]
: error: the following arguments are required: N
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2

But when I changed the parser.add_argument calls from positional arguments to optional ones, I got the following:
"args has no attribute 'dataset'".

My args Namespace is as outlined below:
Namespace(**{'data_dir <path_to_data_directory>': None, 'dataset pororo': 'pororo'})
which seems unworkable: each argument name has its placeholder value fused into it.
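
The usage string "[-h] [--sum] N [N ...]" is the stock example from the Python argparse documentation, which suggests placeholder parser code was left in parse.py, and the Namespace above confirms the option names were pasted together with their values. A sketch of what the argument setup presumably should look like (the exact option names are assumptions inferred from that Namespace):

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--data_dir', type=str, required=True,
                        help='path to the data directory')
    parser.add_argument('--dataset', type=str, default='pororo',
                        help='dataset name')
    args = parser.parse_args()
    print(args.data_dir, args.dataset)

invoked as, e.g., python parse.py --data_dir <path_to_data_directory> --dataset pororo.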

checkpoint

Hello,
I'd like to fine-tune your model. Could you provide the checkpoints for netD_im and netD_st, which are not given?
Thanks for your help!

testing

Can I enter 5 statements directly, instead of the CSV (description) file, and have the model give me the images for those statements?
