
labert's People

Contributors

bearcatt


labert's Issues

codes of VLP

Hi!
Thank you for the amazing work on LaBERT! I was wondering whether you would release the code for Length-Controllable VLP. If I try my own implementation, would adding a length-level embedding to the input word embeddings suffice?
By the way, where can I find the weights for length-aware VLP or AoANet?
Thanks!
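
For readers attempting their own implementation, a minimal sketch of the "length-level embedding added to the word embedding" idea asked about above might look like the following; the module names, sizes, and number of length levels are assumptions, not the paper's actual configuration.

    import torch
    import torch.nn as nn

    class LengthAwareEmbedding(nn.Module):
        # Hypothetical sketch: sum word, position, and length-level embeddings.
        # All sizes and the number of length buckets are placeholders.
        def __init__(self, vocab_size, hidden_size, max_positions=128, num_length_levels=4):
            super().__init__()
            self.word_embeddings = nn.Embedding(vocab_size, hidden_size)
            self.position_embeddings = nn.Embedding(max_positions, hidden_size)
            self.length_embeddings = nn.Embedding(num_length_levels, hidden_size)
            self.layer_norm = nn.LayerNorm(hidden_size)

        def forward(self, token_ids, length_level):
            # token_ids: (batch, seq_len); length_level: (batch,) target length bucket index
            positions = torch.arange(token_ids.size(1), device=token_ids.device).unsqueeze(0)
            embeddings = (self.word_embeddings(token_ids)
                          + self.position_embeddings(positions)
                          + self.length_embeddings(length_level).unsqueeze(1))
            return self.layer_norm(embeddings)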

About datasets

Thanks for your work! Can I use my own dataset with this code? Thank you!

id2captions_test.json file

Hi!

The evaluation uses the data/id2captions_test.json and data/caption_results.json files, but I could not find where to obtain them. Could you share these files with us?

Thanks,

can't find the region_bbox.h5

Thanks for your amazing work! I downloaded the dataset from the Baidu Netdisk link you provided, but when I run the code, line 88 of dataset.py cannot find region_bbox.h5. Is it included in the dataset you provide?

Refinement Steps

Hi,
Great work, thanks for sharing the code.

I could not find the number of refinement steps T in the configuration or the code. Does this mean the code is for autoregressive generation only (Table 1 rather than Table 2)? If I am mistaken, could you point me to the parameter that corresponds to T?

Thanks!
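
As background for the question above, a generic mask-predict style refinement loop, where T is the number of refinement steps, might look like the sketch below. The model interface, mask_id, and re-masking schedule are assumptions for illustration, not this repository's actual decoding code.

    import torch

    def refine(model, token_ids, mask_id, T):
        # Generic mask-predict refinement sketch (not this repository's decoder).
        # model(token_ids) is assumed to return logits of shape (batch, seq_len, vocab);
        # token_ids starts fully masked and is re-predicted for T steps.
        seq_len = token_ids.size(1)
        for t in range(T):
            logits = model(token_ids)                  # (batch, seq_len, vocab)
            probs, preds = logits.softmax(-1).max(-1)  # per-position confidence and argmax
            token_ids = preds
            # Re-mask the least confident positions; fewer are re-masked each step.
            num_mask = int(seq_len * (T - 1 - t) / T)
            if num_mask > 0:
                low_conf = probs.argsort(dim=-1)[:, :num_mask]
                token_ids = token_ids.scatter(1, low_conf, mask_id)
        return token_ids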

Your GPU Spec for Training

Thanks for the great work!
What GPU setup did you use for the training in your paper (e.g., GPU model, number of GPUs, and total training hours)?

Inference on raw images

Hi,
Thanks for sharing your interesting work on image captioning. I wanted to run the pretrained model on a few of my own images as a test. Could you confirm whether it is this or this that I need to use to create the bounding boxes and features for my images?
Thanks.

Question about number of region_spatial features

Hi @bearcatt,
Thanks for sharing your code. I ran into one question here about region_spatial: what do the 5th and 6th features represent (region_spatial[:, :, [5, 6]])?
Moreover, the paper describes five localization features (four location coordinates and one relative area; see the implementation details section), but in your code here the last dimension has six features. Can you help me understand this part?
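
For reference, one common way to build six per-region geometry features (normalized corners plus normalized width and height) is sketched below; this is only an assumption for illustration, not a statement of what region_spatial actually contains in this repository.

    import numpy as np

    def box_geometry_features(boxes, img_w, img_h):
        # One possible six-dimensional layout (an assumption, not this repo's):
        # normalized corners [x1/W, y1/H, x2/W, y2/H] plus normalized width and height;
        # a relative-area feature would be the product of the last two columns.
        x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
        w = (x2 - x1) / img_w
        h = (y2 - y1) / img_h
        return np.stack([x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h, w, h], axis=1)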

Question about preparing the COCO data

On the VLP page, do I only need to download the 95 GB and 79 GB files to start training, or do I need to download other data as well?

codes for AoA

Hi~
Your LaBERT code is amazing. I wonder whether you would release the code implementing non-autoregressive, length-controllable image captioning on AoA, as announced in the paper.
Thanks!

Dataset format

Hi!
Thank you for your last reply.

I have a question about the dataset. I downloaded the COCO data (part 1 and part 2) from this link and renamed the files: coco_detection_vg_100dets_gvd_checkpoint_trainval_cls(0to999).h5 -> cls(0to999).h5, coco_detection_vg_100dets_gvd_checkpoint_trainval_feat(0to999).h5 -> feat(0to999).h5, and coco_detection_vg_thresh0.2_feat_gvd_checkpoint_trainvaltest.h5 -> region_bbox.h5.
However, an error occurred. (Terminal command: python -m torch.distributed.launch --nproc_per_node=1 --master_port=4396 train.py save_dir result/ samples_per_gpu 1)
(error screenshot attached)

I added print statements to train.py at line 90 and dataset.py at line 86:

train.py

    print(pred_scores, gt_token_ids)
    loss = criterion(pred_scores, gt_token_ids)

dataset.py

    print(name)
    with h5py.File(osp.join(self.root, f'feat{name[-3:]}.h5'), 'r') as features, \
         h5py.File(osp.join(self.root, f'cls{name[-3:]}.h5'), 'r') as classes, \
         h5py.File(osp.join(self.root, f'region_bbox.h5'), 'r') as bboxes:

Is it right to set up the dataset like this? If not, could you explain the dataset guidelines in more detail?
Thank you for providing a good model. :)
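
For anyone reproducing the renaming described in this issue, a minimal sketch is shown below; the directory path and file-name prefix are assumptions taken from the issue text, and region_bbox.h5 (renamed from the trainvaltest file) still has to be handled separately.

    import glob
    import os
    import os.path as osp

    DATA_ROOT = 'data/features'  # placeholder; point this at the downloaded .h5 files

    # Strip the long VLP prefix so the files match the f'feat{...}.h5' / f'cls{...}.h5'
    # names that dataset.py opens. The prefix is copied from the issue text above and
    # is an assumption about how the downloaded files are actually named.
    PREFIX = 'coco_detection_vg_100dets_gvd_checkpoint_trainval_'

    for path in glob.glob(osp.join(DATA_ROOT, PREFIX + '*.h5')):
        os.rename(path, osp.join(DATA_ROOT, osp.basename(path).replace(PREFIX, '')))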

Data Download Problem

Hi, I am interested in your great work and wish to build on it in further experiments, but I have a problem when trying to download the MSCOCO data.
Since the OneDrive link provided in VLP cannot be accessed from China without a VPN, it is difficult for me to prepare the data on my Ubuntu machine.
Do you have any advice on how to solve this problem? Or could you please provide an alternative download link for the MSCOCO data that is accessible from China?
Thank you!

MS COCO dataset download issue from the link provided [VLP github page]

Hey! I am really interested in your work and would love to dig into it more.
I am facing problems downloading the dataset. Can you please guide me on where and how I can get the 123k-image MS COCO dataset with the splits of 113,287 images for training, 5,000 for validation, and 5,000 for offline evaluation?

The link provided for the dataset leads to the VLP GitHub page. The dataset there is a combination of COCO and VQA (95 GB and 72 GB), which I am unable to use. Can you please suggest a way to download only the MS COCO dataset?

Missing key(s) in state_dict

Hi,

Thanks for sharing this excellent work.
I encountered an issue when trying to test the inference step.
I downloaded the pretrained generator.pth and bert.pth from Google Drive.
The error message is attached below.
Could you give me a hint on how to solve the problem?
Thanks in advance.

Traceback (most recent call last):
  File "inference.py", line 140, in <module>
    g_checkpointer.load(config.model_path, True)
  File "/data-2/home/xingzheng.xz/Research/LaBERT/utils/checkpointer.py", line 47, in load
    self.model.load_state_dict(checkpoint.pop("model"))
  File "/home/xingzheng.xz/miniconda3/envs/omninet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generator:
  Missing key(s) in state_dict: "classifier.decoder.bias", "embedding_layer.position_ids".
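
A generic workaround for this class of mismatch (missing keys caused by differing PyTorch or library versions) is to load the state dict non-strictly and inspect what was skipped; the sketch below is not the maintainer's fix, and the checkpoint-path handling is an assumption about the checkpoint layout.

    import torch

    def load_non_strict(model, checkpoint_path):
        # Generic workaround sketch, not the maintainer's fix: tolerate missing or
        # unexpected keys and print them so they can be checked by hand. Whether
        # "classifier.decoder.bias" and "embedding_layer.position_ids" are safe to
        # leave at their default initialization depends on the versions in use.
        checkpoint = torch.load(checkpoint_path, map_location='cpu')
        state_dict = checkpoint.get('model', checkpoint)
        missing, unexpected = model.load_state_dict(state_dict, strict=False)
        print('missing keys:', missing)
        print('unexpected keys:', unexpected)
        return model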
