all-in-one's People

Contributors

fingerrec


all-in-one's Issues

video captioning finetuning datasets and codes

Hello, I could not find the data preparation and fine-tuning code for video captioning; I hope this is just an oversight on my part. Could you point me to where the data preparation for TVC and YouCook2, and the code for fine-tuning, training, and testing the model, are located? Thank you very much.

Video Retrieval MSRVTT train/test split.

Hello!

Could you please tell me which train/test split you used when reporting results in the paper?
I see the jsfusion split hardcoded in AllInOne/datasets/msrvtt.py. Did you use only the jsfusion test set for both the train-7k and train-9k settings?

Also note that when reporting here, you should use the 'full' split.

LSMDC Fib

Hi, is there a dataset for the LSMDC fill-in-the-blank task?

Drive access

Hi! Could you please provide access to the Google Drive folder with the dataset annotations and model weights?

downstream task checkpoint

Thank you very much for sharing this excellent code. Could you also share your downstream-task checkpoints?

install conflict when running pip install -r requirements.txt

I followed the steps stated in README.md.

conda create -n allinone python=3.7
source activate allinone
cd [Path_To_This_Code]
pip install -r requirements.txt

However, when running pip install -r requirements.txt, the following error occurs.

$ pip install -r requirements.txt
Collecting absl-py==0.13.0
Downloading absl_py-0.13.0-py3-none-any.whl (132 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 132.1/132.1 kB 5.2 MB/s eta 0:00:00
Collecting addict==2.4.0
Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
Collecting aiohttp==3.8.1
Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 11.9 MB/s eta 0:00:00
Collecting aiosignal==1.2.0
Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)
ERROR: Could not find a version that satisfies the requirement apex==0.1 (from versions: 0.9.8dev.linux-i686, 0.9.8.dev0, 0.9.8a0.dev0, 0.9.9.dev0, 0.9.10.dev0)
ERROR: No matching distribution found for apex==0.1

Can I modify the requirements.txt file?
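For what it's worth, the apex==0.1 pin almost certainly refers to NVIDIA Apex, which is not distributed on PyPI under that name (the "apex" package pip finds there is an unrelated project). A sketch of a workaround, assuming the pin can simply be dropped from requirements.txt and Apex built from source:

```shell
# Filter the apex pin out of requirements.txt (the PyPI "apex" is unrelated
# to NVIDIA Apex), then install everything else as usual.
grep -v '^apex==' requirements.txt > requirements.noapex.txt
pip install -r requirements.noapex.txt

# Build NVIDIA Apex from its GitHub source instead; a Python-only build
# (no CUDA extensions) is often sufficient for mixed-precision training.
git clone https://github.com/NVIDIA/apex
cd apex && pip install -v --no-cache-dir ./
```

Whether the repo actually needs the CUDA extensions is something to verify against its training scripts.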

Fine-tuning TGIF-QA FrameQA

The TGIF dataset folder passes its md5 checks, but I still get an error.

$ python run.py with data_root=DATAROOT num_gpus=1 num_nodes=1 num_frames=3 per_gpu_batchsize=8 task_finetune_tgifqa load_path="pretrained/all-in-one-plus-224.ckpt"

initalize data augmentation for a100 gpus
convert to numpy
Validation sanity check: 0it [00:00, ?it/s]ERROR - AllInOne - Failed after 0:00:05!
Traceback (most recent calls WITHOUT Sacred internals):
File "run.py", line 84, in main
trainer.fit(model, datamodule=dm)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 473, in fit
results = self.accelerator_backend.train()
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_accelerator.py", line 152, in train
results = self.ddp_train(process_idx=self.task_idx, model=model)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_accelerator.py", line 305, in ddp_train
results = self.train_or_test()
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 69, in train_or_test
results = self.trainer.train()
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 495, in train
self.run_sanity_check(self.get_model())
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 693, in run_sanity_check
_, eval_results = self.run_evaluation(test_mode=False, max_batches=self.num_sanity_val_batches)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 596, in run_evaluation
for batch_idx, batch in enumerate(dataloader):
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 235, in __getitem__
return self.datasets[dataset_idx][sample_idx]
File "/myhome/all-in-one/AllInOne/datasets/tgif.py", line 87, in __getitem__
image_tensor = self.get_video(sample)
File "/myhome/all-in-one/AllInOne/datasets/video_base_dataset.py", line 107, in get_video
imgs = self.get_raw_video(sample).permute(1, 0, 2, 3) # to cthw
File "/myhome/all-in-one/AllInOne/datasets/tgif.py", line 55, in get_raw_video
imgs, idxs, vlen = read_frames_gif(abs_fp, self.num_frames, mode=self.split)
File "/myhome/all-in-one/AllInOne/datasets/video_base_dataset.py", line 292, in read_frames_gif
gif = imageio.get_reader(video_path)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/core/functions.py", line 186, in get_reader
return format.get_reader(request)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/core/format.py", line 170, in get_reader
return self.Reader(self, request)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/core/format.py", line 221, in __init__
self._open(**self.request.kwargs.copy())
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/plugins/pillowmulti.py", line 60, in _open
return PillowFormat.Reader._open(self)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/plugins/pillow.py", line 138, in _open
as_gray=as_gray, is_gray=_palette_is_grayscale(self._im)
File "/myhome/.conda/envs/allinone/lib/python3.7/site-packages/imageio/plugins/pillow.py", line 689, in _palette_is_grayscale
palette = np.asarray(pil_image.getpalette()).reshape((256, 3))
ValueError: cannot reshape array of size 96 into shape (256,3)
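The reshape failure suggests the GIF carries a palette with fewer than 256 entries (96 values = 32 RGB triples), which this imageio version's Pillow plugin cannot reshape to (256, 3). One possible workaround, sketched below, is to pad each offending GIF's palette to a full 256 entries before the loader touches it; pad_palette is an illustrative helper, not part of the repo.

```python
# Hedged workaround sketch: pad a P-mode image's palette out to 256 RGB
# entries (768 values) so imageio's reshape((256, 3)) no longer fails.
import numpy as np
from PIL import Image


def pad_palette(im: Image.Image) -> Image.Image:
    """Pad a palette-mode image's palette to 256 RGB entries in place."""
    if im.mode == "P":
        palette = im.getpalette() or []
        palette += [0] * (768 - len(palette))  # 256 entries * 3 channels
        im.putpalette(palette)
    return im
```

Applying this (e.g. by re-saving the problematic GIFs) avoids the error; alternatively, upgrading imageio may help, since newer releases changed the Pillow plugin's palette handling.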

401 Client Error: Unauthorized for url: https://huggingface.co/pretrained/bert-base-uncased/resolve/main/vocab.txt

$ python run.py with data_root=/datasets/msvd/data num_gpus=2 num_nodes=1 num_frames=3 per_gpu_batchsize=16 task_finetune_msvdqa load_path="pretrained/all-in-one-base.ckpt"

WARNING - root - Changed type of config entry "max_steps" from int to NoneType
WARNING - AllInOne - No observers have been added to this run
INFO - AllInOne - Running command 'main'
INFO - AllInOne - Started
Global seed set to 0
INFO - lightning - Global seed set to 0

ERROR - AllInOne - Failed after 0:00:01!
Traceback (most recent calls WITHOUT Sacred internals):
File "run.py", line 15, in main
dm = MTDataModule(_config, dist=True)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 49, in __call__
obj = type.__call__(cls, *args, **kwargs)
File "/home/all-in-one/AllInOne/datamodules/multitask_datamodule.py", line 19, in __init__
self.dm_dicts = {key: _datamodules[key] for key in datamodule_keys}
File "/home/all-in-one/AllInOne/datamodules/multitask_datamodule.py", line 19, in <dictcomp>
self.dm_dicts = {key: _datamodules[key] for key in datamodule_keys}
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 49, in __call__
obj = type.__call__(cls, *args, **kwargs)
File "/home/all-in-one/AllInOne/datamodules/msvdqa_datamodule.py", line 8, in __init__
super().__init__(*args, **kwargs)
File "/home/all-in-one/AllInOne/datamodules/datamodule_base.py", line 57, in __init__
self.tokenizer = get_pretrained_tokenizer(tokenizer)
File "/home/all-in-one/AllInOne/datamodules/datamodule_base.py", line 20, in get_pretrained_tokenizer
from_pretrained, do_lower_case="uncased" in from_pretrained
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1752, in from_pretrained
raise err
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1745, in from_pretrained
use_auth_token=use_auth_token,
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/transformers/file_utils.py", line 1056, in cached_path
local_files_only=local_files_only,
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/transformers/file_utils.py", line 1186, in get_from_cache
r.raise_for_status()
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/requests/models.py", line 953, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/pretrained/bert-base-uncased/resolve/main/vocab.txt


There are two common reasons for a 401 Client Error from the Hugging Face hub:

  1. The repository does not exist.
  2. The repository is private.

The bert-base-uncased model is definitely not private.

How could I resolve this issue?

python == 3.7.13
transformers == 4.2.1

add model to Huggingface

Hi, would you be interested in adding all-in-one to the Hugging Face Hub? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. We can set up an organization or a user account under which all-in-one can be added, similar to GitHub.

Examples from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/akhaliq/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

And here are guides for adding spaces/models/datasets to your org:

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

How to test the model?

Hi! I want to test video retrieval with all-in-one-base.ckpt on MSR-VTT and see the metrics to compare with the paper. Can you please help with the command?
I tried the following command, but it started training the model, and I only need testing.
python run.py with data_root=data/ num_gpus=2 num_nodes=1 per_gpu_batchsize=32 task_finetune_only_ind_itc_msrvtt_randaug num_frames=3 load_path="pretrained/all-in-one-base.ckpt"
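If the codebase follows the ViLT-style Sacred configuration it appears to build on, appending a test_only=True override should skip training and run only the evaluation loop. This flag is an assumption to verify against the repo's config.py:

```shell
# test_only=True is an assumed ViLT-style config flag; confirm it exists
# in this repo's config.py before relying on it.
python run.py with data_root=data/ num_gpus=2 num_nodes=1 per_gpu_batchsize=32 \
    task_finetune_only_ind_itc_msrvtt_randaug num_frames=3 \
    load_path="pretrained/all-in-one-base.ckpt" test_only=True
```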

Annotation (Caption file) missing for HowTo100M dataset

Dear author,

Thank you for your great effort, especially the very neat and organized data-loading code! It helps our research a lot. In the meantime, I'm wondering whether you plan to release the processed caption file for the HowTo100M dataset, as it doesn't seem to be on the Google Drive. I understand you must have used the officially provided caption file, but I wanted to check with you first, since your processed file might fit the data loader more easily.

Appreciate your help in advance.

Regards

video captioning checkpoint

Thanks for the awesome work!
I'm interested in video captioning; could you share the captioning checkpoint?
Thanks a lot!

meta_data folder is missing

Error

/all-in-one$ python run.py with data_root=/datasets/msvd/data num_gpus=2 num_nodes=1 num_frames=3 per_gpu_batchsize=16 task_finetune_msvdqa load_path="pretrained/all-in-one-base.ckpt"

....

video datasets: ['msvd_qa_train']
frames for base dataset is: 3
no arrow available for msvd_qa_train, load from disk
initalize data augmentation for a100 gpus
initalize data augmentation for a100 gpus
convert to numpy
ERROR - AllInOne - Failed after 0:00:11!
Traceback (most recent calls WITHOUT Sacred internals):
File "run.py", line 84, in main
trainer.fit(model, datamodule=dm)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 473, in fit
results = self.accelerator_backend.train()
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_accelerator.py", line 152, in train
results = self.ddp_train(process_idx=self.task_idx, model=model)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_accelerator.py", line 268, in ddp_train
self.trainer.call_setup_hook(model)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 859, in call_setup_hook
self.datamodule.setup(stage_name)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
return fn(*args, **kwargs)
File "/home/all-in-one/AllInOne/datamodules/multitask_datamodule.py", line 34, in setup
dm.setup(stage)
File "/home/miniconda3/envs/allinone/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
return fn(*args, **kwargs)
File "/home/all-in-one/AllInOne/datamodules/msvdqa_datamodule.py", line 19, in setup
super().setup(stage)
File "/home/all-in-one/AllInOne/datamodules/datamodule_base.py", line 157, in setup
self.set_train_dataset()
File "/home/all-in-one/AllInOne/datamodules/datamodule_base.py", line 92, in set_train_dataset
backend=self.backend
File "/home/all-in-one/AllInOne/datasets/msvdqa.py", line 28, in __init__
self._load_metadata()
File "/home/all-in-one/AllInOne/datasets/msvdqa.py", line 41, in _load_metadata
with open(os.path.join(metadata_dir, 'msvd_youtube_mapping.txt')) as f:
FileNotFoundError: [Errno 2] No such file or directory: './meta_data/msvd/msvd_youtube_mapping.txt'

Hi.

I think the meta_data folder has not been uploaded. Looking through the code, there is no step that writes or generates the meta_data files, so I assume they are meant to be included in the repository, as DemoVLP does.

DemoVLP has a meta_data folder: https://github.com/showlab/DemoVLP/tree/master/meta_data

zero-shot on custom dataset

Hi,
I want to know whether we can evaluate this model on our custom dataset without fine-tuning. Could you show me how to run inference from the pre-trained checkpoint? My tasks are VideoQA and video-text retrieval, and the dataset format is quite similar to MSRVTT. Thanks a lot!

config file set

If the all-in-one-tiny checkpoint is used, how should the config file be updated to fit its parameter size?

Details of fine-tuning on MSRVTT-QA

Hi,

I am wondering about some of your experimental settings for MSRVTT-QA. Could you please clarify them?

  1. What is the image resolution, 224x224?

  2. How do you handle open-ended VQA like MSRVTT-QA? The paper only mentions converting it to a classification task. Did you choose the top-k answers, and if so, what is k?

Thanks!
