svip-lab / LBYLNet
[CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding.
It seems that the BertEncoder is fine-tuned and updated during training, which makes the comparison unfair against "Improving one-stage visual grounding by recursive sub-query construction" and "A fast and accurate one-stage approach to visual grounding".
Hi @piaozhx,
Could you please tell me how to get rid of the error below? I am trying to run training with the command:
CUDA_VISIBLE_DEVICES=0,1 python train.py lbyl_lstm_referit_batch64 --workers 8 --distributed --world_size 1 --dist_url "tcp://127.0.0.1:60006"
but I am getting the following error:
ngpus_per_node 2
/media/disk/user/abhinav/LBYLNet/core/../data/refer/data/referit/corpus.pth
/media/disk/user/abhinav/LBYLNet/core/../data/refer/data/referit/corpus.pth
/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
"num_layers={}".format(dropout, num_layers))
/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
"num_layers={}".format(dropout, num_layers))
Traceback (most recent call last):
File "train.py", line 375, in <module>
mp.spawn(main, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
File "/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/media/disk/user/abhinav/LBYLNet/env/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/media/disk/user/abhinav/LBYLNet/train.py", line 300, in main
model = LBYLNet(system_config, config["db"])
File "/media/disk/user/abhinav/LBYLNet/core/models/net/lbylnet.py", line 59, in __init__
self.context_block = context.get(cfg_sys.context)(self.joint_out_dim, mapdim=self.map_dim)
File "/media/disk/user/abhinav/LBYLNet/core/models/context/module.py", line 120, in __init__
self._init_layers(dim)
File "/media/disk/user/abhinav/LBYLNet/core/models/context/module.py", line 137, in _init_layers
from ._pconv.conv4 import TopLeftPool, TopRightPool, BottomLeftPool, BottomRightPool
File "/media/disk/user/abhinav/LBYLNet/core/models/context/_pconv/conv4.py", line 3, in <module>
import landmarks
ModuleNotFoundError: No module named 'landmarks'
Hi @piaozhx,
While trying to train the model, after some time training stops and I get the following error at the end:
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 32 leaked semaphores to clean up at shutdown
len(cache))
There are some more errors in between as well, but this was the last one. The complete training logs are attached to the issue.
logs_training.txt
Hello, the download link for the "ReferitGame" dataset is no longer available. Could you please share the file some other way? Thank you very much!
Hi, I find that the PyTorch and CUDA versions in the README differ from those in requirements.txt.
Could you specify the exact PyTorch/Python/CUDA versions?
Hi,
The AP I reproduce on unc and unc+ is only 30%, but it is normal on referit and gref. I tried re-downloading the dataset and code, but it still doesn't work.
Experimental environment: CUDA 9.2, PyTorch 1.7, batch_size=32, trained on a single 1080Ti.
Is there anything else I should pay attention to?
Thanks for your great work.
In LBYL/core/models/context/_pconv/conv4.py, line 6, there is "from landmarkconv import _C". How can I get the landmarkconv module?
Thanks
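For context on this and the `ModuleNotFoundError: No module named 'landmarks'` report above: the module appears to be a compiled PyTorch C++/CUDA extension shipped with the repository rather than a package from PyPI. A hypothetical build sketch, assuming the extension's setup.py lives near the `_pconv` sources (the paths below are guesses from the traceback; check the repository's README for the actual location):

```shell
# Hypothetical paths -- verify against the repository layout.
cd core/models/context/_pconv          # directory holding the extension's setup.py (assumption)
python setup.py install --user         # compile and install the C++/CUDA op
python -c "from landmarkconv import _C"  # should now import without error
```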
wget -P ext https://pjreddie.com/media/files/yolov3.weights
It seems that val/test images of the refcoco/+/g datasets are not actually excluded during pretraining. This is unfair, since not removing val/test images during pretraining usually brings about a 2-3 point improvement.
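A minimal sketch of the kind of filtering being asked about (hypothetical data structures, not the repository's pipeline): dropping any pretraining annotation whose image also appears in a val/test split.

```python
def exclude_eval_images(pretrain_items, eval_image_ids):
    """Drop pretraining items whose image also appears in a val/test
    split, to avoid leakage. Items are (image_id, annotation) pairs."""
    banned = set(eval_image_ids)
    return [item for item in pretrain_items if item[0] not in banned]

train = [(1, "a"), (2, "b"), (3, "c")]
kept = exclude_eval_images(train, eval_image_ids=[2])
# kept == [(1, "a"), (3, "c")]
```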
I have a question about "TopLeftPool()", "TopRightPool()", "BottomLeftPool()", and "BottomRightPool()".
The forward() computes the running maximum for dynamic max pooling; I think this step already produces the output, so why is a backward() function also needed? What does it update?
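To illustrate why a custom backward() is needed (a minimal pure-Python 1-D sketch, not the repository's CUDA kernel): forward() only produces the pooled values, and autograd cannot differentiate through an opaque compiled op, so backward() must route each output gradient back to the input position that won the max.

```python
# 1-D analogue of a directional max pool ("left pool"):
# out[i] = max(x[0..i]).

def left_pool_forward(x):
    out, argmax = [], []
    best_val, best_idx = float("-inf"), -1
    for i, v in enumerate(x):
        if v > best_val:
            best_val, best_idx = v, i
        out.append(best_val)
        argmax.append(best_idx)      # remember which input won the max
    return out, argmax

def left_pool_backward(grad_out, argmax, n):
    # Route each output gradient to the input that produced the max;
    # this bookkeeping is exactly what the custom backward() computes.
    grad_in = [0.0] * n
    for g, j in zip(grad_out, argmax):
        grad_in[j] += g
    return grad_in

out, arg = left_pool_forward([1.0, 3.0, 2.0, 5.0])
grad = left_pool_backward([1.0, 1.0, 1.0, 1.0], arg, 4)
# out  == [1.0, 3.0, 3.0, 5.0]
# grad == [1.0, 2.0, 0.0, 1.0]  (x[1] fed two outputs, x[2] none)
```

Without backward(), no gradient would reach the inputs at all, so the layers feeding the pooling op could not be trained.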
Will the training process continue to optimize the parameters of DarkNet and Lstm?
Hi,
I've tried loading a few of your provided model files from https://drive.google.com/drive/folders/1ICLArOUtWAx_W9nfn7uwobdtIkmN_RoA
but can't get them to load, neither with your code (demo and evaluation) nor by calling pickle.load() on the file directly. With your code, the error is:
lib/python3.7/site-packages/torch/serialization.py", line 755, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
The last line is the same when loading with pickle.load() in plain Python, and this happens for every model file I tried. How were these files created? Knowing that might help resolve the issue.
Thanks!
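One hedged observation (not a confirmed diagnosis): "invalid load key, '<'" means the file's first byte is '<', which usually indicates an HTML page (for example a Google Drive quota/confirmation page) was saved in place of the actual checkpoint. A quick stdlib check:

```python
def looks_like_html(path):
    """Return True if the file starts like an HTML document
    instead of a binary checkpoint/pickle."""
    with open(path, "rb") as f:
        head = f.read(64).lstrip()
    return head.startswith(b"<")
```

If this returns True for the downloaded file, re-downloading it with a Drive-aware tool (or via a browser) should produce a loadable checkpoint.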