aimagelab / vkd
PyTorch code for ECCV 2020 paper: "Robust Re-Identification by Multiple Views Knowledge Distillation"
License: MIT License
Hello!
I would like to restructure a dataset to match the DukeMTMC-VideoReID layout used in this project, so that I can just plug it in. However, the structure of Duke used in this project does not seem to match https://github.com/Yu-Wu/DukeMTMC-VideoReID. Could you provide more details on how you structured the Duke dataset before feeding it to the network?
Thank you!
(base) ➜ Downloads unzip distilled.zip
Archive: distilled.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of distilled.zip or
distilled.zip.zip, and cannot find distilled.zip.ZIP, period.
(base) ➜ Downloads
~ 7z x distilled.zip
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,16 CPUs 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz (A0671),ASM,AES-NI)
Scanning the drive for archives:
1 file, 2079948095 bytes (1984 MiB)
Extracting archive: distilled.zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
--
Path = distilled.zip
Type = zip
ERRORS:
Headers Error
Unconfirmed start of archive
WARNINGS:
There are data after the end of archive
Physical Size = 91999574
Tail Size = 1987948521
ERROR: CRC Failed : distilled_public/duke/crossdistill/distill_duke_resnet101_to_resnet34/chk/chk_di_1
Sub items Errors: 1
Archives with Errors: 1
Warnings: 1
Open Errors: 1
Sub items Errors: 1
(base) ➜ Downloads sudo apt-get install fastjar
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
chromium-codecs-ffmpeg-extra fonts-open-sans gir1.2-goa-1.0 libceres1 libfwupdplugin1 libpython2-dev libqrcodegencpp1 librlottie0-1 libxcb-screensaver0 libxxhash0 python3-cached-property python3-docker
python3-dockerpty python3-docopt python3-jsonschema python3-pyrsistent python3-texttable python3-websocket
Use 'sudo apt autoremove' to remove them.
The following NEW packages will be installed:
fastjar
0 upgraded, 1 newly installed, 0 to remove and 295 not upgraded.
Need to get 66,7 kB of archives.
After this operation, 175 kB of additional disk space will be used.
0% [Working]
Get:1 http://ru.archive.ubuntu.com/ubuntu focal/universe amd64 fastjar amd64 2:0.98-6build1 [66,7 kB]
Fetched 66,7 kB in 6s (12,1 kB/s)
Selecting previously unselected package fastjar.
(Reading database ... 445500 files and directories currently installed.)
Preparing to unpack .../fastjar_2%3a0.98-6build1_amd64.deb ...
Unpacking fastjar (2:0.98-6build1) ...
Setting up fastjar (2:0.98-6build1) ...
Processing triggers for install-info (6.7.0.dfsg.2-5) ...
Processing triggers for man-db (2.9.1-1) ...
(base) ➜ Downloads jar xvf distilled.zip
inflated: distilled_public/duke/crossdistill/distill_duke_resnet101_to_mobilenet/chk/chk_di_1
inflated: distilled_public/duke/crossdistill/distill_duke_resnet101_to_mobilenet/params/hparams.json
inflated: distilled_public/duke/crossdistill/distill_duke_resnet101_to_mobilenet/params/params.json
Error inflating file! (-3)
How do I conduct cross-architecture transfer? What is the command, or which code do I need to revise? Thanks.
Thank you for your great work!
In Figure 3, you use multi-shot, multi-camera feature fusion for testing. I have a question about that fusion: in the gallery, do you exclude all the cameras of the same id that participate in the fusion, or just remove one camera?
Hi Authors, Thanks for the nice work and sharing the GitHub code!
I am trying to reproduce the results of Table 7 (Amur Tiger) for the animal re-identification task. The command.txt of the released code has commands for the Image-to-Video and Video-to-Video settings. It would be a great help if you could let me know how to use this repository for the Image-to-Image setting.
Also, in the tools/ subdirectory there are two training files, train_v2v.py and train_distill.py. Which of the two should I use for the Image-to-Image setting? If possible, please also let me know the hyperparameters used for it.
Thanks again!
What is the meaning of p and k in the training command line?
python ./tools/train_distill.py mars ./logs/base_mars_resnet50 --exp_name distill_mars_resnet50 --p 12 --k 4 --step_milestone 150 --num_epochs 500
And what is the relationship between these two parameters and the N and M in your paper?
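For context, p × k sampling is the usual batching scheme for triplet-loss re-id training: each batch holds p identities with k tracklets each, so the batch size is p·k. A stdlib-only sketch of that scheme (the names `pk_batch` and `index_by_id` are hypothetical illustrations, not taken from this repo):

```python
import random

def pk_batch(index_by_id, p, k, rng=random):
    """Draw one p x k batch: p identities, k samples each (batch size p*k)."""
    ids = rng.sample(sorted(index_by_id), p)
    batch = []
    for pid in ids:
        pool = index_by_id[pid]
        # fall back to sampling with replacement if an id has fewer than k items
        picks = rng.sample(pool, k) if len(pool) >= k else rng.choices(pool, k=k)
        batch.extend((pid, item) for item in picks)
    return batch

# with --p 12 --k 4, each batch holds 12 * 4 = 48 tracklets
index_by_id = {pid: [f'trk_{pid}_{i}' for i in range(6)] for pid in range(100)}
batch = pk_batch(index_by_id, p=12, k=4)
```

This guarantees every batch contains hard positives (same id) and hard negatives (different ids) for the triplet loss.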
How do I evaluate I2V using one network, and where is the code?
Hi!
There are very few details about training and testing in your article. Do you have any supplementary materials? If not, could you describe the procedure to me in detail? Thanks!
Yours
How many GPUs are used for training? I used 8 GPUs, and the results are not good.
Hello, I'm confused about the eval part. First, when I use "python ./tools/eval.py mars ./logs/distilled_public/mars/selfdistill/distill_mars_resnet50 --trinet_chk_name chk_di_1", it shows the results as in the table (top1, mAP, ...).
But when I want to evaluate resnet34 and change it to "python ./tools/eval.py mars ./logs/distilled_public/mars/selfdistill/distill_mars_resnet34 --trinet_chk_name chk_di_1",
it shows a size-mismatch error. Can anyone help me with this problem? Thanks a lot!
Here is the error message:
Traceback (most recent call last):
File "/home/kingsman/.local/share/JetBrains/Toolbox/apps/PyCharm-C/ch-0/203.7148.72/plugins/python-ce/helpers/pydev/pydevd.py", line 1477, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/kingsman/.local/share/JetBrains/Toolbox/apps/PyCharm-C/ch-0/203.7148.72/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/kingsman/VKD-master/tools/eval.py", line 219, in <module>
main()
File "/home/kingsman/VKD-master/tools/eval.py", line 210, in main
net.load_state_dict(state_dict)
File "/home/kingsman/anaconda3/envs/yolact/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TriNet:
Missing key(s) in state_dict: "backbone.features_layers.1.0.0.conv3.weight", "backbone.features_layers.1.0.0.bn3.weight", "backbone.features_layers.1.0.0.bn3.bias", "backbone.features_layers.1.0.0.bn3.running_mean", "backbone.features_layers.1.0.0.bn3.running_var", "backbone.features_layers.1.0.0.downsample.0.weight", "backbone.features_layers.1.0.0.downsample.1.weight", "backbone.features_layers.1.0.0.downsample.1.bias", "backbone.features_layers.1.0.0.downsample.1.running_mean", "backbone.features_layers.1.0.0.downsample.1.running_var", "backbone.features_layers.1.0.1.conv3.weight", "backbone.features_layers.1.0.1.bn3.weight", "backbone.features_layers.1.0.1.bn3.bias", "backbone.features_layers.1.0.1.bn3.running_mean", "backbone.features_layers.1.0.1.bn3.running_var", "backbone.features_layers.1.0.2.conv3.weight", "backbone.features_layers.1.0.2.bn3.weight", "backbone.features_layers.1.0.2.bn3.bias", "backbone.features_layers.1.0.2.bn3.running_mean", "backbone.features_layers.1.0.2.bn3.running_var", "backbone.features_layers.2.0.0.conv3.weight", "backbone.features_layers.2.0.0.bn3.weight", "backbone.features_layers.2.0.0.bn3.bias", "backbone.features_layers.2.0.0.bn3.running_mean", "backbone.features_layers.2.0.0.bn3.running_var", "backbone.features_layers.2.0.1.conv3.weight", "backbone.features_layers.2.0.1.bn3.weight", "backbone.features_layers.2.0.1.bn3.bias", "backbone.features_layers.2.0.1.bn3.running_mean", "backbone.features_layers.2.0.1.bn3.running_var", "backbone.features_layers.2.0.2.conv3.weight", "backbone.features_layers.2.0.2.bn3.weight", "backbone.features_layers.2.0.2.bn3.bias", "backbone.features_layers.2.0.2.bn3.running_mean", "backbone.features_layers.2.0.2.bn3.running_var", "backbone.features_layers.2.0.3.conv3.weight", "backbone.features_layers.2.0.3.bn3.weight", "backbone.features_layers.2.0.3.bn3.bias", "backbone.features_layers.2.0.3.bn3.running_mean", "backbone.features_layers.2.0.3.bn3.running_var", 
"backbone.features_layers.3.0.0.conv3.weight", "backbone.features_layers.3.0.0.bn3.weight", "backbone.features_layers.3.0.0.bn3.bias", "backbone.features_layers.3.0.0.bn3.running_mean", "backbone.features_layers.3.0.0.bn3.running_var", "backbone.features_layers.3.0.1.conv3.weight", "backbone.features_layers.3.0.1.bn3.weight", "backbone.features_layers.3.0.1.bn3.bias", "backbone.features_layers.3.0.1.bn3.running_mean", "backbone.features_layers.3.0.1.bn3.running_var", "backbone.features_layers.3.0.2.conv3.weight", "backbone.features_layers.3.0.2.bn3.weight", "backbone.features_layers.3.0.2.bn3.bias", "backbone.features_layers.3.0.2.bn3.running_mean", "backbone.features_layers.3.0.2.bn3.running_var", "backbone.features_layers.3.0.3.conv3.weight", "backbone.features_layers.3.0.3.bn3.weight", "backbone.features_layers.3.0.3.bn3.bias", "backbone.features_layers.3.0.3.bn3.running_mean", "backbone.features_layers.3.0.3.bn3.running_var", "backbone.features_layers.3.0.4.conv3.weight", "backbone.features_layers.3.0.4.bn3.weight", "backbone.features_layers.3.0.4.bn3.bias", "backbone.features_layers.3.0.4.bn3.running_mean", "backbone.features_layers.3.0.4.bn3.running_var", "backbone.features_layers.3.0.5.conv3.weight", "backbone.features_layers.3.0.5.bn3.weight", "backbone.features_layers.3.0.5.bn3.bias", "backbone.features_layers.3.0.5.bn3.running_mean", "backbone.features_layers.3.0.5.bn3.running_var", "backbone.features_layers.4.0.0.conv3.weight", "backbone.features_layers.4.0.0.bn3.weight", "backbone.features_layers.4.0.0.bn3.bias", "backbone.features_layers.4.0.0.bn3.running_mean", "backbone.features_layers.4.0.0.bn3.running_var", "backbone.features_layers.4.0.1.conv3.weight", "backbone.features_layers.4.0.1.bn3.weight", "backbone.features_layers.4.0.1.bn3.bias", "backbone.features_layers.4.0.1.bn3.running_mean", "backbone.features_layers.4.0.1.bn3.running_var", "backbone.features_layers.4.0.2.conv3.weight", "backbone.features_layers.4.0.2.bn3.weight", 
"backbone.features_layers.4.0.2.bn3.bias", "backbone.features_layers.4.0.2.bn3.running_mean", "backbone.features_layers.4.0.2.bn3.running_var".
size mismatch for backbone.features_layers.1.0.0.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
size mismatch for backbone.features_layers.1.0.1.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for backbone.features_layers.1.0.2.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for backbone.features_layers.2.0.0.conv1.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
size mismatch for backbone.features_layers.2.0.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
size mismatch for backbone.features_layers.2.0.0.downsample.1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for backbone.features_layers.2.0.0.downsample.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for backbone.features_layers.2.0.0.downsample.1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for backbone.features_layers.2.0.0.downsample.1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for backbone.features_layers.2.0.1.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for backbone.features_layers.2.0.2.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for backbone.features_layers.2.0.3.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for backbone.features_layers.3.0.0.conv1.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
size mismatch for backbone.features_layers.3.0.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 1]).
size mismatch for backbone.features_layers.3.0.0.downsample.1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for backbone.features_layers.3.0.0.downsample.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for backbone.features_layers.3.0.0.downsample.1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for backbone.features_layers.3.0.0.downsample.1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for backbone.features_layers.3.0.1.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for backbone.features_layers.3.0.2.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for backbone.features_layers.3.0.3.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for backbone.features_layers.3.0.4.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for backbone.features_layers.3.0.5.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for backbone.features_layers.4.0.0.conv1.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]).
size mismatch for backbone.features_layers.4.0.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 1024, 1, 1]).
size mismatch for backbone.features_layers.4.0.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for backbone.features_layers.4.0.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for backbone.features_layers.4.0.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for backbone.features_layers.4.0.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for backbone.features_layers.4.0.1.conv1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2048, 1, 1]).
size mismatch for backbone.features_layers.4.0.2.conv1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2048, 1, 1]).
size mismatch for classifier.bottleneck.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for classifier.bottleneck.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for classifier.bottleneck.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for classifier.bottleneck.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for classifier.classifier.weight: copying a param with shape torch.Size([625, 512]) from checkpoint, the shape in current model is torch.Size([625, 2048]).
python-BaseException
terminate called without an active exception
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
I am confused about the evaluation code:

for idx_iteration in range(args.num_generations):
    print(f'starting generation {idx_iteration+1}')
    print('#' * 100)

    teacher_net = d_trainer(teacher_net, student_net)
    d_trainer.evaluate(teacher_net)

    teacher_net.teacher_mode()
    student_net = deepcopy(teacher_net)
    saver.save_net(student_net, f'chk_di_{idx_iteration + 1}')
    student_net.reinit_layers(args.reinit_l4, args.reinit_l3)
Do you use student network or teacher network for evaluation?
Hi, guys:
First of all, thank you for your outstanding work. However, when training the teacher network, I encountered the following problem:
python train_v2v.py mars --backbone resnet50 --num_train_images 8 --p 4 --k 4 --exp_name base_mars_resnet50 --first_milestone 100 --step_milestone 100
EXP_NAME: base_mars_resnet50
Traceback (most recent call last):
File "train_v2v.py", line 125, in <module>
main()
File "train_v2v.py", line 96, in main
triplet_loss_batch = triplet_loss(embeddings, y)
File "/home/fei/code/VKD/model/loss.py", line 208, in __call__
return super(OnlineTripletLoss, self).__call__(*args, **kwargs)
File "/home/fei/anaconda3/envs/VKD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/fei/code/VKD/model/loss.py", line 146, in forward
negative_mask = same_id_mask ^ 1
RuntimeError: result type Long can't be cast to the desired output type Bool
How can I fix it?
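Not from the repo itself, but a minimal sketch of what trips this error and two drop-in ways around it, assuming `same_id_mask` is a Bool tensor as the traceback suggests: on newer PyTorch versions, XOR-ing a Bool tensor with the integer 1 promotes to Long, which cannot be cast back to the Bool output.

```python
import torch

same_id_mask = torch.tensor([[True, False], [False, True]])

# `same_id_mask ^ 1` XORs Bool with a Long scalar; some recent PyTorch
# versions refuse this with "result type Long can't be cast to the
# desired output type Bool". Inverting the mask directly avoids the cast:
negative_mask = ~same_id_mask

# XOR-ing with a matching bool scalar is equivalent:
assert torch.equal(negative_mask, same_id_mask ^ True)
```

So replacing line 146 of model/loss.py with `negative_mask = ~same_id_mask` (or `same_id_mask ^ True`) should be behavior-preserving.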
Hello,
I am trying to reproduce the results on Google Colab. I followed the instructions, but maybe I have mistaken something:
I have all the needed files on a Google Drive, so the dataset structure looks like this: VKD-master/datasets/mars, containing 3 folders (info, bbox_train, bbox_test).
However, when I start running the training command, I get the following error:
"RuntimeError: stack expects a non-empty TensorList"
I have attached a file with the entire log. The same error also occurred when I tried the pre-trained model.
Any change I make brings me back to this same error. Please guide me towards a solution; any idea is welcome.
Thank you,
Anca
Could you upload the code that draws the heatmaps in your paper? Thanks.
What GPUs were used to train this? I would like to know what is the minimum recommended setup.
I am running this with a GeForce RTX 2080 Ti and, following the instructions, when I run
python ./tools/train_v2v.py mars --backbone resnet50 --num_train_images 8 --p 8 --k 4 --exp_name base_mars_resnet50 --first_milestone 100 --step_milestone 100
I get the following error:
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 10.76 GiB total capacity; 9.69 GiB already allocated; 29.75 MiB free; 197.14 MiB cached)
I have nothing else running and my GPU is 100% idle.
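If I read the flags right (this is my reading of the command, not a measured profile), each forward pass sees roughly p · k tracklets of num_train_images frames each, so lowering --p, --k, or --num_train_images shrinks memory roughly linearly. A quick back-of-the-envelope:

```python
# Frames per forward pass under p x k sampling (illustrative arithmetic only)
p, k, num_train_images = 8, 4, 8
frames = p * k * num_train_images               # 256 frames with the default command
frames_halved = (p // 2) * k * num_train_images  # 128 frames with --p 4
print(frames, frames_halved)
```

Halving --p to 4 may fit an 11 GB card, at the cost of fewer identities per triplet-loss batch.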
Hi~ thanks for your code first. Could you offer a trained model on the DukeMTMC-VideoReID dataset based on ResVKD-50bam?
I'm sorry, I have little knowledge of video re-id.
What form does the input take? Is it a sequential input?