
SymNet

As a part of the HAKE project (HAKE-Object).

News: (2022.12.19) HAKE 2.0 is accepted by TPAMI!

(2022.12.7) We release a new project OCL (paper). Data and code are coming soon.

(2022.11.19) We release the interactive object bounding boxes & classes for the interactions within the AVA dataset (2.1 & 2.2)! HAKE-AVA, [Paper].

(2022.03.28) We release the code for the multiple attribute recognition mentioned in the TPAMI version.

(2022.02.14) We release the human body part state labels based on AVA: HAKE-AVA.

(2021.10.06) Our extended version of SymNet is accepted by TPAMI! Paper and code are coming soon.

(2021.2.7) Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Description], Full demo: [YouTube], [bilibili]

(2020.6.16) Our larger version HAKE-Large (>120K images, activity and part state labels) is released!

This is the code accompanying our CVPR'20 and TPAMI'21 papers: Symmetry and Group in Attribute-Object Compositions, and Learning Single/Multi-Attribute of Object with Symmetry and Group.

Overview

If you find this repository useful, please consider citing our papers.

---SymNet-PAMI
@article{li2021learning,
  title={Learning Single/Multi-Attribute of Object with Symmetry and Group},
  author={Li, Yong-Lu and Xu, Yue and Xu, Xinyu and Mao, Xiaohan and Lu, Cewu},
  journal={TPAMI},
  year={2021}
}

---SymNet-CVPR
@inproceedings{li2020symmetry,
  title={Symmetry and Group in Attribute-Object Compositions},
  author={Li, Yong-Lu and Xu, Yue and Mao, Xiaohan and Lu, Cewu},
  booktitle={CVPR},
  year={2020}
}

Prerequisites

Packages: Install using pip install -r requirements.txt

Datasets: Download and re-arrange with:

cd data; bash download_data.sh

Features and pretrained models: Features for the compositional ZSL (CZSL) setting [1] will be downloaded together with the datasets. Features for the generalized compositional ZSL (GCZSL) setting [2] can be extracted using:

python utils/dataset/GCZSL_dataset.py [MIT/UT]

For multiple attribute recognition, we re-organize the metadata of the aPY/SUN datasets, together with pre-extracted ResNet-50 features, into four files: {APY/SUN}_{train/test}.pkl. You can download them from Link and put them into the ./data folder.
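Before training, you can sanity-check the downloaded files with a quick inspection script. This is a minimal sketch, not part of the repository; the internal structure of each .pkl is not documented here, so it only prints whatever keys are present:

```python
import pickle

# Hypothetical inspection of the downloaded aPY/SUN metadata files.
# Assumes the four .pkl files have been placed in ./data as instructed.
for name in ["APY_train", "APY_test", "SUN_train", "SUN_test"]:
    with open("./data/%s.pkl" % name, "rb") as f:
        data = pickle.load(f)
    print(name, type(data).__name__)
    if isinstance(data, dict):
        print("  keys:", sorted(data.keys()))
```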

Pretrained models and intermediate results can be downloaded from here: Link. Please unzip obj_scores.zip to ./data/obj_scores and weights.zip to ./weights.
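For example, assuming both archives were saved to the repository root (an assumption; adjust the source paths, and check each archive's internal layout in case it already contains a top-level folder):

unzip obj_scores.zip -d ./data/obj_scores
unzip weights.zip -d ./weights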

Compositional Zero-shot Learning (CZSL)

These are commands for the split and evaluation metrics introduced by [1].

Training an object classifier

Before training a SymNet model, train an object classifier by running:

python run_symnet.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1500 --batchnorm --lr 3e-3
python run_symnet.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 300 --batchnorm --lr 1e-3

Then store the intermediate object results:

python test_obj.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1120 --batchnorm
python test_obj.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 140 --batchnorm

The results file will be stored in ./data/obj_scores with names MIT_obj_lr3e-3_ep1120.pkl and UT_obj_lr1e-3_ep140.pkl (in the examples above).

Training a SymNet

To train a SymNet with the hyper-parameters in our paper, run:

python run_symnet.py --name MIT_best --data MIT --epoch 400 --obj_pred MIT_obj_lr3e-3_ep1120.pkl --batchnorm --lr 5e-4 --bz 512 --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 0.03 --lambda_sym 0.05 --lambda_axiom 0.01
python run_symnet.py --name UT_best --data UT --epoch 700 --obj_pred UT_obj_lr1e-3_ep140.pkl --batchnorm  --wordvec onehot  --lr 1e-4 --bz 256 --lambda_cls_attr 1 --lambda_cls_obj 0.5 --lambda_trip 0.5 --lambda_sym 0.01 --lambda_axiom 0.03

Model Evaluation

python test_symnet.py --name MIT_best --data MIT --epoch 320 --obj_pred MIT_obj_lr3e-3_ep1120.pkl --batchnorm
python test_symnet.py --name UT_best --data UT --epoch 600 --obj_pred UT_obj_lr1e-3_ep140.pkl --wordvec onehot --batchnorm

| Method | MIT (top-1) | MIT (top-2) | MIT (top-3) | UT (top-1) | UT (top-2) | UT (top-3) |
|---|---|---|---|---|---|---|
| Visual Product | 9.8/13.9 | 16.1 | 20.6 | 49.9 | / | / |
| LabelEmbed (LE) | 11.2/13.4 | 17.6 | 22.4 | 25.8 | / | / |
| – LEOR | 4.5 | 6.2 | 11.8 | / | / | / |
| – LE + R | 9.3 | 16.3 | 20.8 | / | / | / |
| – LabelEmbed+ | 14.8* | / | / | 37.4 | / | / |
| AnalogousAttr | 1.4 | / | / | 18.3 | / | / |
| Red Wine | 13.1 | 21.2 | 27.6 | 40.3 | / | / |
| AttOperator | 14.2 | 19.6 | 25.1 | 46.2 | 56.6 | 69.2 |
| TAFE-Net | 16.4 | 26.4 | 33.0 | 33.2 | / | / |
| GenModel | 17.8 | / | / | 48.3 | / | / |
| SymNet (Ours) | 19.9 | 28.2 | 33.8 | 52.1 | 67.8 | 76.0 |

Generalized Compositional Zero-shot Learning (GCZSL)

These are commands for the split and evaluation metrics introduced by [2].

Training an object classifier

python run_symnet.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --lr 3e-3 --epoch 1000 --batchnorm --fc_cls 1024

python run_symnet.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --lr 1e-3 --epoch 700 --batchnorm  --fc_cls 1024			

To store the object classification results for both the validation and test sets, run:

python test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --epoch 980 --batchnorm --fc_cls 1024 --test_set val
python test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048  --epoch 980 --batchnorm --fc_cls 1024 --test_set test

python test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm  --fc_cls 1024 --test_set val
python test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm  --fc_cls 1024 --test_set test

Training a SymNet

To train a SymNet for GCZSL, run:

python run_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl --test_set val --lr 3e-4 --bz 512 --test_bz 512 --batchnorm  --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.02 --triplet_margin 0.3

python run_symnet_gczsl.py --data UTg --name UTg_best --epoch 300 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --test_set val --lr 1e-3 --bz 512 --test_bz 512 --wordvec onehot --batchnorm --lambda_cls_attr 1 --lambda_cls_obj 0.01 --fc_compress 512 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.01

Model Evaluation

python test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_test_ep980.pkl --bz 512 --test_bz 512 --batchnorm  --triplet_margin 0.3 --test_set test --topk 1
python test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl --bz 512 --test_bz 512 --batchnorm  --triplet_margin 0.3 --test_set val --topk 1

python test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_test_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set test --topk 1
python test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set val --topk 1

MIT-States evaluation results (with the metrics of TMN [2])

| Model | Val Top-1 AUC | Val Top-2 AUC | Val Top-3 AUC | Test Top-1 AUC | Test Top-2 AUC | Test Top-3 AUC | Seen | Unseen | HM |
|---|---|---|---|---|---|---|---|---|---|
| AttOperator | 2.5 | 6.2 | 10.1 | 1.6 | 4.7 | 7.6 | 14.3 | 17.4 | 9.9 |
| Red Wine | 2.9 | 7.3 | 11.8 | 2.4 | 5.7 | 9.3 | 20.7 | 17.9 | 11.6 |
| LabelEmbed+ | 3.0 | 7.6 | 12.2 | 2.0 | 5.6 | 9.4 | 15.0 | 20.1 | 10.7 |
| GenModel | 3.1 | 6.9 | 10.5 | 2.3 | 5.7 | 8.8 | 24.8 | 13.4 | 11.2 |
| TMN | 3.5 | 8.1 | 12.4 | 2.9 | 7.1 | 11.5 | 20.2 | 20.1 | 13.0 |
| SymNet (CVPR) | 4.3 | 9.8 | 14.8 | 3.0 | 7.6 | 12.3 | 24.4 | 25.2 | 16.1 |
| SymNet (TPAMI) | 5.4 | 11.6 | 16.6 | 4.5 | 10.1 | 15.0 | 26.2 | 26.3 | 16.8 |
| SymNet (Latest Update) | 5.8 | 12.2 | 17.8 | 5.3 | 11.3 | 16.5 | 29.5 | 26.1 | 17.4 |

UT-Zappos evaluation results (with the metrics of CAUSAL [3])

| Model | Unseen | Seen | Harmonic | Closed | AUC |
|---|---|---|---|---|---|
| LabelEmbed | 16.2 | 53.0 | 24.7 | 59.3 | 22.9 |
| AttOperator | 25.5 | 37.9 | 27.9 | 54.0 | 22.1 |
| TMN | 10.3 | 54.3 | 17.4 | 62.0 | 25.4 |
| CAUSAL | 28.0 | 37.0 | 30.6 | 58.6 | 26.4 |
| SymNet (Ours) | 10.3 | 56.3 | 24.1 | 58.7 | 26.8 |

Multiple Attribute Recognition

Training a SymNet

To train a SymNet for multiple attribute recognition, run:

python run_symnet_multi.py --name APY_best --data APY --rmd_metric sigmoid --fc_compress 256 --rep_dim 128  --test_freq 1  --epoch 100 --batchnorm --lr 3e-3 --bz 128 --lambda_cls_attr 1 --lambda_trip 1 --lambda_sym 5e-2 --bce_neg_weight 0.05 --lambda_cls_obj 5e-2 --lambda_axiom 1e-3  --lambda_multi_rmd 5e-2  --lambda_atten 1

python run_symnet_multi.py --name SUN_best --data SUN --rmd_metric rmd --fc_compress 1536 --rep_dim 128 --test_freq 5 --epoch 150 --batchnorm --lr 5e-3 --bz 128  --lambda_cls_attr 1 --lambda_trip 5e-2 --lambda_sym 8e-3 --bce_neg_weight 0.4 --lambda_cls_obj 3e-1 --lambda_axiom 1e-3 --lambda_multi_rmd 6e-2 --lambda_atten 6e-1

Model Evaluation

python test_symnet_multi.py --data APY --name APY_best --epoch 78 --batchnorm --rep_dim 128 --fc_compress 256
python test_symnet_multi.py --data SUN --name SUN_best --epoch 95 --batchnorm --rep_dim 128 --fc_compress 1536

Evaluation results on aPY and SUN (with the mAUC metric)

| Model | aPY | SUN |
|---|---|---|
| ALE | 69.2 | 74.5 |
| HAP | 58.2 | 76.7 |
| UDICA | 82.3 | 85.8 |
| KDICA | 84.7 | / |
| UMF | 79.7 | 80.5 |
| AMT | 84.5 | 82.5 |
| FMT | 70.5 | 75.5 |
| GALM | 84.2 | 86.5 |
| SymNet (Ours) | 86.1 | 88.4 |

Tips

Using a Customized Dataset

Take UT as an example. Besides reorganizing the images into data/ut-zap50k-original/images/[attribute]_[object]/:

  • If you use customized pairs composed of our provided attributes and objects, only the pair lists in data/ut-zap50k-original/compositional-split/ need to be updated.

  • If you also use customized attributes and objects, there are several additional files to modify in the folder utils/aux_data/:

    1. UT_attrs.json and UT_objs.json are the attribute and object lists, stored as dicts. The keys are the original names and the values are the corresponding names in the pre-trained GloVe vocabulary.

    2. glove_UT.py contains the GloVe vectors for the attributes and objects. In our paper, glove.6B.300d.txt is used.

    3. UT_weight.py contains the loss weights for each individual attribute and object class (only attr_weight and obj_weight are used; pair_weight is never used and can be set to 1). In practice, these weights help training on imbalanced data. Each weight is computed as -log(p), where p is the occurrence frequency of the attribute or object in the train set. E.g., if a five-image dataset has attribute labels [a, a, a, b, b], then the attr_weights for a and b are [-log 0.6, -log 0.4]. You may clip the values to prevent overly large or zero weights; see the sketch after this list.
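The -log(p) weighting above is easy to reproduce when building your own UT_weight.py. Below is a minimal sketch, not the repository's actual script; the clipping threshold min_p is an assumption:

```python
import math
from collections import Counter

def compute_class_weights(labels, min_p=1e-3):
    """Return a dict mapping each class to -log(p), where p is the class's
    occurrence frequency in `labels`. Frequencies are clipped from below
    (so rare classes do not get huge weights) and kept strictly below 1
    (so a class appearing in every sample does not get weight exactly 0).
    The clip bounds are assumptions, not values from the paper."""
    counts = Counter(labels)
    total = sum(counts.values())
    weights = {}
    for cls, n in counts.items():
        p = max(min(n / total, 1.0 - 1e-6), min_p)  # clip the frequency
        weights[cls] = -math.log(p)
    return weights

# The five-image example from the text: attribute labels [a, a, a, b, b]
print(compute_class_weights(["a", "a", "a", "b", "b"]))
# -> {'a': 0.5108..., 'b': 0.9163...}, i.e. [-log 0.6, -log 0.4]
```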

Acknowledgement

The dataloader and evaluation code are based on Attributes as Operators[1] and Task-Driven Modular Networks[2].

Reference

[1] Attributes as Operators: Factorizing Unseen Attribute-Object Compositions

[2] Task-Driven Modular Networks for Zero-Shot Compositional Learning

[3] A Causal View of Compositional Zero-Shot Recognition

symnet's People

Contributors

dirtyharrylyl, enlighten0707, foruck, silicx


symnet's Issues

Why train with pre-computed features?

Thanks for the great work and for making your code accessible. I have one question: can you explain why you trained with precomputed features? Why not fix the feature extractor and allow image augmentation during training?

I understand that using precomputed features allows faster training and exploration, but you then lose the benefits that image augmentation provides. I also see that you have new results on MIT-States (GCZSL setting) that are much better than what you reported in your paper (improving from 3.0 Top-1 AUC to 4.5). Did you allow image augmentation to obtain these new results on MIT-States? If not, what was changed to obtain them?

Thank you!

Question about separate object prediction

Thanks for your wonderful work.
As I understand it, at inference time you use a separately and dedicatedly trained object classifier for object prediction, yet the main model already contains an object classification loss, L_cls.
Is there any reason to use a separate classifier's predictions?

Results on Multi-attribute datasets

Hello, and congratulations on your nice codebase. I am trying to replicate your results on aPY and SUN using the arguments given in the README file. However, for aPY I am getting an mAUC of around 83 instead of the 86 reported in the README.

Possible Missing Code

Dear authors,

After following the instructions and successfully running python run_symnet.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 300 --batchnorm --lr 1e-3, I encountered the following error when running python test_obj.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 140 --batchnorm:

```
Traceback (most recent call last):
  File "E:/SymNet-master/test_obj.py", line 140, in <module>
    main()
  File "E:/SymNet-master/test_obj.py", line 41, in main
    sw.trainval_model(sess)
  File "E:/SymNet-master/test_obj.py", line 112, in trainval_model
    pred = self.network.test_step(sess, batch, prob_op, no_postprocess=True)
TypeError: test_step() got an unexpected keyword argument 'no_postprocess'

Process finished with exit code 1
```

I took a look at the test_step() function inside base_network.py, and it seems the no_postprocess parameter should be there but was forgotten. It would be great if the authors could look into this issue. Thanks!

Bias for Prediction of Unseen Classes

I have a question about the experiment settings rather than the code. The thing that eludes me about compositional zero-shot methods is: how do you decide at which bias value to report results? Or are the reported results compared at zero bias?
Thanks in anticipation.

Using a Different Set of Data Pairs

Dear authors,

Thanks a lot for the brilliant repo! As I understand it, the repo is currently limited to the train and test pairs provided in compositional-split, primarily because of the utils/aux_data/UT_weight.py file (using UT as an example). Is it possible to evaluate the model on custom-defined train_pairs.txt and test_pairs.txt? Thanks a lot!
