FACT 2022

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

Original codebase from https://github.com/wbw520/scouter

Python files changed in codebase

engine.py (significantly)
train.py
test.py
dataset\ConText.py              (to retrieve the height and width of ImageNet images)
sloter\slot_model.py            (to get the attention map)
sloter\utils\slot_attention.py  (to get the attention map)

New Python files

get_results.py
restruct_imgnet.py   (to restructure the ILSVRC ImageNet dataset)

Python files imported from other papers (https://arxiv.org/abs/1806.07421, https://arxiv.org/abs/1901.09392)

IAUC_DAUC_eval.py       (to get the IAUC and DAUC scores)
IAUC_DAUC_eval_utils.py
infid_sen_utils.py      (to get sensitivity)

If you have any questions about visualization or metrics, feel free to e-mail [email protected]

Requirements

You can install the packages needed for this project by running:

pip install -r requirements.txt

This project uses two datasets: ImageNet and CUB-200. Instructions for downloading them are in the results.ipynb file.

Training

NOTE: Training a model can take up to several hours.

ImageNet

Training on ImageNet with 100 categories requires a GPU with a large amount of memory. We used Google Cloud to train the ImageNet models.

Pretrain the FC ResNest26 backbone for 100 categories
python train.py --dataset ImageNet --model resnest26d --batch_size 70 --epochs 20 \
--num_classes 100 --use_slot false --vis false --channel 2048 --freeze_layers 0 \
--dataset_dir data/imagenet/ILSVRC/Data/CLS-LOC/
Train the positive Scouter for 100 categories with lambda 10
python train.py --dataset ImageNet --model resnest26d --batch_size 70 --epochs 20 \
--num_classes 100 --use_slot true --use_pre false --loss_status 1 --slots_per_class 1 --output_dir lambda_3/ \
--power 2 --num_workers 4 --to_k_layer 3 --lambda_value 10 --vis false --channel 2048 --freeze_layers 0 \
--dataset_dir data/imagenet/ILSVRC/Data/CLS-LOC/
Train the negative Scouter for 100 categories with lambda 10
python train.py --dataset ImageNet --model resnest26d --batch_size 70 --epochs 20 \
--num_classes 100 --use_slot true --use_pre false --loss_status -1 --slots_per_class 1 \
--power 2 --to_k_layer 3 --lambda_value 10 --vis false --channel 2048 --freeze_layers 0 \
--dataset_dir data/imagenet/ILSVRC/Data/CLS-LOC/

You can enable distributed training by launching train.py through torch.distributed.launch and adding --world_size to the training arguments, for example with 4 processes:

python -m torch.distributed.launch --nproc_per_node=4 --use_env train.py --world_size 4
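
For context, when torch.distributed.launch is run with --use_env, each process receives its rank through the environment variables RANK, WORLD_SIZE and LOCAL_RANK rather than through a --local_rank argument. The sketch below shows one way a script could read them. The helper name read_dist_env is hypothetical and is not taken from train.py; how this repository actually consumes these variables may differ.

```python
import os


def read_dist_env(env=None):
    """Read the launcher's environment variables (hypothetical helper).

    Returns a dict with rank, world_size, local_rank and a flag that says
    whether more than one process takes part in training.
    """
    env = os.environ if env is None else env
    if "RANK" in env and "WORLD_SIZE" in env:
        world_size = int(env["WORLD_SIZE"])
        return {
            "rank": int(env["RANK"]),
            "world_size": world_size,
            # LOCAL_RANK is the index of the process on its own machine.
            "local_rank": int(env.get("LOCAL_RANK", 0)),
            "distributed": world_size > 1,
        }
    # No launcher variables found: fall back to a single process.
    return {"rank": 0, "world_size": 1, "local_rank": 0, "distributed": False}
```

Calling read_dist_env() without arguments inspects the current process environment; passing a plain dict makes the function easy to check without starting the launcher.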

CUB-200 Dataset

Pre-training FC ResNest50 backbone (50 categories)
python train.py --dataset CUB200 --model resnest50d --num_workers 0 --batch_size 16 --epochs 150 \
--num_classes 50 --use_slot false --vis false --channel 2048 \
--dataset_dir data/CUB200/CUB_200_2011
Pre-training FC ResNest26 backbone (100 categories)
python train.py --dataset CUB200 --model resnest26d --batch_size 64 --epochs 150 \
--num_classes 100 --use_slot false --vis false --channel 2048 --num_workers 4 \
--dataset_dir data/CUB200/CUB_200_2011
Positive Scouter on CUB-200 (50 categories)
python train.py --dataset CUB200 --model resnest50d --batch_size 16 --epochs 150 \
--num_classes 50 --num_workers 2 --use_slot true --use_pre true --loss_status 1 --slots_per_class 5 \
--power 2 --to_k_layer 3 --lambda_value 10 --vis false --channel 2048 --freeze_layers 2 \
--dataset_dir data/CUB200/CUB_200_2011/
Negative Scouter on CUB-200 (50 categories)
python train.py  --dataset CUB200 --num_workers 2 --model resnest50d --batch_size 16 --epochs 150 \
--num_classes 50 --use_slot true --use_pre true --loss_status -1 --slots_per_class 3 \
--power 2 --to_k_layer 3 --lambda_value 1. --vis false --channel 2048 --freeze_layers 2 \
--dataset_dir data/CUB200/CUB_200_2011

The CUB-200 experiments with other numbers of categories and lambda values were trained with similar commands, adjusting --num_classes and --lambda_value accordingly and, where memory was an issue, reducing --num_workers.
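
As an illustration of how such variations could be generated, here is a minimal sketch that builds the CUB-200 Scouter command for a given number of categories and lambda value. It only reuses flags that appear in the commands above. The function names cub_command and sweep, and the example values passed to sweep, are assumptions for illustration and do not reflect the exact set of experiments that were run.

```python
import shlex


def cub_command(num_classes, lambda_value, loss_status=1,
                slots_per_class=5, num_workers=2):
    """Build one train.py command line (hypothetical helper)."""
    args = [
        "python", "train.py",
        "--dataset", "CUB200",
        "--model", "resnest50d",
        "--batch_size", "16",
        "--epochs", "150",
        "--num_classes", str(num_classes),
        "--num_workers", str(num_workers),
        "--use_slot", "true",
        "--use_pre", "true",
        "--loss_status", str(loss_status),
        "--slots_per_class", str(slots_per_class),
        "--power", "2",
        "--to_k_layer", "3",
        "--lambda_value", str(lambda_value),
        "--vis", "false",
        "--channel", "2048",
        "--freeze_layers", "2",
        "--dataset_dir", "data/CUB200/CUB_200_2011",
    ]
    # Quote every token so the string can be pasted into a shell.
    return " ".join(shlex.quote(a) for a in args)


def sweep(class_counts, lambda_values, **kwargs):
    """Return one command per (num_classes, lambda_value) combination."""
    return [cub_command(n, lam, **kwargs)
            for n in class_counts
            for lam in lambda_values]
```

For example, sweep([25, 50], [1, 10], loss_status=-1, slots_per_class=3) returns four negative-Scouter commands; the values 25, 50, 1 and 10 are placeholders.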

Generate results and visualization

The results (metrics) and visualizations are produced in the results.ipynb file in this repository. If you don't have a GPU, we strongly recommend running this notebook on Google Colab (preferably the Pro version), since calculating the area size, precision, IAUC, DAUC and sensitivity metrics requires a lot of computational power.
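
To give an idea of what IAUC and DAUC measure, the sketch below follows the insertion and deletion idea from https://arxiv.org/abs/1806.07421 on a toy example: pixels are inserted or deleted in order of saliency, the model score is recorded after each step, and the area under the resulting curve is computed. This is not the code in IAUC_DAUC_eval.py. The flat pixel list, the score_fn callable, the constant baseline and all function names are simplifying assumptions.

```python
def normalized_auc(scores):
    """Trapezoidal area under a curve sampled at evenly spaced steps on [0, 1]."""
    if len(scores) < 2:
        raise ValueError("need at least two points")
    total = sum(scores) - scores[0] / 2 - scores[-1] / 2
    return total / (len(scores) - 1)


def _saliency_order(saliency):
    # Indices sorted from most to least salient.
    return sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)


def deletion_curve(pixels, saliency, score_fn, baseline=0.0):
    """Replace pixels by `baseline`, most salient first, scoring after each step."""
    current = list(pixels)
    scores = [score_fn(current)]
    for i in _saliency_order(saliency):
        current[i] = baseline
        scores.append(score_fn(current))
    return scores


def insertion_curve(pixels, saliency, score_fn, baseline=0.0):
    """Start from `baseline` and restore pixels, most salient first."""
    current = [baseline] * len(pixels)
    scores = [score_fn(current)]
    for i in _saliency_order(saliency):
        current[i] = pixels[i]
        scores.append(score_fn(current))
    return scores


def toy_score(image):
    # Stand-in for a model's class probability (assumption for the demo).
    return sum(image) / 10


toy_pixels = [1, 6, 3]
toy_saliency = [1, 6, 3]
dauc = normalized_auc(deletion_curve(toy_pixels, toy_saliency, toy_score))
iauc = normalized_auc(insertion_curve(toy_pixels, toy_saliency, toy_score))
```

A saliency map that ranks the truly important pixels first makes the score rise quickly during insertion (high IAUC) and drop quickly during deletion (low DAUC), which is the case in this toy example.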

Instructions on how to download the datasets and model files are in the notebook file.

Acknowledgements

We would like to thank SurfSara for providing us with their computational resources.
