damo-cv / transreid-ssl

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

License: MIT License

Python 98.81% Shell 1.19%

transreid-ssl's Issues

Can't reproduce MSMT17 performance

According to the README, ViT-B/16+ICS reaches 75.1% mAP on MSMT17, but I only get 71.2% with the MSMT17_V2 dataset. Was ViT-B/16+ICS evaluated on MSMT17_V2?

```
2023-07-05 13:54:03 transreid INFO: Namespace(config_file='configs/msmt17/vit_base_ics_384.yml', opts=['MODEL.DEVICE_ID', "('0')"])
2023-07-05 13:54:03 transreid INFO: Loaded configuration file configs/msmt17/vit_base_ics_384.yml
2023-07-05 13:54:03 transreid INFO:
MODEL:
  PRETRAIN_PATH: '/home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/vit_base_ics_cfs_lup.pth'
  PRETRAIN_HW_RATIO: 2
  METRIC_LOSS_TYPE: 'triplet'
  IF_LABELSMOOTH: 'off'
  IF_WITH_CENTER: 'no'
  NAME: 'transformer'
  NO_MARGIN: True
  DEVICE_ID: ('2')
  TRANSFORMER_TYPE: 'vit_base_patch16_224_TransReID'
  STRIDE_SIZE: [16, 16]
  STEM_CONV: True # False for vanilla ViT-S
  # DIST_TRAIN: True

INPUT:
  SIZE_TRAIN: [384, 128]
  SIZE_TEST: [384, 128]
  PROB: 0.5 # random horizontal flip
  RE_PROB: 0.5 # random erasing
  PADDING: 10
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]

DATASETS:
  NAMES: ('MSMT17_V2')
  ROOT_DIR: ('/home/hpds/Repositories/ml-models/dataset')

DATALOADER:
  SAMPLER: 'softmax_triplet'
  NUM_INSTANCE: 4
  NUM_WORKERS: 8

SOLVER:
  OPTIMIZER_NAME: 'SGD'
  MAX_EPOCHS: 120
  BASE_LR: 0.0004
  WARMUP_EPOCHS: 20
  IMS_PER_BATCH: 64
  WARMUP_METHOD: 'cosine'
  LARGE_FC_LR: False
  CHECKPOINT_PERIOD: 120
  LOG_PERIOD: 20
  EVAL_PERIOD: 120
  WEIGHT_DECAY: 1e-4
  WEIGHT_DECAY_BIAS: 1e-4
  BIAS_LR_FACTOR: 2

TEST:
  EVAL: True
  IMS_PER_BATCH: 256
  RE_RANKING: False
  WEIGHT: '/home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/transformer_120.pth'
  NECK_FEAT: 'before'
  FEAT_NORM: 'yes'

OUTPUT_DIR: '../../log/transreid/msmt17/vit_base_ics_cfs_lup_384'

2023-07-05 13:54:03 transreid INFO: Running with config:
DATALOADER:
  NUM_INSTANCE: 4
  NUM_WORKERS: 8
  REMOVE_TAIL: 0
  SAMPLER: softmax_triplet
DATASETS:
  NAMES: MSMT17_V2
  ROOT_DIR: /home/hpds/Repositories/ml-models/dataset
  ROOT_TRAIN_DIR: ../data
  ROOT_VAL_DIR: ../data
INPUT:
  PADDING: 10
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]
  PROB: 0.5
  RE_PROB: 0.5
  SIZE_TEST: [384, 128]
  SIZE_TRAIN: [384, 128]
MODEL:
  ATT_DROP_RATE: 0.0
  COS_LAYER: False
  DEVICE: cuda
  DEVICE_ID: 0
  DEVIDE_LENGTH: 4
  DIST_TRAIN: False
  DROPOUT_RATE: 0.0
  DROP_OUT: 0.0
  DROP_PATH: 0.1
  FEAT_DIM: 512
  GEM_POOLING: False
  ID_LOSS_TYPE: softmax
  ID_LOSS_WEIGHT: 1.0
  IF_LABELSMOOTH: off
  IF_WITH_CENTER: no
  JPM: False
  LAST_STRIDE: 1
  METRIC_LOSS_TYPE: triplet
  NAME: transformer
  NECK: bnneck
  NO_MARGIN: True
  PRETRAIN_CHOICE: imagenet
  PRETRAIN_HW_RATIO: 2
  PRETRAIN_PATH: /home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/vit_base_ics_cfs_lup.pth
  REDUCE_FEAT_DIM: False
  RE_ARRANGE: True
  SHIFT_NUM: 5
  SHUFFLE_GROUP: 2
  SIE_CAMERA: False
  SIE_COE: 3.0
  SIE_VIEW: False
  STEM_CONV: True
  STRIDE_SIZE: [16, 16]
  TRANSFORMER_TYPE: vit_base_patch16_224_TransReID
  TRIPLET_LOSS_WEIGHT: 1.0
OUTPUT_DIR: ../../log/transreid/msmt17/vit_base_ics_cfs_lup_384
SOLVER:
  BASE_LR: 0.0004
  BIAS_LR_FACTOR: 2
  CENTER_LOSS_WEIGHT: 0.0005
  CENTER_LR: 0.5
  CHECKPOINT_PERIOD: 120
  COSINE_MARGIN: 0.5
  COSINE_SCALE: 30
  EVAL_PERIOD: 120
  GAMMA: 0.1
  IMS_PER_BATCH: 64
  LARGE_FC_LR: False
  LOG_PERIOD: 20
  MARGIN: 0.3
  MAX_EPOCHS: 120
  MOMENTUM: 0.9
  OPTIMIZER_NAME: SGD
  SEED: 1234
  STEPS: (40, 70)
  TRP_L2: False
  WARMUP_EPOCHS: 20
  WARMUP_FACTOR: 0.01
  WARMUP_METHOD: cosine
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
TEST:
  DIST_MAT: dist_mat.npy
  EVAL: True
  FEAT_NORM: yes
  IMS_PER_BATCH: 256
  NECK_FEAT: before
  RE_RANKING: False
  WEIGHT: /home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/transformer_120.pth
MSMT17_V2 /home/hpds/Repositories/ml-models/dataset
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
=> MSMT17 loaded
2023-07-05 13:54:03 transreid.check INFO: Dataset statistics:
2023-07-05 13:54:03 transreid.check INFO: ----------------------------------------
2023-07-05 13:54:03 transreid.check INFO: subset | # ids | # images | # cameras
2023-07-05 13:54:03 transreid.check INFO: ----------------------------------------
2023-07-05 13:54:03 transreid.check INFO: train | 1041 | 32621 | 15
2023-07-05 13:54:03 transreid.check INFO: query | 3060 | 11659 | 15
2023-07-05 13:54:03 transreid.check INFO: gallery | 3060 | 82161 | 15
2023-07-05 13:54:03 transreid.check INFO: ----------------------------------------
using img_triplet sampler
using Transformer_type: vit_base_patch16_224_TransReID as a backbone
using stride: [16, 16], and patch number is num_y24 * num_x8
Resized position embedding from size:torch.Size([1, 129, 768]) to size: torch.Size([1, 193, 768]) with height:24 width: 8
Load 172 / 174 layers.
Loading pretrained ImageNet model......from /home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/vit_base_ics_cfs_lup.pth
===========building transformer===========
Loading pretrained model from /home/hpds/Repositories/ml-models/proto/TransReID-SSL/checkpoint/transformer_120.pth
2023-07-05 13:54:05 transreid.test INFO: Enter inferencing
True
torch.cuda.device_count() 1

The test feature is normalized
=> Computing DistMat with euclidean_distance
/home/hpds/Repositories/ml-models/proto/TransReID-SSL/transreid_pytorch/utils/metrics.py:12: UserWarning: This overload of addmm_ is deprecated:
addmm_(Number beta, Number alpha, Tensor mat1, Tensor mat2)
Consider using one of the following signatures instead:
addmm_(Tensor mat1, Tensor mat2, *, Number beta, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1485.)
dist_mat.addmm_(1, -2, qf, gf.t())
distmat (11659, 82161) <class 'numpy.ndarray'>
2023-07-05 14:04:03 transreid.test INFO: Validation Results
2023-07-05 14:04:03 transreid.test INFO: mAP: 71.2%
2023-07-05 14:04:03 transreid.test INFO: CMC curve, Rank-1 :87.9%
2023-07-05 14:04:03 transreid.test INFO: CMC curve, Rank-5 :93.6%
2023-07-05 14:04:03 transreid.test INFO: CMC curve, Rank-10 :95.1%
```
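
A side note on the deprecation warning in this log: the call `dist_mat.addmm_(1, -2, qf, gf.t())` updates `dist_mat` in place to `1 * dist_mat + (-2) * qf @ gf.t()`. Assuming `dist_mat` was first filled with the sums of squared norms (the usual pattern for this kind of code, which the log does not show), the result is the squared Euclidean distance between each query feature $q_i$ and gallery feature $g_j$:

$$
\lVert q_i - g_j \rVert^2 = \lVert q_i \rVert^2 + \lVert g_j \rVert^2 - 2\, q_i^\top g_j
$$

The warning itself lists the replacement signature, so the keyword form `dist_mat.addmm_(qf, gf.t(), beta=1, alpha=-2)` should be equivalent. A warning of this kind only flags the call style and is unlikely to explain the mAP gap.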

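Question about YouTube video resolution

How did you deal with video resolution when downloading the LUPerson dataset used for pre-training? Or were the low-quality images simply filtered out by CFS? Without a unified extraction script it is really hard to reproduce the results orz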

CFS

May I know which part of the code implements CFS?

RuntimeError: shape '[1, 11, 11, -1]' is invalid for input of size 98304

Hello, I don't know what causes this error. The input image size is 384*125.
Any help would be appreciated!
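
The number in the error message can be unpacked by hand. Assuming the checkpoint is a ViT-B with embedding width 768 (as in the position-embedding log line of the MSMT17 issue on this page), 98304 values correspond to 128 patch tokens. That is not a perfect square, so a square 11 x 11 grid (121 tokens) cannot hold them, while a 2:1 grid of 16 x 8 can, which is what a setting like `PRETRAIN_HW_RATIO: 2` from that config would imply. The helper `grid_from_tokens` is hypothetical and only illustrates the arithmetic; it is not the repository's code.

```python
import math

def grid_from_tokens(num_tokens, hw_ratio=1):
    """Guess the (height, width) patch grid of a pretrained position embedding.

    hw_ratio is the assumed height/width ratio of the pretraining images.
    """
    width = math.isqrt(num_tokens // hw_ratio)
    height = width * hw_ratio
    return height, width

EMBED_DIM = 768          # assumption: ViT-B embedding width
numel = 98304            # taken from the error message
num_tokens = numel // EMBED_DIM

square = grid_from_tokens(num_tokens, hw_ratio=1)
rect = grid_from_tokens(num_tokens, hw_ratio=2)

# A grid is usable only if it accounts for every token.
square_fits = square[0] * square[1] == num_tokens
rect_fits = rect[0] * rect[1] == num_tokens

print(num_tokens, square, square_fits, rect, rect_fits)
```

If this reading is right, checking that the config sets the pretraining height/width ratio to match the checkpoint would be a reasonable first step.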

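When pre-training on LUPerson, what should the dataset's file structure look like?

Thank you very much for your work!
I would like to use your DINO code to pre-train on LUPerson. Your code appears to load the LUPerson data directly as images (main_dino.py line153: dataset = datasets.ImageFolder(args.data_path, transform=transform)). However, the LUPerson dataset I obtained is in .mdb format and cannot be read this way. Do I need to convert the .mdb dataset into .jpg images? If so, how should the converted LUPerson dataset be organized? (From main_dino.py line158: dir_path = os.path.join(args.data_path,'images') it seems there is also an 'images' folder under the LUPerson folder.)

I hope you can help me with this problem. Many thanks!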
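
For reference, torchvision's `ImageFolder` treats every immediate subdirectory of its root as one class and collects the image files beneath it. The sketch below imitates only that discovery step with the standard library on a throwaway directory. The folder name `images` comes from the line quoted above; the file names and the helper `discover_classes` are hypothetical. It does not read LMDB and is not the repository's loader.

```python
import os
import tempfile

IMG_EXTENSIONS = (".jpg", ".jpeg", ".png")

def discover_classes(root):
    """Map each immediate subdirectory of root to its image files,
    roughly the way ImageFolder discovers classes."""
    layout = {}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if not entry.is_dir():
            continue  # files directly under root are ignored
        files = []
        for dirpath, _, filenames in os.walk(entry.path):
            for name in filenames:
                if name.lower().endswith(IMG_EXTENSIONS):
                    full = os.path.join(dirpath, name)
                    files.append(os.path.relpath(full, root))
        layout[entry.name] = sorted(files)
    return layout

# Build a tiny stand-in for <data_path>/images/*.jpg (file names are made up).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "images"))
for name in ("0001.jpg", "0002.jpg"):
    open(os.path.join(root, "images", name), "wb").close()

layout = discover_classes(root)
print(layout)
```

Under this layout the whole dataset would appear as a single class named `images`, which should be harmless for self-supervised pre-training since labels are not used.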

Is this model overfitting when applied to other datasets?

I tried to use this model, but the results are very poor on other, unseen datasets. Is there a specific setting I should use, or is the model overfitting?

speed of training

Hi, when I run
python -W ignore -m torch.distributed.launch --nproc_per_node=8 main_dino.py \
--arch vit_small \
--data_path /my path/LUP \
--output_dir ./log/dino/lup/vit_small_full_lup \
--height 256 --width 128 \
--crop_height 128 --crop_width 64 \
--epochs 100 \

my code gets stuck at line 153 of main_dino.py. Is this caused by loading the LUPerson dataset? It has been running for six hours.
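
One possible explanation, offered as an assumption rather than a diagnosis: another issue on this page quotes line 153 as the `datasets.ImageFolder(...)` call, which walks the whole directory tree before training starts, and that can take a long time for millions of files on slow or networked storage. A minimal sketch for timing such a scan and caching its result follows; `scan_images`, `cached_scan` and the cache file name are hypothetical and not part of the repository.

```python
import json
import os
import tempfile
import time

def scan_images(root, exts=(".jpg", ".jpeg", ".png")):
    """Walk root once and return a sorted list of image paths."""
    paths = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(exts):
                paths.append(os.path.join(dirpath, name))
    paths.sort()
    return paths

def cached_scan(root, cache_file):
    """Reuse a saved file list if present, otherwise scan and save it."""
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return json.load(f)
    start = time.perf_counter()
    paths = scan_images(root)
    print(f"scanned {len(paths)} files in {time.perf_counter() - start:.2f}s")
    with open(cache_file, "w") as f:
        json.dump(paths, f)
    return paths

# Demo on a throwaway directory with three empty files.
demo_root = tempfile.mkdtemp()
for i in range(3):
    open(os.path.join(demo_root, f"{i}.jpg"), "wb").close()
cache_file = os.path.join(tempfile.mkdtemp(), "index.json")

first = cached_scan(demo_root, cache_file)   # scans and writes the cache
second = cached_scan(demo_root, cache_file)  # served from the cache
```

Running the scan on a small subset first gives a rough per-file cost, which helps to judge whether six hours is plausible for the full dataset or whether something else is blocking.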

Performance on Domain Generalization

Thanks for your work on ReID! I have several questions:

I see you've done UDA experiments between Market and MSMT in TransReID-SSL. What if the model is simply trained on person dataset A and then tested on person dataset B? How does it perform?
What does 'patchify stem' mean? More specifically, what is the purpose of the ICS component, and where is it applied?

How to process the LUP dataset

Excellent work! How did you process dets.pkl of the LUP dataset? The video extraction seems to have a resolution problem, and we don't know the FPS. If you have a script for this, could you send it to me? Thanks a lot!

Pre-trained Models

Hi, could you provide the pre-trained MoCo and MoBy model weights for the LUPerson dataset?

Data transforms

TransReID-SSL/transreid_pytorch/configs/market/vit_base_baseline.yml

Line 19 in fc39e88

PIXEL_MEAN: [0.5, 0.5, 0.5]

Hi,

I would like to know why the normalization mean and std are both [0.5, 0.5, 0.5].

Thanks very much!
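
Arithmetically, with mean and std both 0.5, the per-channel normalization (x - mean) / std maps pixel values in [0, 1] onto [-1, 1]. The sketch below assumes the images are first scaled to [0, 1] (as torchvision's `ToTensor` does), which this issue does not state. It illustrates what the setting does, not why the authors chose it.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Per-channel normalization as applied to one pixel value in [0, 1]."""
    return (pixel - mean) / std

low = normalize(0.0)    # darkest value
mid = normalize(0.5)    # mid-gray
high = normalize(1.0)   # brightest value

print(low, mid, high)
```

By contrast, ImageNet-style statistics would centre each channel on the dataset mean rather than on mid-gray.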

I am using RTX 3090 GPU.

Can you please help me check if there is some issue?
Thanks!
