Hi, I would like to reproduce the result of QDTrack+CAL in table 4 of the paper, which

Different components of The TETer on the TAO open set using our TETA metrics (CLOSED)

gaoyuchris commented on June 12, 2024

Different components of The TETer on the TAO open set using our TETA metrics

siyuanliii commented on June 12, 2024

Hi, thanks for the question. Could you provide more details regarding the training? E.g., what is the batch size? How many GPUs did you use? What is the detailed performance of the detection and tracking parts?

gaoyuchris commented on June 12, 2024

@siyliepfl Thanks for your reply. I suspect there is a problem with my first training step, because I get these training results: bbox_AP: 0.0040, bbox_AP50: 0.0070, bbox_AP75: 0.0040.

I just used the repo (https://github.com/SysCV/qdtrack) with the command:

 sh ./tools/dist_train.sh ./configs/tao/qdtrack_frcnn_r101_fpn_24e_lvis_1230_cls.py 8

That means I used 8 GPUs (V100) with a total batch size of 16 (samples_per_gpu=2). The config is as follows:

_base_ = '../_base_/qdtrack_faster_rcnn_r50_fpn.py'
model = dict(
    detector=dict(
        backbone=dict(
            depth=101,
            init_cfg=dict(
                type='Pretrained', checkpoint='torchvision://resnet101')),
        roi_head=dict(bbox_head=dict(num_classes=1230)),
        test_cfg=dict(
            rcnn=dict(
                score_thr=0.0001,
                nms=dict(type='nms', iou_threshold=0.5),
                max_per_img=300))),
    tracker=dict(
        type='TaoTracker',
        init_score_thr=0.0001,
        obj_score_thr=0.0001,
        match_score_thr=0.5,
        memo_frames=10,
        momentum_embed=0.8,
        momentum_obj_score=0.5,
        obj_score_diff_thr=1.0,
        distractor_nms_thr=0.3,
        distractor_score_thr=0.5,
        match_metric='bisoftmax',
        match_with_cosine=True))
# dataset settings
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadMultiImagesFromFile'),
    dict(type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True),
    dict(
        type='SeqResize',
        img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                   (1333, 768), (1333, 800)],
        share_params=True,
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
    dict(type='SeqNormalize', **img_norm_cfg),
    dict(type='SeqPad', size_divisor=32),
    dict(type='SeqDefaultFormatBundle'),
    dict(
        type='SeqCollect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices'],
        ref_prefix='ref'),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='VideoCollect', keys=['img'])
        ])
]
# dataset settings
dataset_type = 'TaoDataset'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        _delete_=True,
        type='ClassBalancedDataset',
        oversample_thr=1e-3,
        dataset=dict(
            type=dataset_type,
            classes='data/lvis/annotations/lvis_classes.txt',
            load_as_video=False,
            ann_file='data/lvis/annotations/lvisv0.5+coco_train.json',
            img_prefix='data/lvis/train2017/',
            key_img_sampler=dict(interval=1),
            ref_img_sampler=dict(num_ref_imgs=1, scope=1, method='uniform'),
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        classes='data/lvis/annotations/lvis_classes.txt',
        ann_file='data/tao/annotations/validation_ours.json',
        img_prefix='data/tao/frames/',
        ref_img_sampler=None,
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        classes='data/lvis/annotations/lvis_classes.txt',
        ann_file='data/tao/annotations/validation_ours.json',
        img_prefix='data/tao/frames/',
        ref_img_sampler=None,
        pipeline=test_pipeline))
# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=1000,
    warmup_ratio=1.0 / 1000,
    step=[16, 22])
# checkpoint saving
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 24
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
evaluation = dict(metric=['bbox', 'track'], start=24, interval=2)
work_dir = './work_dirs/tao/qdtrack_frcnn_r101_fpn_24e_lvis_1230_cls'
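
For anyone comparing their own setup against this config: lr=0.02 is paired here with 8 GPUs and samples_per_gpu=2, i.e. a total batch size of 16. MMDetection-style configs conventionally scale the learning rate linearly with the total batch size. Below is a minimal sketch of that rule. `scaled_lr` is a hypothetical helper written for illustration, not a function from qdtrack or tet, and whether linear scaling is appropriate for a given run is an assumption to verify.

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linearly scale a reference learning rate to a new total batch size.

    Hypothetical helper for illustration only; not part of qdtrack or tet.
    """
    total_batch_size = num_gpus * samples_per_gpu
    return base_lr * total_batch_size / base_batch_size


# Setup from this thread: 8 GPUs x 2 samples per GPU = batch size 16,
# which matches the reference batch size, so the lr stays at 0.02.
print(scaled_lr(0.02, 16, num_gpus=8, samples_per_gpu=2))

# A hypothetical 4-GPU run with the same samples_per_gpu halves the lr.
print(scaled_lr(0.02, 16, num_gpus=4, samples_per_gpu=2))
```

With the values from this thread the rule leaves the learning rate unchanged, so the batch size alone would not explain the near-zero AP.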

gaoyuchris commented on June 12, 2024

I found that the problem is caused by a bug in the category_id setting (https://github.com/SysCV/qdtrack/blob/c5b10472d7bdd3b9ab75255dd10e48e21f48c54f/qdtrack/datasets/tao_dataset.py#L131). Making it consistent with tet reproduces the QDTrack+CAL result in Table 4 of the paper.
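
To illustrate why a category_id inconsistency can push AP close to zero, here is a minimal sketch of a sanity check. It does not reproduce the actual code at the linked line. The category IDs, the `label + 1` conversion, and all function names are hypothetical, chosen only to show how a label-to-ID conversion that disagrees with the annotation mapping mislabels every prediction.

```python
# Hypothetical category IDs as they might appear in an annotation file.
# They are deliberately non-contiguous for illustration.
cat_ids = [3, 7, 12, 40]

# Contiguous training labels 0..N-1 and both directions of the mapping.
cat2label = {cat_id: i for i, cat_id in enumerate(cat_ids)}
label2cat = {i: cat_id for cat_id, i in cat2label.items()}


def to_category_id_offset(label):
    """Naive conversion that assumes category IDs are exactly label + 1."""
    return label + 1


def to_category_id_mapped(label):
    """Conversion that inverts the mapping used to build the labels."""
    return label2cat[label]


def mismatched_labels(convert):
    """Return the labels whose converted ID differs from the annotation ID."""
    return [label for label, cat_id in label2cat.items()
            if convert(label) != cat_id]


# Every label is mapped to the wrong category by the naive conversion.
print(mismatched_labels(to_category_id_offset))
# The inverse mapping round-trips cleanly.
print(mismatched_labels(to_category_id_mapped))
```

Running a round-trip check like this on the real dataset class, before training or evaluation, is one way to confirm that predicted labels are written back with the same IDs the ground truth uses.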
