
Unsupervised Domain Adaptation for Nighttime Aerial Tracking (CVPR2022)

Junjie Ye, Changhong Fu, Guangze Zheng, Danda Pani Paudel, and Guang Chen. Unsupervised Domain Adaptation for Nighttime Aerial Tracking. In CVPR, pages 1-10, 2022.


Overview

UDAT is an unsupervised domain adaptation framework for visual object tracking. This repo contains its Python implementation.

Paper | NAT2021 benchmark

Testing UDAT

1. Preprocessing

Before training, we need to preprocess the unlabelled training data to generate training pairs.

  1. Download the proposed NAT2021-train set

  2. Customize the directory of the train set in lowlight_enhancement.py and enhance the nighttime sequences

    cd preprocessing/
    python lowlight_enhancement.py # enhanced sequences will be saved at '/YOUR/PATH/NAT2021/train/data_seq_enhanced/'

  3. Download the video saliency detection model here and place it at preprocessing/models/checkpoints/.

  4. Predict salient objects and obtain candidate boxes

    python inference.py # candidate boxes will be saved at 'coarse_boxes/' as .npy

  5. Generate pseudo annotations from candidate boxes using dynamic programming (a conceptual sketch of this step follows the list)

    python gen_seq_bboxes.py # pseudo box sequences will be saved at 'pseudo_anno/'

  6. Generate cropped training patches and a JSON file for training

    python par_crop.py
    python gen_json.py
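
One natural way to use dynamic programming in step 5 is to pick one candidate box per frame so that consecutive boxes overlap as much as possible. Below is a minimal Viterbi-style sketch of that idea, assuming each frame comes with at least one candidate [x, y, w, h] box; all names are illustrative, and the actual logic lives in gen_seq_bboxes.py.

    import numpy as np

    def iou(a, b):
        # IoU of two [x, y, w, h] boxes
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2 = min(a[0] + a[2], b[0] + b[2])
        iy2 = min(a[1] + a[3], b[1] + b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = a[2] * a[3] + b[2] * b[3] - inter
        return inter / union if union > 0 else 0.0

    def dp_box_sequence(candidates):
        # candidates: per-frame lists of [x, y, w, h] boxes (>= 1 per frame).
        # score[t][j] is the best cumulative overlap of a path ending at box j
        # of frame t; back[t][j] remembers the best predecessor for backtracking.
        n = len(candidates)
        score = [np.zeros(len(c)) for c in candidates]
        back = [np.zeros(len(c), dtype=int) for c in candidates]
        for t in range(1, n):
            for j, box in enumerate(candidates[t]):
                trans = [score[t - 1][i] + iou(prev, box)
                         for i, prev in enumerate(candidates[t - 1])]
                back[t][j] = int(np.argmax(trans))
                score[t][j] = max(trans)
        # backtrack from the best final box to recover a smooth box sequence
        j = int(np.argmax(score[-1]))
        path = [j]
        for t in range(n - 1, 0, -1):
            j = back[t][j]
            path.append(j)
        path.reverse()
        return [candidates[t][path[t]] for t in range(n)]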

2. Train

Take UDAT-CAR as an example.

  1. Apart from the above target-domain dataset NAT2021, you also need to download and prepare the source-domain datasets VID and GOT-10K.

  2. Download the pre-trained daytime model (SiamCAR/SiamBAN) and place it at UDAT/tools/snapshot.

  3. Start training

    cd UDAT/CAR
    export PYTHONPATH=$PWD
    python tools/train.py
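
For orientation, UDAT optimizes the tracker together with a domain discriminator in an adversarial fashion (see the paper and the training-strategy discussion in the issues below). The following is a heavily simplified, hypothetical sketch of one alternating G/D update; the tracker/disc interfaces are placeholders and do not mirror the actual tools/train.py:

    import torch
    import torch.nn.functional as F

    def train_step(tracker, disc, opt_g, opt_d, src_batch, tgt_batch, lam=0.1):
        # Hypothetical interfaces: tracker(batch) -> (features, tracking_loss),
        # disc(features) -> domain logit. Source = labeled daytime data,
        # target = nighttime data with pseudo labels from preprocessing.

        # --- tracker (generator) step: track well and fool the discriminator
        feat_s, loss_s = tracker(src_batch)
        feat_t, loss_t = tracker(tgt_batch)
        logit_t = disc(feat_t)
        # target features should be classified as source (label 0)
        adv = F.binary_cross_entropy_with_logits(logit_t, torch.zeros_like(logit_t))
        opt_g.zero_grad()
        (loss_s + loss_t + lam * adv).backward()
        opt_g.step()

        # --- discriminator step: separate source (0) from target (1) features
        logit_s = disc(feat_s.detach())
        logit_t = disc(feat_t.detach())
        d_loss = (F.binary_cross_entropy_with_logits(logit_s, torch.zeros_like(logit_s))
                  + F.binary_cross_entropy_with_logits(logit_t, torch.ones_like(logit_t)))
        opt_d.zero_grad()  # also clears any grads leaked into disc during the G step
        d_loss.backward()
        opt_d.step()

A scheme like this yields both a tracker state and a discriminator state, which is consistent with the two checkpoints (checkpoint.pth / d_checkpoint.pth) mentioned in the issues below.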

3. Test

Take UDAT-CAR as an example.

  1. For a quick test, you can download our trained model for UDAT-CAR (or UDAT-BAN) and place it at UDAT/CAR/experiments/udatcar_r50_l234.

  2. Start testing

    python tools/test.py --dataset NAT

4. Eval

  1. Start evaluating

    python tools/eval.py --dataset NAT
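
For reference, tracking benchmarks of this kind are usually evaluated with one-pass evaluation: precision (fraction of frames whose predicted center lies within 20 pixels of the ground truth) and success (area under the IoU success plot). A minimal sketch of these metrics, assuming [x, y, w, h] boxes; it illustrates the metrics only and is not the code in tools/eval.py:

    import numpy as np

    def center_error(pred, gt):
        # Euclidean distance between predicted and ground-truth box centers
        pc = pred[:, :2] + pred[:, 2:] / 2
        gc = gt[:, :2] + gt[:, 2:] / 2
        return np.linalg.norm(pc - gc, axis=1)

    def iou(pred, gt):
        # per-frame IoU between predicted and ground-truth boxes
        x1 = np.maximum(pred[:, 0], gt[:, 0])
        y1 = np.maximum(pred[:, 1], gt[:, 1])
        x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
        y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
        return inter / np.maximum(union, 1e-12)

    def precision(pred, gt, thr=20.0):
        # fraction of frames with center error within `thr` pixels
        return float((center_error(pred, gt) <= thr).mean())

    def success_auc(pred, gt):
        # mean success rate over IoU thresholds 0..1 (AUC of the success plot)
        ious = iou(pred, gt)
        thresholds = np.linspace(0, 1, 21)
        return float(np.mean([(ious > t).mean() for t in thresholds]))

    # toy example
    pred = np.array([[10., 10., 50., 50.], [12., 12., 50., 50.]])
    gt = np.array([[11., 11., 50., 50.], [40., 40., 50., 50.]])
    print(precision(pred, gt), success_auc(pred, gt))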

Demo

Demo video

Reference

@Inproceedings{Ye2022CVPR,
  title={{Unsupervised Domain Adaptation for Nighttime Aerial Tracking}},
  author={Ye, Junjie and Fu, Changhong and Zheng, Guangze and Paudel, Danda Pani and Chen, Guang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022},
  pages={1-10}
}

Acknowledgments

We sincerely thank the contributions of the following repos: SiamCAR, SiamBAN, DCFNet, DCE, and USOT.

Contact

If you have any questions, please contact Junjie Ye at [email protected] or Changhong Fu at [email protected].


udat's Issues

Question about the transformer bridging layer

Hello, I'm very interested in your research. Because the datasets are too large, it is not easy for me to run the code. I would like to know the size of the features from the feature extractor when they are passed into the transformer bridging layer. Does the feature size change after feature alignment?

Difficulty reproducing your results

I tried to reproduce your results but could not achieve satisfactory performance. I am not sure whether the problem lies in my reproduction steps; if possible, please provide detailed reproduction steps.

About the implementation details of Figure 4 in the paper

Hi, I have some questions about this paper.
Figure 4 reports the t-SNE results of the features before and after the bridging layer. I want to know whether the t-SNE input here is the feature from the transformer domain discriminator or the original feature directly output by the bridging layer. If it is the latter, does the feature output by the feature extractor before the bridging layer require training an additional domain discriminator?
Thank you!

About val dataset and the result of two baselines

Hi. I have two questions:

  1. It seems that there is no validation set in NAT, so do you directly use the model saved at the last epoch for testing?
  2. Since the train set of NAT has no labels and the two baselines (SiamCAR & SiamBAN) are supervised methods, I wonder whether the baseline results compared with UDAT come from models trained only on GOT-10K and VID.

About the trained models

Hello, thank you for your work. Training produces two models, checkpoint.pth and d_checkpoint.pth. Which one should be used for testing, and what is the role of d_checkpoint? Thanks!

Why 'DDP' mode does not work in this framework

Hi, I tried to train UDAT with four 2080 Ti GPUs. However, DP mode causes an uneven distribution of GPU memory: the main GPU occupies 11 GB while the remaining three occupy only 6 GB on average. I changed it to DDP mode, but it doesn't work. Do you have any ideas?

Besides, I found several bugs:

  1. In train.py, when calculating the summation of the discriminator outputs, the numpy function cannot operate on tensors that require grad (train.py lines 257-258, 285-286, 297-298):

    D_out_z = np.sum([Disc(F.softmax(_zf_up_t, dim=1)) for _zf_up_t in zf_up_t])/3.0
    D_out_x = np.sum([Disc(F.softmax(_xf_up_t, dim=1)) for _xf_up_t in xf_up_t])/3.0

I think it should be

    D_out_z = torch.stack([Disc(F.softmax(_zf_up_t, dim=1)) for _zf_up_t in zf_up_t]).sum(0) / 3.
    D_out_x = torch.stack([Disc(F.softmax(_xf_up_t, dim=1)) for _xf_up_t in xf_up_t]).sum(0) / 3.

(np.sum converts its inputs to numpy arrays, which fails for tensors that require grad; torch.stack keeps the summation inside the autograd graph.)

  2. In eval.py, line 67 elif 'NAT' in args.dataset: should be elif 'NAT' == args.dataset:. Otherwise, the results of NAT_L would also go into this branch.

About plotting Figure 9

Hello! I would like to ask how Figure 9 in the paper was plotted.

Training Strategy

Hello, I am interested in the training strategy of your paper. Why did you choose an alternating training strategy to optimize G and D? It seems straightforward to train end-to-end, since you use a RevGrad layer.

The code of t-SNE in Figure 4

Dear author, thank you very much for your work.
I wonder if the t-SNE code used in Figure 4 will be made public?
This visualization method is very interesting!
Thanks in advance!

[Image: t-SNE for Unsupervised Domain Adaptation for Nighttime Aerial Tracking]
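
Until the authors publish their script, the usual recipe is scikit-learn's TSNE on flattened features, colored by domain. A minimal sketch with synthetic stand-in features (feats and domain below are placeholders, not the paper's actual features):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    # stand-in data: 200 "daytime" and 200 "nighttime" 256-D feature vectors
    rng = np.random.default_rng(0)
    feats = np.concatenate([rng.normal(0.0, 1.0, (200, 256)),
                            rng.normal(0.5, 1.0, (200, 256))])
    domain = np.concatenate([np.zeros(200), np.ones(200)])

    # embed to 2-D and color the points by domain
    emb = TSNE(n_components=2, perplexity=30, init='pca',
               random_state=0).fit_transform(feats)
    plt.scatter(emb[domain == 0, 0], emb[domain == 0, 1], s=5, label='source (daytime)')
    plt.scatter(emb[domain == 1, 0], emb[domain == 1, 1], s=5, label='target (nighttime)')
    plt.legend()
    plt.savefig('tsne.png')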
