hnuzhy / ssda-yolo

130.0 1.0 16.0 12.23 MB

Codes for my paper "SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection"

Python 98.67% Shell 1.10% Dockerfile 0.23%

domain-adaptation object-detection knowledge-distillation transfer-learning image-style-transfer yolov5

ssda-yolo's Introduction

Hi, dear visitor! 😃 I'm Huayi Zhou, a PhD student at Shanghai Jiao Tong University.

❤️ My research covers Computer Vision, Pose Estimation, Transfer Learning and Digital Education. See CV_DL_Gather
🚀 I'm exploring practical, deployable applications of advanced AI algorithms in the traditional classroom. See StuArt
⭐ I'm a faithful follower of the YOLO series of algorithms for their simple yet efficient design. See SSDA-YOLO, DirectMHP and BPJDet
👍 Recently I have been focusing on semi-supervised learning (SSL) for its data and label efficiency. See MultiAugs, SemiUHPE

ssda-yolo's People

Forkers

jie311 cv-det lonelyzyp ezkernel yanqiwan wulixianming zilipeng erikvalle wakme helloszs achintski pudongdong maple1024 coisini-tinkle najingligong1111 wenkun2001

ssda-yolo's Issues

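A question about how to set the five datasets in the yaml file

Suppose I want to train a model that detects ships in fog on a custom dataset, and I have datasets a and b. Does train_source_real refer to the labeled real-image ship dataset a; train_source_fake to the labeled, fogged version of the real ship dataset a; train_target_real to the unlabeled, fogged real ship dataset b; and train_target_fake to the unlabeled real ship dataset b? And should test_target_real hold some fogged data used for validation? I hope you can clarify!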
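
The mapping proposed in this question can be written down as a small lookup table. The sketch below only restates the asker's reading and is not a confirmed answer from the author. The five key names come from the yaml file; the `Split` class, the `PROPOSED_SPLITS` table and the helper `splits_needing_labels` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Split:
    dataset: str             # which image collection the split draws from
    fogged: bool             # True if the images are in the foggy style
    labeled: Optional[bool]  # None where the question does not say


# The asker's proposed reading of the five keys (an assumption, not the author's answer).
PROPOSED_SPLITS: Dict[str, Split] = {
    "train_source_real": Split("a", fogged=False, labeled=True),
    "train_source_fake": Split("a", fogged=True, labeled=True),
    "train_target_real": Split("b", fogged=True, labeled=False),
    "train_target_fake": Split("b", fogged=False, labeled=False),
    "test_target_real": Split("validation", fogged=True, labeled=None),
}


def splits_needing_labels(splits: Dict[str, Split]) -> List[str]:
    """Return the keys that the proposed mapping marks as labeled."""
    return sorted(name for name, split in splits.items() if split.labeled)


if __name__ == "__main__":
    for name, split in PROPOSED_SPLITS.items():
        print(f"{name}: dataset={split.dataset} fogged={split.fogged} labeled={split.labeled}")
```

Under this reading only dataset a carries labels during training, while dataset b contributes images alone.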

How to set the crop size when using CUT to generate fake images, 256 * 256?

Hello sir,
When you used the CUT model to generate fake images for the Cityscapes and Foggy Cityscapes datasets, what crop (input) size did you set: 256*256 or 960*960? Could you also give more details about the picture quality of the generated fake images? Thank you very much!

Can you help me with this problem?

When I trained ssda_yolov5, I got the following error:
File "ssda_yolov5_train.py", line 548, in train
loss_items[3] += loss_cons.detach()[0] # (lbox, lobj, lcls, total_loss)
IndexError: index 3 is out of bounds for dimension 0 with size 3

I then printed the shape of 'loss_items': it is a tensor of size 3.
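
The comment on the failing line lists four slots (lbox, lobj, lcls, total_loss) while the printed tensor has only three, so index 3 does not exist. Below is a minimal sketch of that mismatch, using a plain Python list as a stand-in for the tensor. The helper `add_consistency_loss` is hypothetical, and deriving the missing total as the sum of the three components is an assumption for illustration, not the repository's fix.

```python
from typing import List, Sequence


def add_consistency_loss(loss_items: Sequence[float], loss_cons: float) -> List[float]:
    """Add loss_cons to the total slot (index 3), creating that slot if it is absent."""
    items = list(loss_items)
    if len(items) == 3:
        # Only (lbox, lobj, lcls) are present. Assumption: the total is their sum.
        items.append(sum(items))
    if len(items) != 4:
        raise ValueError(f"expected 3 or 4 loss entries, got {len(items)}")
    items[3] += loss_cons
    return items


if __name__ == "__main__":
    three_slots = [0.5, 0.25, 0.125]
    try:
        three_slots[3] += 0.0625  # same failure mode as the traceback
    except IndexError as err:
        print("IndexError:", err)
    print(add_consistency_loss(three_slots, 0.0625))
```

Whether the right fix is to extend the tensor or to change the index depends on which version of the loss function the training script was written against.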

Invalid results when validating teacher model

Hi, I tried testing the teacher model weights on the voc->clipart dataset, but I obtained invalid results. The student model, on the other hand, performs as expected, although its overall performance is worse than in the paper.

I tried printing the model predictions on images, and here is what I got. These numbers are really large, and I can't tell why they appear.
test_target_real: Scanning '..\datasets\clipart\yolov5_format\labels\val.cache' images and labels... 500 found, 0 missing, 0 empty, 0
Class Images Labels P R mAP@.5 mAP@.75 mAP@.5:.95: 0%| | 0/250 [00:00<?, ?it/s]
[tensor([[ -8., -4., 32., -4., 1., 3.], [ -8., -4., 32., -4., 1., 5.], [ -8., -4., 32., -4., 1., 6.], ..., [144., 4., 184., 4., 1., 5.], [144., 4., 184., 4., 1., 6.], [152., 4., 192., 4., 1., 5.]], device='cuda:0'),
 tensor([[ -8., -4., 32., -4., 1., 5.], [ -8., -4., 32., -4., 1., 6.], [ 0., 12., 40., 12., 1., 5.], ..., [184., 4., 224., 4., 1., 6.], [192., 4., 232., 4., 1., 5.], [192., 4., 232., 4., 1., 6.]], device='cuda:0')]
Class Images Labels P R mAP@.5 mAP@.75 mAP@.5:.95: 0%| | 1/250 [00:00<02:48, 1.47it
[tensor([[ -8., 12., 32., 12., 1., 5.], [ -8., 12., 32., 12., 1., 6.], [ 0., 12., 40., 12., 1., 5.], ..., [384., 4., 424., 4., 1., 5.], [384., 4., 424., 4., 1., 6.], [408., 4., 448., 4., 1., 5.]], device='cuda:0'),
 tensor([[ -8.40268, -30.00000, 31.59732, 22.00000, 1.00000, 5.00000], [ -8.40268, -30.00000, 31.59732, 22.00000, 1.00000, 6.00000], [ -8.40268, -30.00000, 31.59732, 22.00000, 1.00000, 14.00000], ..., [536.00000, 4.00000, 576.00000, 4.00000, 1.00000, 6.00000], [544.00000, 4.00000, 584.00000, 4.00000, 1.00000, 5.00000], [544.00000, 4.00000, 584.00000, 4.00000, 1.00000, 6.00000]], device='cuda:0')]
Class Images Labels P R mAP@.5 mAP@.75 mAP@.5:.95: 1%| | 2/250 [00:00<01:32, 2.68it
[tensor([[ -8., -30., 32., 22., 1., 14.], [ 8., -30., 48., 22., 1., 5.], [ 8., -30., 48., 22., 1., 6.], ..., [352., 4., 392., 4., 1., 5.], [352., 4., 392., 4., 1., 6.], [360., 4., 400., 4., 1., 5.]], device='cuda:0'),
 tensor([[ -8., -30., 32., 22., 1., 5.], [ -8., -30., 32., 22., 1., 6.], [ 0., -4., 40., -4., 1., 5.], ..., [184., 4., 224., 4., 1., 5.], [184., 4., 224., 4., 1., 6.], [192., 4., 232., 4., 1., 5.]], device='cuda:0')]
Class Images Labels P R mAP@.5 mAP@.75 mAP@.5:.95: 1%| | 3/250 [00:01<01:16, 3.22it
[tensor([[ -8., -4., 32., -4., 1., 5.], [ -8., -4., 32., -4., 1., 6.], [ 0., 12., 40., 12., 1., 5.], ..., [208., 4., 248., 4., 1., 5.], [208., 4., 248., 4., 1., 6.], [216., 4., 256., 4., 1., 5.]], device='cuda:0'),
 tensor([[ -8., -4., 32., -4., 1., 5.], [ -8., -4., 32., -4., 1., 6.], [ 0., 12., 40., 12., 1., 5.], ..., [232., 4., 272., 4., 1., 5.], [232., 4., 272., 4., 1., 6.], [240., 4., 280., 4., 1., 5.]], device='cuda:0')]

Why is my test mAP only 0.207?

(ssdayolo) liuhaolin@ubuntu18:/sdb/liuhaolin/SSDA-YOLO$ python -m torch.distributed.launch --nproc_per_node 2 ssda_yolov5_train.py --weights weights/yolov5l.pt --data yamls_sda/pascalvoc0712_clipart1k_VOC.yaml --name voc2clipart_ssda_960_yolov5l --img 960 --device 0,1 --batch-size 12 --epochs 200 --lambda_weight 0.005 --consistency_loss --alpha_weight 2.0

Epoch     gpu_mem   box       obj       cls       total     cons      labels   img_size
199/199   22.3G     0.07241   0.07605   0.03346   0.2014    0.01949   30       960: 100%|█████████████████████████████████████████████████| 711/711 [14:08<00:00, 1.19s/it]

Class         Images   Labels   P       R       mAP@.5   mAP@.75   mAP@.5:.95: 100%|████████████████████████████████████████████| 42/42 [00:06<00:00, 6.50it/s]
all           500      1526     0.61    0.338   0.396    0.208     0.207
aeroplane     500      41       0.444   0.195   0.24     0.138     0.123
bicycle       500      16       0.839   0.654   0.69     0.392     0.344
bird          500      124      0.841   0.218   0.275    0.119     0.146
boat          500      74       0.929   0.354   0.48     0.219     0.22
bottle        500      74       0.53    0.649   0.592    0.0899    0.214
bus           500      8        0.473   0.375   0.34     0.288     0.204
car           500      84       0.675   0.297   0.385    0.266     0.234
cat           500      23       0.148   0.087   0.0557   0.024     0.0248
chair         500      163      0.613   0.583   0.573    0.324     0.324
cow           500      21       0.516   0.429   0.461    0.444     0.31
diningtable   500      50       0.546   0.2     0.283    0.111     0.118
dog           500      24       0.288   0.25    0.15     0.0882    0.0844
horse         500      34       0.696   0.118   0.332    0.223     0.184
motorbike     500      10       0.773   0.2     0.338    0.248     0.241
person        500      566      0.798   0.398   0.562    0.21      0.264
pottedplant   500      94       0.667   0.468   0.559    0.357     0.32
sheep         500      33       0.535   0.212   0.221    0.0519    0.088
sofa          500      21       0.639   0.333   0.407    0.313     0.282
train         500      26       0.694   0.176   0.419    0.0956    0.156
tvmonitor     500      40       0.549   0.575   0.553    0.161     0.255

200 epochs completed in 42.124 hours.

Optimizer stripped from runs/train/voc2clipart_ssda_960_yolov5l3/weights/last_student.pt, 94.0MB
Optimizer stripped from runs/train/voc2clipart_ssda_960_yolov5l3/weights/best_student.pt, 94.0MB
Destroying process group... Done.

visual detection example generation

How were the visual detection examples in the paper, which show SSDA-YOLO applied to foggy images, generated?

How to train on my custom dataset?

Hello author, I want to train on my custom dataset. What should I do?

The width and height of the real and fake images are inconsistent

img, _, (h, w) = load_image(self, index)
img_fake, _, (h, w) = load_image(self, index, type="fake")
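
Both quoted lines unpack into the same (h, w) names, so the size returned for the fake image replaces the size of the real one. Below is a minimal sketch of keeping the two sizes apart and failing early on a mismatch. The helper `check_pair_size` and its arguments are hypothetical; whether the data loader should raise or resize the fake image instead is an open choice.

```python
from typing import Tuple

Size = Tuple[int, int]  # (height, width)


def check_pair_size(real_hw: Size, fake_hw: Size, index: int) -> Size:
    """Return the shared (h, w) of a real/fake image pair, or raise if they differ."""
    if real_hw != fake_hw:
        raise ValueError(
            f"image {index}: real size {real_hw} != fake size {fake_hw}; "
            "regenerate or resize the fake image so the labels stay aligned"
        )
    return real_hw


if __name__ == "__main__":
    print(check_pair_size((540, 960), (540, 960), index=0))
    try:
        check_pair_size((540, 960), (256, 256), index=1)
    except ValueError as err:
        print("mismatch:", err)
```

Since the fake image reuses the labels of the real image, a silent size difference would shift every box, which is why the sketch raises instead of continuing.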

I only have two GPUs and set --device 0,1 when training, but I get the error "insufficient CUDA devices for DDP command". How can I solve it?

(base) liuhaolin@ubuntu18:/sdb/liuhaolin/SSDA-YOLO$ python -m torch.distributed.launch --nproc_per_node 4 ssda_yolov5_train.py --weights weights/yolov5l.pt --data yamls_sda/pascalvoc0712_clipart1k_VOC.yaml --name voc2clipart_ssda_960_yolov5l --img 960 --device 0,1 --batch-size 24 --epochs 200 --lambda_weight 0.005 --consistency_loss --alpha_weight 2.0
/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
WARNING:torch.distributed.run:

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

LOCAL_RANK 1
cuda 2 1
LOCAL_RANK 2
cuda 2 2
Traceback (most recent call last):
File "ssda_yolov5_train.py", line 833, in
main(opt)
File "ssda_yolov5_train.py", line 816, in main
assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command'
AssertionError: insufficient CUDA devices for DDP command
LOCAL_RANK 3
cuda 2 3
Traceback (most recent call last):
File "ssda_yolov5_train.py", line 833, in
main(opt)
File "ssda_yolov5_train.py", line 816, in main
assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command'
AssertionError: insufficient CUDA devices for DDP command
LOCAL_RANK 0
train: weights=weights/yolov5l.pt, cfg=, data=yamls_sda/pascalvoc0712_clipart1k_VOC.yaml, hyp=data/hyps/hyp.scratch.yaml, epochs=200, batch_size=24, img_size=[960], rect=False, resume=False, nosave=False, notest=False, noautoanchor=False, evolve=False, bucket=, cache_images=False, image_weights=False, device=0,1, multi_scale=False, single_cls=False, adam=False, sync_bn=False, workers=8, project=runs/train, entity=None, name=voc2clipart_ssda_960_yolov5l, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, upload_dataset=False, bbox_interval=-1, save_period=-1, artifact_alias=latest, local_rank=0, teacher_alpha=0.99, conf_thres=0.5, iou_thres=0.3, max_gt_boxes=20, lambda_weight=0.005, consistency_loss=True, alpha_weight=2.0, student_weight=None, teacher_weight=None, save_dir=None
github: ⚠️ WARNING: code is out of date by 1 commit. Use 'git pull' to update or 'git clone https://github.com/hnuzhy/SSDA-YOLO' to download latest.
YOLOv5 🚀 57d8bc3 torch 1.12.1+cu102 CUDA:0 (Tesla V100S-PCIE-32GB, 32510.5MB)
CUDA:1 (Tesla V100S-PCIE-32GB, 32510.5MB)
Added key: store_based_barrier_key:1 to store for rank: 0
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 79246 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 79247 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 2 (pid: 79248) of binary: /sdb/anaconda/bin/python
Traceback (most recent call last):
File "/sdb/anaconda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/sdb/anaconda/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/sdb/anaconda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
ssda_yolov5_train.py FAILED

Failures:
[1]:
time : 2022-11-29_12:17:00
host : ubuntu18
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 79249)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2022-11-29_12:17:00
host : ubuntu18
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 79248)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Bug in teacher weights calculation

Hi, your code contains an error in how the teacher weights are updated.
Basically, the current implementation of WeightEMA in utils/torch_utils.py breaks the statistics saved in the model for batch norm, if I recall correctly. Because of this, after about the 5th epoch the teacher model begins making invalid predictions, which further hurts training, and the result becomes slightly worse than without using a teacher model at all.

To fix this, you can refer to the original YOLOv5 EMA class, ModelEMA, in the same file just above. I was able to rewrite the code that way and then reproduce results close to the ones in the paper, even at an image size of 640.

My old issue #18 came from this bug, and the still-open #6 ran into the same problem.

I can make a pull request next week, if you wish.
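
One way batch-norm statistics could go stale is if the EMA update walks only the learnable parameters and skips buffers such as running means. That is an assumption about the failure mode, not a reading of WeightEMA's code. The sketch below contrasts the two update styles using plain dictionaries of floats as stand-ins for model state; the function names and the example keys are hypothetical, and the default alpha of 0.99 is taken from teacher_alpha=0.99 in the training log quoted earlier.

```python
from typing import Dict, Iterable, Union

State = Dict[str, Union[float, int]]


def ema_params_only(teacher: State, student: State,
                    param_names: Iterable[str], alpha: float = 0.99) -> None:
    """EMA over learnable parameters only; buffers in the teacher stay untouched."""
    for name in param_names:
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * student[name]


def ema_full_state(teacher: State, student: State, alpha: float = 0.99) -> None:
    """EMA over every floating-point entry of the state, buffers included."""
    for name, value in student.items():
        if isinstance(value, float):  # integer counters are left as they are
            teacher[name] = alpha * teacher[name] + (1.0 - alpha) * value


if __name__ == "__main__":
    student = {"conv.weight": 3.0, "bn.running_mean": 4.0, "bn.num_batches": 10}
    start = {"conv.weight": 1.0, "bn.running_mean": 0.0, "bn.num_batches": 0}

    params_only = dict(start)
    ema_params_only(params_only, student, ["conv.weight"], alpha=0.5)

    full_state = dict(start)
    ema_full_state(full_state, student, alpha=0.5)

    print("params only:", params_only)
    print("full state: ", full_state)
```

In the first variant the teacher's weights drift toward the student while its normalization statistics stay at their initial values, which is the kind of mismatch that could produce out-of-range predictions.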

How to load an unlabeled dataset

Why do I get an error during training saying that the target dataset has no labels?

Labels problem

Hi, I have a question about labels while reproducing your paper. According to the description in your dataset configuration file, train_source_real, train_source_fake and test_target_real require labels, while train_target_real and train_target_fake do not. But in my actual training I found that train_source_fake does not need labels and train_target_real does. From this point of view, training requires all the labels of both datasets, which is not what semi-supervised training means. I am not sure whether I made a mistake; I hope you have time to help me out.

training time

How long does training take?

I would also like to know the number of GPUs and the batch size you used.

Generated CUT Images for Cityscapes -> Foggy

Hi, nice work
Will you upload the images generated by the CUT model for the Cityscapes to Foggy Cityscapes style and vice versa?
Thanks

RuntimeError: result type Float can't be cast to the desired output type long int

Hi,
I am training on my custom data and get the following error during training:
"RuntimeError: result type Float can't be cast to the desired output type long int"
Please guide me.
Thanks

How to use a training checkpoint to predict on images

How can I use a trained model to predict images and generate their corresponding text?

ModuleNotFoundError: No module named 'umt_yolov5_test'

YOLOv7

Can I use YOLOv7 here?

How to keep the object label position invariance in a specific dataset such as the Yawn dataset?

Hi author, I am curious how to solve the problem of object label position offset.

The paper's conclusion says, "we apply a consistency loss function to correct the prediction shifts of images from different domains but with same labels." Similarly, in Fig. 1, the yellow and gray branches require the two images (source domain and target-like source domain) to share the same object labels. This clearly holds for the VOC and Cityscapes tasks.
However, could a change in camera angle, for example, shift the object label positions or produce new, unlabeled objects in the fake image generated by CUT? The Yawn dataset is in the same situation: it also has differences in camera angle and in the positions of tables, chairs and people.
Thanks a lot for taking the time to review this question.
