
More readable and flexible yolov5 with more backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin-Transformer, etc.), plug-in modules (CBAM, DCN, and so on), and TensorRT support.

License: GNU General Public License v3.0

Python 95.86% Dockerfile 0.26% CMake 0.18% C++ 3.69%
yolov5 resnet mobilenet backbone neck cbam pytorch shufflenet hrnet dcnv2

flexible-yolov5's Introduction

flexible-yolov5

Updated to the ultralytics/yolov5 version 6.1 codebase.

The code is based on ultralytics YOLOv5 version 6.1. The network is reorganized into {backbone, neck, head}. Besides the original YOLO backbone, you can currently choose mainstream backbones such as resnet, hrnet, swin-transformer, gnn, and mobilenet, and freely add plug-ins such as SE, DCN, and drop block. This makes it convenient to replace, modify, and experiment with the network structure. TensorRT C++/Python inference and quantization are provided, as well as Triton and tf_serving deployment code. For comparison, one model per backbone was trained for 300 epochs, all without pretrained weights; since the architectures differ, my results do not represent the best each network can reach, but they can serve as a baseline. This project suits anyone who wants to modify YOLO in various ways or validate new modules. If you have a good idea, such as adding a new backbone or plug-in, pull requests are welcome; if you run into problems, feel free to open an issue. If this repo helps you, please give it a ⭐️.

The yolov5 model is split into {backbone, neck, head} to make it easy to operate on individual modules and to support more backbones. Essentially only the model is changed; the training and testing architecture of yolov5 is untouched, so when the original code is updated it is easy to bring this code up to date as well. If you have new ideas, please open a pull request and let's add new features together. If this repo helps you, please give it a star.
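
For orientation, here is a minimal sketch (not the repo's exact code) of how the {backbone, neck, head} split composes; the module names backbone/fpn/pan/detection and the C/P/PP feature naming follow the fuse() snippet and the FPN/PAN log lines quoted later on this page:

```python
import torch.nn as nn

class Model(nn.Module):
    """Hedged sketch of od/models/model.py's composition."""
    def __init__(self, backbone, fpn, pan, detection):
        super().__init__()
        self.backbone = backbone    # any supported backbone (resnet, hrnet, ...)
        self.fpn = fpn              # neck: top-down feature fusion
        self.pan = pan              # neck: bottom-up path aggregation
        self.detection = detection  # YOLO detect head

    def forward(self, x):
        c3, c4, c5 = self.backbone(x)            # multi-scale features
        p3, p4, p5 = self.fpn([c3, c4, c5])      # top-down
        pp3, pp4, pp5 = self.pan([p3, p4, p5])   # bottom-up
        return self.detection([pp3, pp4, pp5])   # predictions per scale
```

Swapping a backbone then only requires that it produce three feature maps with known channel counts.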


Features

  • support NVIDIA Tensor Core 2:4 structured sparsity
  • support QAT; for QAT ONNX export, torch >= 1.13 is needed (I only tested this version), 2023-10-17
  • updated PTQ code, 2023-10-01
  • reorganized model structure into backbone, neck, and head, so the network can be modified flexibly and conveniently
  • mobilenetV3-small, mobilenetV3-large
  • shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5, shufflenet_v2_x2_0
  • yolov5s, yolov5m, yolov5l, yolov5x, yolov5transformer
  • resnet18, resnet50, resnet34, resnet101, resnet152
  • efficientnet_b0 - efficientnet_b8, efficientnet_l2
  • hrnet 18,32,48
  • CBAM, SE
  • swin transformer - base, tiny, small, large (please set half=False in scripts/eval.py and don't use model.half in train.py)
  • DCN (mixed precision training is not supported; to use DCN, disable AMP at line 292 of scripts/train.py)
  • coord conv
  • drop_block
  • vgg, repvgg
  • tensorrt c++/python infer, triton server infer
  • gnn backbone

Notices

  • The CBAM, SE, DCN, and coord conv plug-ins have not yet been added to every network, so you may need to modify the code yourself.
  • The default gw and gd of the PAN and FPN for the other backbones are the same as yolov5_s, so if you want a stronger model, please modify self.gw and self.gd in FPN and PAN.
  • resnet with DCN, training on GPU: *RuntimeError: expected scalar type Half but found Float*. Please remove the mixed precision training at line 351 of scripts/train.py (see the sketch after this list).
  • swin-transformer: training is fine, but testing reports *RuntimeError: expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in call to _th_bmm_out* in swin_transformer.py. Please set half=False in scripts/eval.py.
  • mobilenet ONNX export fails; please replace HardSigmoid() with another activation, because ONNX does not support PyTorch's nn.Threshold.
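
For the two AMP-related notices above, here is a minimal sketch of what "removing mixed precision" means, assuming the training loop uses torch.cuda.amp the way upstream yolov5 6.x does. The function below is illustrative, not the repo's code; in a real loop the GradScaler lives outside the step:

```python
import torch
from torch.cuda import amp

def train_step(model, imgs, targets, compute_loss, optimizer, use_amp=False):
    """Set use_amp=False for DCN / swin backbones, per the notices above."""
    scaler = amp.GradScaler(enabled=use_amp)
    with amp.autocast(enabled=use_amp):  # enabled=False -> pure FP32 forward
        pred = model(imgs)
        loss, loss_items = compute_loss(pred, targets)
    scaler.scale(loss).backward()        # scaling is a no-op when disabled
    scaler.step(optimizer)
    scaler.update()
    optimizer.zero_grad()
    return loss_items
```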

Bugs

None

Prerequisites

Please refer to requirements.txt.

Getting Started

Dataset Preparation

Prepare data in the yolov5 format; the expected label layout is sketched below. You can use od/data/transform_voc.py to convert VOC data to the yolov5 data format.
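
For reference, the yolov5 format (defined by upstream ultralytics/yolov5, not specific to this repo) stores one *.txt file per image, one object per line, with the class id followed by box center and size normalized to [0, 1]:

```
# labels/train/000001.txt  (matches images/train/000001.jpg)
# <class_id> <x_center> <y_center> <width> <height>, all coords in [0, 1]
0 0.481719 0.634028 0.690625 0.713278
1 0.741094 0.524306 0.314750 0.408556
```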

Training and Testing

Training and testing work the same way as in yolov5.

Training

  1. check configs/data.yaml, and replace the paths with your data and the number of classes nc (a hedged example follows below)
  2. check configs/model_*.yaml, choose a backbone, and change nc to match your dataset; please refer to support_backbone in od/models/backbone/__init__.py
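
A hedged example of what configs/data.yaml typically contains; the keys follow the upstream yolov5 convention, and the paths, class count, and names below are placeholders:

```yaml
# configs/data.yaml — hypothetical 2-class dataset
train: /data/mydataset/images/train   # training images
val: /data/mydataset/images/val       # validation images
nc: 2                                 # number of classes
names: ['person', 'car']              # one name per class id
```
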
$ python scripts/train.py  --batch 16 --epochs 5 --data configs/data.yaml --cfg configs/model_XXX.yaml
# for NVIDIA Tensor Core 2:4 sparsity, install apex

git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

A Google Colab demo is provided in train_demo.ipynb.

Testing and Visualization

$ python scripts/eval.py   --data configs/data.yaml  --weights runs/train/yolo/weights/best.pt

Model performance comparison with different backbone

For various reasons, I can't provide the pretrained weights, only the comparison results. Sorry!

All checkpoints were trained for 300 epochs with default settings, all backbones without pretrained weights. YOLOv5 nano and small models use hyp.scratch-low.yaml, all others use hyp.scratch-high.yaml. The validation mAP is from the last epoch, which may not be the best.

| flexible-yolov5 model with different backbones | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | params |
| --- | --- | --- | --- | --- |
| [flexible-YOLOv5n](https://pan.baidu.com/s/1UAvEmgWmpxA3oPm5CJ8C-g) (extraction code: kg22) | 640 | 25.7 | 43.3 | 1872157 |
| [flexible-YOLOv5s](https://pan.baidu.com/s/1ImN2ryMK3IPy8_St-Rzxhw) (extraction code: pt8i) | 640 | 35 | 54.7 | 7235389 |
| flexible-YOLOv5m | 640 | 42.1 | 62 | 21190557 |
| flexible-YOLOv5l | 640 | 45.3 | 65.3 | 46563709 |
| flexible-YOLOv5x | 640 | 47 | 66.7 | 86749405 |
| **other backbones** | | | | |
| mobilenet-v3-small | 640 | 21.9 | 37.6 | 3185757 |
| resnet-18 | 640 | 34.6 | 53.7 | 14240445 |
| shufflenetv2-x1_0 | 640 | 27.8 | 45.1 | 4297569 |
| repvgg-A0 | 640 | | | |
| vgg-16bn | 640 | 35.2 | 56.4 | 17868989 |
| efficientnet-b1 | 640 | 38.1 | 58.6 | 9725597 |
| swin-tiny | 640 | 39.2 | 60.5 | 30691127 |
| gcn-tiny | 640 | 33.8 | 55.5 | 131474444 |
| **resnet with plug-in** | | | | |
| resnet-18-cbam | 640 | 35.2 | 55.5 | 15620399 |
| resnet-18-dcn | 640 | | | |

Detection

python scripts/detector.py   --weights yolov5.pth --imgs_root  test_imgs   --save_dir  ./results --img_size  640  --conf_thresh 0.4  --iou_thresh 0.4

Deploy

Export

python scripts/export.py   --weights yolov5.pth 
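
After export, one quick way to sanity-check the resulting ONNX file (a hedged example; onnxruntime is not among this repo's stated requirements, and the file name and input shape below are assumptions):

```python
# Hypothetical smoke test for the exported model; install onnxruntime first.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)                   # e.g. images [1, 3, 640, 640]
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
outs = sess.run(None, {inp.name: dummy})     # run all model outputs
print([o.shape for o in outs])
```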

Grpc Server

tf_serving and triton demos are provided in the projects folder.

Quantization

You can quantize the ONNX model directly:

python scripts/trt_quant/generate_int8_engine.py --onnx path --images-dir  img_path  --save-engine  engine_path

See scripts/trt_quant for details.

Tensorrt Inference

For a TensorRT model, you can use the official trt export directly; refer to scripts/trt_infer/cpp/. For testing, I used TensorRT-8.4.0.6.

C++ and Python demos are provided in scripts/trt_infer.
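
Alternatively, for models without custom plugins, the stock trtexec tool that ships with TensorRT can build an engine straight from the exported ONNX file (file names here are placeholders):

```
trtexec --onnx=yolov5.onnx --saveEngine=yolov5_fp16.engine --fp16
```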

Reference


flexible-yolov5's People

Contributors

bobo-y, huster-hq, marco-nguyen


flexible-yolov5's Issues

Loading a RepVGG model

Hello, I tried adding a RepVGG backbone to YOLOv5, but I now get the following error:

```
Traceback (most recent call last):
  File "/home/xxx/Desktop/flexible-yolov5/scripts/train.py", line 531, in <module>
    train(hyp, opt, device, tb_writer)
  File "/home/xxx/Desktop/flexible-yolov5/scripts/train.py", line 91, in train
    model = Model(opt.cfg).to(device)  # create
  File "/home/xxx/Desktop/flexible-yolov5/od/models/model.py", line 26, in __init__
    backbone_out = self.backbone.out_shape
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.7/site-packages/torch/nn/modules/module.py", line 948, in __getattr__
    type(self).__name__, name))
AttributeError: 'RepVGG' object has no attribute 'out_shape'
```

Do you know how the model should be modified?
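
Not an official answer, but the traceback shows od/models/model.py reading self.backbone.out_shape, so any new backbone must expose that attribute. A heavily hedged sketch (the C3/C4/C5 naming is inferred from the FPN log lines quoted later on this page; the exact structure of out_shape in this repo may differ):

```python
import torch.nn as nn

class RepVGGBackbone(nn.Module):
    """Hypothetical wrapper illustrating the attribute model.py expects."""
    def __init__(self, stages, channels=(96, 192, 1280)):
        super().__init__()
        self.stages = stages  # module producing three feature maps
        # Assumed format: channel sizes of the feature maps fed to the FPN.
        self.out_shape = {'C3_size': channels[0],
                          'C4_size': channels[1],
                          'C5_size': channels[2]}

    def forward(self, x):
        # Must return the three feature maps matching out_shape.
        c3, c4, c5 = self.stages(x)
        return c3, c4, c5
```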

How should detector.py be used?

Hello,
I load a trained model with the following code:

```python
from detector import *

pt_path = 'E:\\flexible-yolov5\\scripts\\best.pt'
img = 'E:\\flexible-yolov5\\source\\00-20.3592.jpg'
class_pth = 'E:\\flexible-yolov5\\source\\classes.txt'

inter = Detector(pt_path=pt_path, namesfile=class_pth, img_size=576, classes=6)
result = inter(img)
print(result)
```

But I get this error:

```
Traceback (most recent call last):
  File "E:/flexible-yolov5/scripts/detector.py", line 131, in <module>
    inter = Detector(pt_path=pt_path, namesfile=class_pth, img_size=576, classes=6)
  File "E:/flexible-yolov5/scripts/detector.py", line 16, in __init__
    self.model = self.load_model()
  File "E:/flexible-yolov5/scripts/detector.py", line 25, in load_model
    model = attempt_load(self.pt_path, map_location='cpu')  # load FP32 model
  File "E:\flexible-yolov5\od\models\modules\experimental.py", line 118, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "E:\flexible-yolov5\venv\lib\site-packages\torch\serialization.py", line 592, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "E:\flexible-yolov5\venv\lib\site-packages\torch\serialization.py", line 851, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models'
```

How should I solve this problem? Thanks.
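
Not from the thread, but for context: this error is characteristic of torch.load on a checkpoint that pickles the whole nn.Module. Unpickling re-imports the classes by the module path they had when saved (here a package called models), so that package must be importable when loading. A hedged sketch of the usual workaround, with an assumed project-root path:

```python
import sys
# Hypothetical: put whatever directory makes the pickled 'models' package
# importable on sys.path before calling torch.load / attempt_load.
sys.path.insert(0, r'E:\flexible-yolov5')
```

Alternatively, re-saving checkpoints as a plain state_dict avoids the dependency on module paths entirely.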

What's depth gain? What's width gain? Can anyone tell why the values in gains are 0.33, 0.5, ..., 1.33, 1.25?

```python
class YOLOv5(nn.Module):
    def __init__(self, focus=True, version='L'):
        super(YOLOv5, self).__init__()
        self.version = version
        self.with_focus = focus

        gains = {'s': {'gd': 0.33, 'gw': 0.5},
                 'm': {'gd': 0.67, 'gw': 0.75},
                 'l': {'gd': 1, 'gw': 1},
                 'x': {'gd': 1.33, 'gw': 1.25}}
        self.gd = gains[self.version.lower()]['gd']  # depth gain
        self.gw = gains[self.version.lower()]['gw']  # width gain
```

What are # depth gain and # width gain?
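
Not an official answer, but in upstream yolov5 these two factors scale the architecture: gd (depth gain) multiplies the number of repeated bottleneck blocks per stage, and gw (width gain) multiplies channel counts, rounded to a multiple of 8. The specific values reproduce the s/m/l/x model family from one 'l'-sized definition. A sketch of how they are typically applied:

```python
import math

def get_depth(n, gd):
    """Scale the repeat count of a stage by the depth gain."""
    return max(round(n * gd), 1) if n > 1 else n

def get_width(c, gw, divisor=8):
    """Scale a channel count by the width gain, rounded up to a multiple of 8."""
    return int(math.ceil(c * gw / divisor) * divisor)

# e.g. for 's' (gd=0.33, gw=0.5): a 9-repeat stage becomes 3 repeats,
# and a 1024-channel layer becomes 512 channels.
print(get_depth(9, 0.33), get_width(1024, 0.5))  # -> 3 512
```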

AttributeError: 'PosixPath' object has no attribute 'tell'

Hello,
while reproducing your code I encountered the following problem:

```
Traceback (most recent call last):
  File "scripts/train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "scripts/train.py", line 189, in train
    image_weights=opt.image_weights, quad=opt.quad, prefix=colorstr('train: '))
  File "/home/lab/liyuping/flexible-yolov5-main/flexible-yolov5-main/od/data/datasets.py", line 71, in create_dataloader
    prefix=prefix)
  File "/home/lab/liyuping/flexible-yolov5-main/flexible-yolov5-main/od/data/datasets.py", line 377, in __init__
    cache, exists = torch.load(cache_path), True  # load
  File "/home/lab/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 527, in load
    with _open_zipfile_reader(f) as opened_zipfile:
  File "/home/lab/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 224, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
AttributeError: 'PosixPath' object has no attribute 'tell'
```

Do you know how to solve it? Looking forward to your reply.
Thanks

Question about using eval.py

Hello, when I pass --task speed to eval.py, I get the following error:

```
  File "eval.py", line 325, in <module>
    test(opt.data, w, opt.batch_size, opt.img_size, 0.25, 0.45, save_json=False, plots=False)
  File "eval.py", line 56, in test
    model = attempt_load(weights, map_location=device)  # load FP32 model
  File "/home/xxx/Desktop/flexible-yolov5/od/models/modules/experimental.py", line 118, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 579, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: '/'
```

The val and test tasks work fine. What is causing this?

Comparing speed of MobileNet and yolov5-small

Hello,
thanks for sharing your great work.
I compared the speed of mobilenet-v3-small with yolov5-small and noticed that MobileNet is considerably slower. Is that normal? I was expecting higher speed from MobileNet than from yolov5.
Thanks

Why do we need to remove batchnorm after fuse?

```python
def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
    print('Fusing layers... ')
    for module in [self.backbone, self.fpn, self.pan, self.detection]:
        for m in module.modules():
            if type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
    self.info()
    return self
```

Regarding delattr(m, 'bn')  # remove batchnorm — can you please help explain why it is necessary to remove batchnorm here?
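
Not an answer from the thread, but for context: at inference time a BatchNorm layer is an affine transform with frozen statistics, so it can be folded into the preceding convolution's weight and bias. After fuse_conv_and_bn, the folded conv already reproduces conv+bn, fuseforward no longer calls bn, and delattr simply drops the now-dead module from the module tree. The standard folding math, as a self-contained sketch (same math as yolov5's fuse_conv_and_bn; assumes bn has affine parameters):

```python
import torch
import torch.nn as nn

def fold_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride, conv.padding,
                      groups=conv.groups, bias=True)
    # scale = gamma / sqrt(running_var + eps), one factor per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = torch.zeros(conv.out_channels) if conv.bias is None else conv.bias
    # BN(x) = scale * (x - mean) + beta, folded into the conv bias
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias
    return fused
```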

Model saving problem

Hi, when I open the saved trained model with netron I cannot see the network structure; it seems that only the parameters were saved.
(screenshot)

I modified the code to save the model the same way as the original version, but the result is the same. What did I get wrong?
(screenshot)

RuntimeError: expected scalar type Half but found Float

Thanks for your great work. I ran into this problem when using resnet as the backbone.

```
Traceback (most recent call last):
  File "train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 293, in train
    pred = model(imgs)  # forward
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/model.py", line 67, in forward
    out = self.backbone(x)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/backbone/resnet.py", line 208, in forward
    x2 = self.layer2(x1)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/backbone/resnet.py", line 119, in forward
    out = self.conv2(out, offset)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py", line 143, in forward
    return deform_conv2d(input, offset, self.weight, self.bias, stride=self.stride,
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py", line 76, in deform_conv2d
    return torch.ops.torchvision.deform_conv2d(
RuntimeError: expected scalar type Half but found Float
```

How to run inference on a folder of images?

Your code is great work and very rewarding.
However, I see some differences between your code and the original yolov5, and I would like to know how to run inference on a batch of files (generating boxes, categories, and confidence values). The original yolov5 has a detect.py file, but your codebase does not include one.

raise Exception('Dataset not found.')

I set up custom data as described in ultralytics/yolov5. When I tried to run your train.py, I got the error messages below:

```
WARNING: Dataset not found, nonexistent paths: ['/home/user/flexible-yolov5-main/dataset/person_count_val/images/train']
Traceback (most recent call last):
  File "train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 65, in train
    check_dataset(data_dict)  # check
  File "/home/flexible-yolov5-main/utils/general.py", line 126, in check_dataset
    raise Exception('Dataset not found.')
Exception: Dataset not found.
```

Then I checked train.py at line 65:

```python
train_path = data_dict['train']
test_path = data_dict['test']
```

How can I modify the values in data_dict so that my custom data can be accessed by train.py?

Pretrained weights

Does training use pretrained weights, or is the model randomly initialized?

TensorRT conversion fails

When converting to trt I get the error ERROR: INVALID_ARGUMENT: getPluginCreator could not find plugin ScatterND version 1 right after "Completed parsing of ONNX file". Have you run into this problem?

Triton deployment problem

How should the deployment documentation be followed?

```
$ python  export.py
$ cp best.onnx projects/triton_server_deploy/models/yolov5s/1/
$ cd projects/triton_server_deploy/
$ docker build . -t "head:v1"
$ docker run -itd --gpus '"device=1"' -p 8080:8080 -p 8006:8006 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 --name head -v /data/share/imageAlgorithm/zhangcheng/2021/flexible-yolov5/projects:/data/share/imageAlgorithm/zhangcheng/2021/flexible-yolov5/projects bb3bfdeccc2f /bin/bash
# the newly created head container never starts
docker start head
```

Bug: the newly created container will not start.

Which document should I follow to set up triton_server?
Thanks

What's the difference between BCEcls and BCEobj?

```python
class ComputeLoss:
    # Compute losses
    def __init__(self, model, autobalance=False):
        super(ComputeLoss, self).__init__()
        device = next(model.parameters()).device  # get model device
        h = model.hyp  # hyperparameters

        # Define criteria
        BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))  # classification loss
        BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))  # objectness loss
```
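
Not an answer from the thread, but in upstream yolov5 the two criteria differ only in what they supervise: BCEobj scores the objectness logit of every anchor cell ("is there an object here?"), while BCEcls scores the class logits of matched anchors only ("which class is it?"). A self-contained sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

# Objectness: one logit per anchor cell, supervised everywhere.
pred_obj = torch.randn(8, 3, 20, 20)       # batch x anchors x grid x grid
tobj = torch.zeros_like(pred_obj)          # 1.0 (or IoU) at matched cells
lobj = bce(pred_obj, tobj)

# Classification: class logits for matched (positive) anchors only.
pred_cls = torch.randn(5, 80)              # 5 matched anchors, 80 classes
tcls = torch.zeros_like(pred_cls)
tcls[torch.arange(5), torch.tensor([3, 7, 7, 0, 42])] = 1.0  # one-hot targets
lcls = bce(pred_cls, tcls)
```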

Autosplit: how can I shuffle images before splitting? I am concerned that splitting without shuffling would lead to distribution bias in train/val.

```python
def autosplit(path='../coco128', weights=(0.9, 0.1, 0.0)):  # from utils.datasets import *; autosplit('../coco128')
    """ Autosplit a dataset into train/val/test splits and save path/autosplit_*.txt files
    # Arguments
        path:    Path to images directory
        weights: Train, val, test weights (list)
    """
    path = Path(path)  # images dir
    files = list(path.rglob('*.*'))
    n = len(files)  # number of files
    indices = random.choices([0, 1, 2], weights=weights, k=n)  # assign each image to a split
    txt = ['autosplit_train.txt', 'autosplit_val.txt', 'autosplit_test.txt']  # 3 txt files
    [(path / x).unlink() for x in txt if (path / x).exists()]  # remove existing
    for i, img in tqdm(zip(indices, files), total=n):
        if img.suffix[1:] in img_formats:
            with open(path / txt[i], 'a') as f:
                f.write(str(img) + '\n')  # add image to txt file
```

(Note that random.choices assigns each file to a split independently at random, so the assignment does not depend on file order; shuffling the file list first would not change the resulting distribution.)

Help me!

Thank you for your contribution.

```
Traceback (most recent call last):
  File "/content/flexible-yolov5/scripts/train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "/content/flexible-yolov5/scripts/train.py", line 77, in train
    ckpt = torch.load(weights, map_location=device)  # load checkpoint
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 607, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 882, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 875, in find_class
    return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'models'
```

Please help

Hello, I want to use the RepVGG network as the backbone; which parts should I modify?

Cannot export an ONNX model with mobilenetv3 as the backbone

Hello, and thanks for the open-source project. I trained with mobilenetv3 as the backbone and the accuracy looks fine, but after training, running deploy/export.py generates no ONNX file and reports no error. Is there a way to solve this?

How to replace the yolo backbone with mobilenetv3?

Hi, can you please tell me how to assign a specific backbone (e.g. mobilenetv3) to the model?
I have read models.py under /od/ but still have no idea how to assign it.

Which of model.py, train.py, or __init__.py under /backbone needs to be modified for training?

Please take a look at "There appear to be 6 leaked semaphores to clean up at shutdown"

First of all, much respect. I was training on my own dataset on Colab and got the following error:
fatal: ambiguous argument 'main..origin/master': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git [...] -- [...]'
github: Command 'git rev-list main..origin/master --count' returned non-zero exit status 128.
YOLOv5 c5b7925 torch 1.9.0+cu102 CPU

Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='configs/model_efficientnet.yaml', data='configs/data.yaml', device='', epochs=100, evolve=False, exist_ok=False, global_rank=-1, hyp='configs/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], linear_lr=False, local_rank=-1, log_artifacts=False, log_imgs=16, multi_scale=False, name='exp', noautoanchor=False, nosave=False, notest=False, project='runs/train', quad=False, rect=False, resume=False, save_dir='runs/train/exp5', single_cls=False, sync_bn=False, total_batch_size=16, weights='', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/train", view at http://localhost:6006/
2021-07-27 07:36:17.237677: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
./od/models/model.py:22: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
model_config = yaml.load(open(model_config, 'r'))
FPN input channel size: C3 88, C4 248, C5 2816
FPN output channel size: P3 344, P4 256, P5 2816
PAN input channel size: P3 344, P4 256, P5 2816
PAN output channel size: PP3 256, PP4 512, PP5 1024
Scaled weight_decay = 0.0005
Optimizer groups: 345 .bias, 345 conv.weight, 220 other
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose 'Don't visualize my results'
2021-07-27 07:36:39.867593: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
wandb: W&B syncing is set to offline in this directory. Run wandb online or set WANDB_MODE=online to enable cloud syncing.
train: Scanning '/content/Mydata/labels/train' for images and labels... 4302 found, 2 missing, 0 empty, 0 corrupted: 100% 4304/4304 [00:04<00:00, 927.71it/s]
train: New cache created: /content/Mydata/labels/train.cache
val: Scanning '/content/Mydata/labels/valid' for images and labels... 1076 found, 0 missing, 0 empty, 0 corrupted: 100% 1076/1076 [00:01<00:00, 942.23it/s]
val: New cache created: /content/Mydata/labels/valid.cache
Plotting labels...

autoanchor: Analyzing anchors... anchors/target = 4.52, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 2 dataloader workers
Logging results to runs/train/exp5
Starting training for 100 epochs...

 Epoch   gpu_mem       box       obj       cls     total   targets  img_size

0% 0/269 [00:00<?, ?it/s]tcmalloc: large alloc 1258291200 bytes == 0x55c498936000 @ 0x7f72d340db6b 0x7f72d342d379 0x7f726a07526e 0x7f726a0769e2 0x7f72adec39f8 0x7f72adead359 0x7f72adeba1bf 0x7f72adebb5a7 0x7f72adeb5dbb 0x7f72adeb64c7 0x7f72ae51bc62 0x7f72ae36d57b 0x7f72af9b8c01 0x7f72af9b9392 0x7f72adfe156d 0x7f72ada78518 0x7f72ae58e2ba 0x7f72adfdba7b 0x7f72ada711db 0x7f72ae58e21a 0x7f72adfd9fc5 0x7f72ada70daa 0x7f72ae58e552 0x7f72adfe087d 0x7f72c0975026 0x55c25a09b010 0x55c25a09ada0 0x55c25a10f2f9 0x55c25a09cb99 0x55c25a09d1f1 0x55c25a10c318
tcmalloc: large alloc 1258291200 bytes == 0x55c4e3936000 @ 0x7f72d340db6b 0x7f72d342d379 0x7f726a07526e 0x7f726a0769e2 0x7f72ad8e0b49 0x7f72ad8e1897 0x7f72adcbdd89 0x7f72ae422b9a 0x7f72ae405cbe 0x7f72ae00aa05 0x7f72adece86a 0x7f72adeb6594 0x7f72ae51bc62 0x7f72ae36d57b 0x7f72af9b8c01 0x7f72af9b9392 0x7f72adfe156d 0x7f72ada78518 0x7f72ae58e2ba 0x7f72adfdba7b 0x7f72ada711db 0x7f72ae58e21a 0x7f72adfd9fc5 0x7f72ada70daa 0x7f72ae58e552 0x7f72adfe087d 0x7f72c0975026 0x55c25a09b010 0x55c25a09ada0 0x55c25a10f2f9 0x55c25a09cb99
/usr/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
len(cache))
^C

Got 'Segmentation fault' after training 1/299 epochs: 'resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown'

2 × 8 GB GPUs. I ran: python scripts/train.py --batch 16 --epochs 300 --cfg configs/model_mobilenet.yaml
Epoch 0 finished; during epoch 1 I got the warnings below and training stopped automatically.

Epoch gpu_mem box obj cls total targets img_size
0/299 5.55G 0.09329 0.01881 0 0.1121 11 640: 100%|██████████| 836/836 [05:06<00:00, 2.73it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 74/74 [00:35<00:00, 2.11it/s]
all 2.36e+03 2.81e+03 0.000155 0.0121 7.23e-05 1.11e-05
Images sizes do not match. This will causes images to be display incorrectly in the UI.

 Epoch   gpu_mem       box       obj       cls     total   targets  img_size
 1/299     5.53G   0.08231   0.02054         0    0.1029        40       640:  36%|███▌      | 303/836 [01:46<03:02,  2.91it/s]Segmentation fault

(base) user @Debian:~/anaconda3/envs/ultra_YOLOv5/flexible-yolov5-main$ /home/user /anaconda3/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

How to add attention mechanisms to the network

How can attention mechanisms such as SE and CBAM be added to the backbone network? Do I modify the code in the backbone, and how? Does the .yaml file need to change?

Enabling the attention mechanism in resnet50

When enabling the attention mechanism for resnet50, besides turning on the cbam=True line in resnet.py, I don't know what else to modify; I keep getting errors that look like a dimension mismatch.
