
More readable and flexible yolov5 with more backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin-Transformer, etc.), plug-in modules (CBAM, DCN, and so on), and TensorRT support.

License: GNU General Public License v3.0

Python 95.86% Dockerfile 0.26% CMake 0.18% C++ 3.69%
yolov5 resnet mobilenet backbone neck cbam pytorch shufflenet hrnet dcnv2

flexible-yolov5's Introduction

flexible-yolov5

Updated to the ultralytics/yolov5 version 6.1 codebase.

The code is based on ultralytics YOLOv5 version 6.1. The network is reorganized into {backbone, neck, head}. Besides the original YOLO backbone, you can currently choose mainstream backbones such as resnet, hrnet, swin-transformer, gnn, and mobilenet, and freely add plug-ins such as SE, DCN, and drop block. This makes it convenient to replace, modify, and experiment with the network structure. TensorRT C++/Python inference and quantization are provided, as well as Triton and tf_serving deployment code. For comparison, one model per backbone was trained for 300 epochs, all without pretrained weights; since the architectures differ, my results do not represent the best each network can reach, but they can serve as a baseline. This project suits anyone who wants to modify YOLO in various ways or validate new modules. If you have a good idea, such as adding a new backbone or plug-in, pull requests are welcome; if you run into problems, feel free to open an issue. If this repo helps you, please give it a ⭐️.

The yolov5 model is split into {backbone, neck, head} to make it easy to operate on individual modules and to support more backbones. Essentially only the model is changed; the training and testing architecture of yolov5 is untouched, so when the original code is updated it is easy to bring this code up to date as well. If you have new ideas, please open a pull request and let's add new features together. If this repo helps you, please give it a star.
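
For orientation, here is a minimal sketch (not the repo's exact code) of how the {backbone, neck, head} split composes; the module names backbone/fpn/pan/detection and the C/P/PP feature naming follow the fuse() snippet and the FPN/PAN log lines quoted later on this page:

```python
import torch.nn as nn

class Model(nn.Module):
    """Hedged sketch of od/models/model.py's composition."""
    def __init__(self, backbone, fpn, pan, detection):
        super().__init__()
        self.backbone = backbone    # any supported backbone (resnet, hrnet, ...)
        self.fpn = fpn              # neck: top-down feature fusion
        self.pan = pan              # neck: bottom-up path aggregation
        self.detection = detection  # YOLO detect head

    def forward(self, x):
        c3, c4, c5 = self.backbone(x)            # multi-scale features
        p3, p4, p5 = self.fpn([c3, c4, c5])      # top-down
        pp3, pp4, pp5 = self.pan([p3, p4, p5])   # bottom-up
        return self.detection([pp3, pp4, pp5])   # predictions per scale
```

Swapping a backbone then only requires that it produce three feature maps with known channel counts.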


Features

  • support NVIDIA Tensor Core 2:4 structured sparsity
  • support QAT; for QAT ONNX export, torch >= 1.13 is needed (I only tested this version), 2023-10-17
  • updated PTQ code, 2023-10-01
  • reorganized model structure into backbone, neck, and head, so the network can be modified flexibly and conveniently
  • mobilenetV3-small, mobilenetV3-large
  • shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5, shufflenet_v2_x2_0
  • yolov5s, yolov5m, yolov5l, yolov5x, yolov5transformer
  • resnet18, resnet50, resnet34, resnet101, resnet152
  • efficientnet_b0 - efficientnet_b8, efficientnet_l2
  • hrnet 18,32,48
  • CBAM, SE
  • swin transformer - base, tiny, small, large (please set half=False in scripts/eval.py and don't use model.half in train.py)
  • DCN (mixed precision training is not supported; to use DCN, disable AMP at line 292 of scripts/train.py)
  • coord conv
  • drop_block
  • vgg, repvgg
  • tensorrt c++/python infer, triton server infer
  • gnn backbone

Notices

  • The CBAM, SE, DCN, and coord conv plug-ins have not yet been added to every network, so you may need to modify the code yourself.
  • The default gw and gd of the PAN and FPN for the other backbones are the same as yolov5_s, so if you want a stronger model, please modify self.gw and self.gd in FPN and PAN.
  • resnet with DCN, training on GPU: *RuntimeError: expected scalar type Half but found Float*. Please remove the mixed precision training at line 351 of scripts/train.py (see the sketch after this list).
  • swin-transformer: training is fine, but testing reports *RuntimeError: expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in call to _th_bmm_out* in swin_transformer.py. Please set half=False in scripts/eval.py.
  • mobilenet ONNX export fails; please replace HardSigmoid() with another activation, because ONNX does not support PyTorch's nn.Threshold.
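
For the two AMP-related notices above, here is a minimal sketch of what "removing mixed precision" means, assuming the training loop uses torch.cuda.amp the way upstream yolov5 6.x does. The function below is illustrative, not the repo's code; in a real loop the GradScaler lives outside the step:

```python
import torch
from torch.cuda import amp

def train_step(model, imgs, targets, compute_loss, optimizer, use_amp=False):
    """Set use_amp=False for DCN / swin backbones, per the notices above."""
    scaler = amp.GradScaler(enabled=use_amp)
    with amp.autocast(enabled=use_amp):  # enabled=False -> pure FP32 forward
        pred = model(imgs)
        loss, loss_items = compute_loss(pred, targets)
    scaler.scale(loss).backward()        # scaling is a no-op when disabled
    scaler.step(optimizer)
    scaler.update()
    optimizer.zero_grad()
    return loss_items
```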

Bugs

None

Prerequisites

Please refer to requirements.txt.

Getting Started

Dataset Preparation

Prepare data in the yolov5 format; the expected label layout is sketched below. You can use od/data/transform_voc.py to convert VOC data to the yolov5 data format.
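
For reference, the yolov5 format (defined by upstream ultralytics/yolov5, not specific to this repo) stores one *.txt file per image, one object per line, with the class id followed by box center and size normalized to [0, 1]:

```
# labels/train/000001.txt  (matches images/train/000001.jpg)
# <class_id> <x_center> <y_center> <width> <height>, all coords in [0, 1]
0 0.481719 0.634028 0.690625 0.713278
1 0.741094 0.524306 0.314750 0.408556
```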

Training and Testing

Training and testing work the same way as in yolov5.

Training

  1. check configs/data.yaml, and replace the paths with your data and the number of classes nc (a hedged example follows below)
  2. check configs/model_*.yaml, choose a backbone, and change nc to match your dataset; please refer to support_backbone in od/models/backbone/__init__.py
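
A hedged example of what configs/data.yaml typically contains; the keys follow the upstream yolov5 convention, and the paths, class count, and names below are placeholders:

```yaml
# configs/data.yaml — hypothetical 2-class dataset
train: /data/mydataset/images/train   # training images
val: /data/mydataset/images/val       # validation images
nc: 2                                 # number of classes
names: ['person', 'car']              # one name per class id
```
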
$ python scripts/train.py  --batch 16 --epochs 5 --data configs/data.yaml --cfg configs/model_XXX.yaml
# for NVIDIA Tensor Core 2:4 sparsity, install apex

git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

A Google Colab demo is provided in train_demo.ipynb.

Testing and Visualization

$ python scripts/eval.py   --data configs/data.yaml  --weights runs/train/yolo/weights/best.pt

Model performance comparison with different backbone

For various reasons, I can't provide the pretrained weights, only the comparison results. Sorry!

All checkpoints were trained for 300 epochs with default settings, all backbones without pretrained weights. YOLOv5 nano and small models use hyp.scratch-low.yaml, all others use hyp.scratch-high.yaml. The validation mAP is from the last epoch, which may not be the best.

| flexible-yolov5 model with different backbones | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | params |
| --- | --- | --- | --- | --- |
| [flexible-YOLOv5n](https://pan.baidu.com/s/1UAvEmgWmpxA3oPm5CJ8C-g) (extraction code: kg22) | 640 | 25.7 | 43.3 | 1872157 |
| [flexible-YOLOv5s](https://pan.baidu.com/s/1ImN2ryMK3IPy8_St-Rzxhw) (extraction code: pt8i) | 640 | 35 | 54.7 | 7235389 |
| flexible-YOLOv5m | 640 | 42.1 | 62 | 21190557 |
| flexible-YOLOv5l | 640 | 45.3 | 65.3 | 46563709 |
| flexible-YOLOv5x | 640 | 47 | 66.7 | 86749405 |
| **other backbones** | | | | |
| mobilenet-v3-small | 640 | 21.9 | 37.6 | 3185757 |
| resnet-18 | 640 | 34.6 | 53.7 | 14240445 |
| shufflenetv2-x1_0 | 640 | 27.8 | 45.1 | 4297569 |
| repvgg-A0 | 640 | | | |
| vgg-16bn | 640 | 35.2 | 56.4 | 17868989 |
| efficientnet-b1 | 640 | 38.1 | 58.6 | 9725597 |
| swin-tiny | 640 | 39.2 | 60.5 | 30691127 |
| gcn-tiny | 640 | 33.8 | 55.5 | 131474444 |
| **resnet with plug-in** | | | | |
| resnet-18-cbam | 640 | 35.2 | 55.5 | 15620399 |
| resnet-18-dcn | 640 | | | |

Detection

python scripts/detector.py   --weights yolov5.pth --imgs_root  test_imgs   --save_dir  ./results --img_size  640  --conf_thresh 0.4  --iou_thresh 0.4

Deploy

Export

python scripts/export.py   --weights yolov5.pth 
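
After export, one quick way to sanity-check the resulting ONNX file (a hedged example; onnxruntime is not among this repo's stated requirements, and the file name and input shape below are assumptions):

```python
# Hypothetical smoke test for the exported model; install onnxruntime first.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)                   # e.g. images [1, 3, 640, 640]
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
outs = sess.run(None, {inp.name: dummy})     # run all model outputs
print([o.shape for o in outs])
```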

Grpc Server

tf_serving and triton demos are provided in the projects folder.

Quantization

You can quantize the ONNX model directly:

python scripts/trt_quant/generate_int8_engine.py --onnx path --images-dir  img_path  --save-engine  engine_path

See scripts/trt_quant for details.

Tensorrt Inference

For a TensorRT model, you can use the official trt export directly; refer to scripts/trt_infer/cpp/. For testing, I used TensorRT-8.4.0.6.

C++ and Python demos are provided in scripts/trt_infer.
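
Alternatively, for models without custom plugins, the stock trtexec tool that ships with TensorRT can build an engine straight from the exported ONNX file (file names here are placeholders):

```
trtexec --onnx=yolov5.onnx --saveEngine=yolov5_fp16.engine --fp16
```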

Reference


flexible-yolov5's People

Contributors

bobo-y, huster-hq, marco-nguyen


flexible-yolov5's Issues

Loading a RepVGG model

Hello, I tried adding a RepVGG backbone to YOLOv5, but I now get the following error:

```
Traceback (most recent call last):
  File "/home/xxx/Desktop/flexible-yolov5/scripts/train.py", line 531, in <module>
    train(hyp, opt, device, tb_writer)
  File "/home/xxx/Desktop/flexible-yolov5/scripts/train.py", line 91, in train
    model = Model(opt.cfg).to(device)  # create
  File "/home/xxx/Desktop/flexible-yolov5/od/models/model.py", line 26, in __init__
    backbone_out = self.backbone.out_shape
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.7/site-packages/torch/nn/modules/module.py", line 948, in __getattr__
    type(self).__name__, name))
AttributeError: 'RepVGG' object has no attribute 'out_shape'
```

Do you know how the model should be modified?
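
Not an official answer, but the traceback shows od/models/model.py reading self.backbone.out_shape, so any new backbone must expose that attribute. A heavily hedged sketch (the C3/C4/C5 naming is inferred from the FPN log lines quoted later on this page; the exact structure of out_shape in this repo may differ):

```python
import torch.nn as nn

class RepVGGBackbone(nn.Module):
    """Hypothetical wrapper illustrating the attribute model.py expects."""
    def __init__(self, stages, channels=(96, 192, 1280)):
        super().__init__()
        self.stages = stages  # module producing three feature maps
        # Assumed format: channel sizes of the feature maps fed to the FPN.
        self.out_shape = {'C3_size': channels[0],
                          'C4_size': channels[1],
                          'C5_size': channels[2]}

    def forward(self, x):
        # Must return the three feature maps matching out_shape.
        c3, c4, c5 = self.stages(x)
        return c3, c4, c5
```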

How should detector.py be used?

Hello,
I load a trained model with the following code:

```python
from detector import *

pt_path = 'E:\\flexible-yolov5\\scripts\\best.pt'
img = 'E:\\flexible-yolov5\\source\\00-20.3592.jpg'
class_pth = 'E:\\flexible-yolov5\\source\\classes.txt'

inter = Detector(pt_path=pt_path, namesfile=class_pth, img_size=576, classes=6)
result = inter(img)
print(result)
```

But I get this error:

```
Traceback (most recent call last):
  File "E:/flexible-yolov5/scripts/detector.py", line 131, in <module>
    inter = Detector(pt_path=pt_path, namesfile=class_pth, img_size=576, classes=6)
  File "E:/flexible-yolov5/scripts/detector.py", line 16, in __init__
    self.model = self.load_model()
  File "E:/flexible-yolov5/scripts/detector.py", line 25, in load_model
    model = attempt_load(self.pt_path, map_location='cpu')  # load FP32 model
  File "E:\flexible-yolov5\od\models\modules\experimental.py", line 118, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "E:\flexible-yolov5\venv\lib\site-packages\torch\serialization.py", line 592, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "E:\flexible-yolov5\venv\lib\site-packages\torch\serialization.py", line 851, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models'
```

How should I solve this problem? Thanks.
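
Not from the thread, but for context: this error is characteristic of torch.load on a checkpoint that pickles the whole nn.Module. Unpickling re-imports the classes by the module path they had when saved (here a package called models), so that package must be importable when loading. A hedged sketch of the usual workaround, with an assumed project-root path:

```python
import sys
# Hypothetical: put whatever directory makes the pickled 'models' package
# importable on sys.path before calling torch.load / attempt_load.
sys.path.insert(0, r'E:\flexible-yolov5')
```

Alternatively, re-saving checkpoints as a plain state_dict avoids the dependency on module paths entirely.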

What's depth gain? What's width gain? Can anyone tell why the values in gains are 0.33, 0.5, ..., 1.33, 1.25?

```python
class YOLOv5(nn.Module):
    def __init__(self, focus=True, version='L'):
        super(YOLOv5, self).__init__()
        self.version = version
        self.with_focus = focus

        gains = {'s': {'gd': 0.33, 'gw': 0.5},
                 'm': {'gd': 0.67, 'gw': 0.75},
                 'l': {'gd': 1, 'gw': 1},
                 'x': {'gd': 1.33, 'gw': 1.25}}
        self.gd = gains[self.version.lower()]['gd']  # depth gain
        self.gw = gains[self.version.lower()]['gw']  # width gain
```

What are # depth gain and # width gain?
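
Not an official answer, but in upstream yolov5 these two factors scale the architecture: gd (depth gain) multiplies the number of repeated bottleneck blocks per stage, and gw (width gain) multiplies channel counts, rounded to a multiple of 8. The specific values reproduce the s/m/l/x model family from one 'l'-sized definition. A sketch of how they are typically applied:

```python
import math

def get_depth(n, gd):
    """Scale the repeat count of a stage by the depth gain."""
    return max(round(n * gd), 1) if n > 1 else n

def get_width(c, gw, divisor=8):
    """Scale a channel count by the width gain, rounded up to a multiple of 8."""
    return int(math.ceil(c * gw / divisor) * divisor)

# e.g. for 's' (gd=0.33, gw=0.5): a 9-repeat stage becomes 3 repeats,
# and a 1024-channel layer becomes 512 channels.
print(get_depth(9, 0.33), get_width(1024, 0.5))  # -> 3 512
```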

AttributeError: 'PosixPath' object has no attribute 'tell'

Hello,
while reproducing your code I encountered the following problem:

```
Traceback (most recent call last):
  File "scripts/train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "scripts/train.py", line 189, in train
    image_weights=opt.image_weights, quad=opt.quad, prefix=colorstr('train: '))
  File "/home/lab/liyuping/flexible-yolov5-main/flexible-yolov5-main/od/data/datasets.py", line 71, in create_dataloader
    prefix=prefix)
  File "/home/lab/liyuping/flexible-yolov5-main/flexible-yolov5-main/od/data/datasets.py", line 377, in __init__
    cache, exists = torch.load(cache_path), True  # load
  File "/home/lab/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 527, in load
    with _open_zipfile_reader(f) as opened_zipfile:
  File "/home/lab/anaconda3/envs/yolo/lib/python3.7/site-packages/torch/serialization.py", line 224, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
AttributeError: 'PosixPath' object has no attribute 'tell'
```

Do you know how to solve it? Looking forward to your reply.
Thanks

Question about using eval.py

Hello, when I pass --task speed to eval.py, I get the following error:

```
  File "eval.py", line 325, in <module>
    test(opt.data, w, opt.batch_size, opt.img_size, 0.25, 0.45, save_json=False, plots=False)
  File "eval.py", line 56, in test
    model = attempt_load(weights, map_location=device)  # load FP32 model
  File "/home/xxx/Desktop/flexible-yolov5/od/models/modules/experimental.py", line 118, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 579, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/xxx/Softwares/anaconda/envs/welding/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
IsADirectoryError: [Errno 21] Is a directory: '/'
```

The val and test tasks work fine. What is causing this?

Comparing speed of MobileNet and yolov5-small

Hello,
thanks for sharing your great work.
I compared the speed of mobilenet-v3-small with yolov5-small and noticed that MobileNet is considerably slower. Is that normal? I was expecting higher speed from MobileNet than from yolov5.
Thanks

Why do we need to remove batchnorm after fuse?

```python
def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
    print('Fusing layers... ')
    for module in [self.backbone, self.fpn, self.pan, self.detection]:
        for m in module.modules():
            if type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
    self.info()
    return self
```

Regarding delattr(m, 'bn')  # remove batchnorm — can you please help explain why it is necessary to remove batchnorm here?
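
Not an answer from the thread, but for context: at inference time a BatchNorm layer is an affine transform with frozen statistics, so it can be folded into the preceding convolution's weight and bias. After fuse_conv_and_bn, the folded conv already reproduces conv+bn, fuseforward no longer calls bn, and delattr simply drops the now-dead module from the module tree. The standard folding math, as a self-contained sketch (same math as yolov5's fuse_conv_and_bn; assumes bn has affine parameters):

```python
import torch
import torch.nn as nn

def fold_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride, conv.padding,
                      groups=conv.groups, bias=True)
    # scale = gamma / sqrt(running_var + eps), one factor per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = torch.zeros(conv.out_channels) if conv.bias is None else conv.bias
    # BN(x) = scale * (x - mean) + beta, folded into the conv bias
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias
    return fused
```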

Model saving problem

Hi, when I open the saved trained model with netron I cannot see the network structure; it seems that only the parameters were saved.
(screenshot)

I modified the code to save the model the same way as the original version, but the result is the same. What did I get wrong?
(screenshot)

RuntimeError: expected scalar type Half but found Float

Thanks for your great work. I ran into this problem when using resnet as the backbone.

```
Traceback (most recent call last):
  File "train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 293, in train
    pred = model(imgs)  # forward
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/model.py", line 67, in forward
    out = self.backbone(x)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/backbone/resnet.py", line 208, in forward
    x2 = self.layer2(x1)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/Workings/flexible-yolov5-main/od/models/backbone/resnet.py", line 119, in forward
    out = self.conv2(out, offset)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py", line 143, in forward
    return deform_conv2d(input, offset, self.weight, self.bias, stride=self.stride,
  File "/home/jx/anaconda3/lib/python3.8/site-packages/torchvision/ops/deform_conv.py", line 76, in deform_conv2d
    return torch.ops.torchvision.deform_conv2d(
RuntimeError: expected scalar type Half but found Float
```

How to run inference on a folder of images?

Your code is great work and very rewarding.
However, I see some differences between your code and the original yolov5, and I would like to know how to run inference on a batch of files (generating boxes, categories, and confidence values). The original yolov5 has a detect.py file, but your codebase does not include one.

raise Exception('Dataset not found.')

I set up custom data as described in ultralytics/yolov5. When I tried to run your train.py, I got the error messages below:

```
WARNING: Dataset not found, nonexistent paths: ['/home/user/flexible-yolov5-main/dataset/person_count_val/images/train']
Traceback (most recent call last):
  File "train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 65, in train
    check_dataset(data_dict)  # check
  File "/home/flexible-yolov5-main/utils/general.py", line 126, in check_dataset
    raise Exception('Dataset not found.')
Exception: Dataset not found.
```

Then I checked train.py at line 65:

```python
train_path = data_dict['train']
test_path = data_dict['test']
```

How can I modify the values in data_dict so that my custom data can be accessed by train.py?

Pretrained weights

Does training use pretrained weights, or is the model randomly initialized?

TensorRT conversion fails

When converting to trt I get the error ERROR: INVALID_ARGUMENT: getPluginCreator could not find plugin ScatterND version 1 right after "Completed parsing of ONNX file". Have you run into this problem?

Triton deployment problem

How should the deployment documentation be followed?

```
$ python  export.py
$ cp best.onnx projects/triton_server_deploy/models/yolov5s/1/
$ cd projects/triton_server_deploy/
$ docker build . -t "head:v1"
$ docker run -itd --gpus '"device=1"' -p 8080:8080 -p 8006:8006 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 --name head -v /data/share/imageAlgorithm/zhangcheng/2021/flexible-yolov5/projects:/data/share/imageAlgorithm/zhangcheng/2021/flexible-yolov5/projects bb3bfdeccc2f /bin/bash
# the newly created head container never starts
docker start head
```

Bug: the newly created container will not start.

Which document should I follow to set up triton_server?
Thanks

What's the difference between BCEcls and BCEobj?

```python
class ComputeLoss:
    # Compute losses
    def __init__(self, model, autobalance=False):
        super(ComputeLoss, self).__init__()
        device = next(model.parameters()).device  # get model device
        h = model.hyp  # hyperparameters

        # Define criteria
        BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))  # classification loss
        BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))  # objectness loss
```
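
Not an answer from the thread, but in upstream yolov5 the two criteria differ only in what they supervise: BCEobj scores the objectness logit of every anchor cell ("is there an object here?"), while BCEcls scores the class logits of matched anchors only ("which class is it?"). A self-contained sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

# Objectness: one logit per anchor cell, supervised everywhere.
pred_obj = torch.randn(8, 3, 20, 20)       # batch x anchors x grid x grid
tobj = torch.zeros_like(pred_obj)          # 1.0 (or IoU) at matched cells
lobj = bce(pred_obj, tobj)

# Classification: class logits for matched (positive) anchors only.
pred_cls = torch.randn(5, 80)              # 5 matched anchors, 80 classes
tcls = torch.zeros_like(pred_cls)
tcls[torch.arange(5), torch.tensor([3, 7, 7, 0, 42])] = 1.0  # one-hot targets
lcls = bce(pred_cls, tcls)
```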

Autosplit: how can I shuffle images before splitting? I am concerned that splitting without shuffling would lead to distribution bias in train/val.

```python
def autosplit(path='../coco128', weights=(0.9, 0.1, 0.0)):  # from utils.datasets import *; autosplit('../coco128')
    """ Autosplit a dataset into train/val/test splits and save path/autosplit_*.txt files
    # Arguments
        path:    Path to images directory
        weights: Train, val, test weights (list)
    """
    path = Path(path)  # images dir
    files = list(path.rglob('*.*'))
    n = len(files)  # number of files
    indices = random.choices([0, 1, 2], weights=weights, k=n)  # assign each image to a split
    txt = ['autosplit_train.txt', 'autosplit_val.txt', 'autosplit_test.txt']  # 3 txt files
    [(path / x).unlink() for x in txt if (path / x).exists()]  # remove existing
    for i, img in tqdm(zip(indices, files), total=n):
        if img.suffix[1:] in img_formats:
            with open(path / txt[i], 'a') as f:
                f.write(str(img) + '\n')  # add image to txt file
```

(Note that random.choices assigns each file to a split independently at random, so the assignment does not depend on file order; shuffling the file list first would not change the resulting distribution.)

Help me!

Thank you for your contribution.

```
Traceback (most recent call last):
  File "/content/flexible-yolov5/scripts/train.py", line 527, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "/content/flexible-yolov5/scripts/train.py", line 77, in train
    ckpt = torch.load(weights, map_location=device)  # load checkpoint
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 607, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 882, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 875, in find_class
    return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'models'
```

Please help

Hello, I want to use the RepVGG network as the backbone; which parts should I modify?

Cannot export an ONNX model with mobilenetv3 as the backbone

Hello, and thanks for the open-source project. I trained with mobilenetv3 as the backbone and the accuracy looks fine, but after training, running deploy/export.py generates no ONNX file and reports no error. Is there a way to solve this?

How to replace the yolo backbone with mobilenetv3?

Hi, can you please tell me how to assign a specific backbone (e.g. mobilenetv3) to the model?
I have read models.py under /od/ but still have no idea how to assign it.

Which of model.py, train.py, or __init__.py under /backbone needs to be modified for training?

Please take a look at "There appear to be 6 leaked semaphores to clean up at shutdown"

First of all, much respect. I was training on my own dataset on Colab and got the following error:
fatal: ambiguous argument 'main..origin/master': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git [...] -- [...]'
github: Command 'git rev-list main..origin/master --count' returned non-zero exit status 128.
YOLOv5 c5b7925 torch 1.9.0+cu102 CPU

Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='configs/model_efficientnet.yaml', data='configs/data.yaml', device='', epochs=100, evolve=False, exist_ok=False, global_rank=-1, hyp='configs/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], linear_lr=False, local_rank=-1, log_artifacts=False, log_imgs=16, multi_scale=False, name='exp', noautoanchor=False, nosave=False, notest=False, project='runs/train', quad=False, rect=False, resume=False, save_dir='runs/train/exp5', single_cls=False, sync_bn=False, total_batch_size=16, weights='', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/train", view at http://localhost:6006/
2021-07-27 07:36:17.237677: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
./od/models/model.py:22: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
model_config = yaml.load(open(model_config, 'r'))
FPN input channel size: C3 88, C4 248, C5 2816
FPN output channel size: P3 344, P4 256, P5 2816
PAN input channel size: P3 344, P4 256, P5 2816
PAN output channel size: PP3 256, PP4 512, PP5 1024
Scaled weight_decay = 0.0005
Optimizer groups: 345 .bias, 345 conv.weight, 220 other
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose 'Don't visualize my results'
2021-07-27 07:36:39.867593: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
wandb: W&B syncing is set to offline in this directory. Run wandb online or set WANDB_MODE=online to enable cloud syncing.
train: Scanning '/content/Mydata/labels/train' for images and labels... 4302 found, 2 missing, 0 empty, 0 corrupted: 100% 4304/4304 [00:04<00:00, 927.71it/s]
train: New cache created: /content/Mydata/labels/train.cache
val: Scanning '/content/Mydata/labels/valid' for images and labels... 1076 found, 0 missing, 0 empty, 0 corrupted: 100% 1076/1076 [00:01<00:00, 942.23it/s]
val: New cache created: /content/Mydata/labels/valid.cache
Plotting labels...

autoanchor: Analyzing anchors... anchors/target = 4.52, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 2 dataloader workers
Logging results to runs/train/exp5
Starting training for 100 epochs...

 Epoch   gpu_mem       box       obj       cls     total   targets  img_size

0% 0/269 [00:00<?, ?it/s]tcmalloc: large alloc 1258291200 bytes == 0x55c498936000 @ 0x7f72d340db6b 0x7f72d342d379 0x7f726a07526e 0x7f726a0769e2 0x7f72adec39f8 0x7f72adead359 0x7f72adeba1bf 0x7f72adebb5a7 0x7f72adeb5dbb 0x7f72adeb64c7 0x7f72ae51bc62 0x7f72ae36d57b 0x7f72af9b8c01 0x7f72af9b9392 0x7f72adfe156d 0x7f72ada78518 0x7f72ae58e2ba 0x7f72adfdba7b 0x7f72ada711db 0x7f72ae58e21a 0x7f72adfd9fc5 0x7f72ada70daa 0x7f72ae58e552 0x7f72adfe087d 0x7f72c0975026 0x55c25a09b010 0x55c25a09ada0 0x55c25a10f2f9 0x55c25a09cb99 0x55c25a09d1f1 0x55c25a10c318
tcmalloc: large alloc 1258291200 bytes == 0x55c4e3936000 @ 0x7f72d340db6b 0x7f72d342d379 0x7f726a07526e 0x7f726a0769e2 0x7f72ad8e0b49 0x7f72ad8e1897 0x7f72adcbdd89 0x7f72ae422b9a 0x7f72ae405cbe 0x7f72ae00aa05 0x7f72adece86a 0x7f72adeb6594 0x7f72ae51bc62 0x7f72ae36d57b 0x7f72af9b8c01 0x7f72af9b9392 0x7f72adfe156d 0x7f72ada78518 0x7f72ae58e2ba 0x7f72adfdba7b 0x7f72ada711db 0x7f72ae58e21a 0x7f72adfd9fc5 0x7f72ada70daa 0x7f72ae58e552 0x7f72adfe087d 0x7f72c0975026 0x55c25a09b010 0x55c25a09ada0 0x55c25a10f2f9 0x55c25a09cb99
/usr/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
len(cache))
^C

Got 'Segmentation fault' after training 1/299 epochs: 'resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown'

2 × 8 GB GPUs. I ran: python scripts/train.py --batch 16 --epochs 300 --cfg configs/model_mobilenet.yaml
Epoch 0 finished; during epoch 1 I got the warnings below and training stopped automatically.

Epoch gpu_mem box obj cls total targets img_size
0/299 5.55G 0.09329 0.01881 0 0.1121 11 640: 100%|██████████| 836/836 [05:06<00:00, 2.73it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 74/74 [00:35<00:00, 2.11it/s]
all 2.36e+03 2.81e+03 0.000155 0.0121 7.23e-05 1.11e-05
Images sizes do not match. This will causes images to be display incorrectly in the UI.

 Epoch   gpu_mem       box       obj       cls     total   targets  img_size
 1/299     5.53G   0.08231   0.02054         0    0.1029        40       640:  36%|███▌      | 303/836 [01:46<03:02,  2.91it/s]Segmentation fault

(base) user @Debian:~/anaconda3/envs/ultra_YOLOv5/flexible-yolov5-main$ /home/user /anaconda3/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

How to add attention mechanisms to the network

How can attention mechanisms such as SE and CBAM be added to the backbone network? Do I modify the code in the backbone, and how? Does the .yaml file need to change?

Enabling the attention mechanism in resnet50

When enabling the attention mechanism for resnet50, besides turning on the cbam=True line in resnet.py, I don't know what else to modify; I keep getting errors that look like a dimension mismatch.
