Comments (30)
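@wjli-debug That probably won't work: it effectively removes the skip connections, and a deep network then becomes untrainable. Think about why ResNet was proposed in the first place.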
from flexible-yolov5.
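So is the right approach to keep the deploy parameter set to False throughout training and convert to the inference structure only after training finishes, rather than passing deploy=True when the RepVGG structure is created?
I ask because the official RepVGG code provides a switch_to_deploy conversion, which suggests keeping deploy=False during training (that is, training with all branches) and running a separate conversion to the single-path structure once training is done.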
@wjli-debug Yes. The network is designed this way to speed up inference. During training the residual connections still matter a lot; without them a deep network fails to train.
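The branches can be trained and then removed because convolution is linear: parallel convolution branches plus an identity shortcut sum to one equivalent kernel. The sketch below is a simplified 1-D analogue in plain Python, not the RepVGG implementation. It leaves out BatchNorm (which the real conversion also folds into the kernel and bias), and all function names are made up for illustration.

```python
# 1-D analogue of RepVGG re-parameterization in plain Python (no BatchNorm).

def conv1d_same(x, kernel):
    """Cross-correlate x with an odd-length kernel, zero-padding to keep the length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def multi_branch(x, k3, k1):
    """Training-time structure: 3-tap branch + 1-tap branch + identity shortcut."""
    y3 = conv1d_same(x, k3)
    y1 = conv1d_same(x, [k1])
    return [a + b + c for a, b, c in zip(y3, y1, x)]


def fuse_branches(k3, k1):
    """Fold the 1-tap kernel and the identity into the centre tap of the 3-tap kernel."""
    return [k3[0], k3[1] + k1 + 1.0, k3[2]]


x = [1.0, 2.0, -1.0, 0.5]
k3 = [0.5, -1.0, 0.25]
k1 = 2.0

train_out = multi_branch(x, k3, k1)                   # three branches
deploy_out = conv1d_same(x, fuse_branches(k3, k1))    # one fused kernel

# Both structures compute the same function of the input.
assert all(abs(a - b) < 1e-9 for a, b in zip(train_out, deploy_out))
```

The fused kernel is computed from the trained branch weights, which is why the conversion has to happen after training rather than by constructing the network with deploy=True from the start.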
But converting the best.pt saved from RepVGG training into the inference structure keeps failing. Whether I use the official conversion code or write my own and load the weights before calling it, I get the error below. Could the author take a look at what causes this when you have time?
Traceback (most recent call last):
  File "convert_1.py", line 6, in <module>
    model.load_state_dict(torch.load('best.pt'))
  File "/data1/docker_project/ENV/flex_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RepVGG:
The error says that many parameters are missing from the saved weights. What causes this?
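One plausible cause of the RuntimeError above, assuming best.pt was written by a YOLOv5-style train.py: the file then holds a dict of training metadata with the whole model object under 'model', not a bare state dict, so passing torch.load('best.pt') straight to load_state_dict finds none of the expected parameter names. A minimal sketch of unwrapping such a checkpoint follows; extract_state_dict is a hypothetical helper, not part of this repo or of PyTorch.

```python
from collections.abc import Mapping


def extract_state_dict(checkpoint, key="model"):
    """Return a flat name -> value mapping from a few common checkpoint layouts."""
    obj = checkpoint
    # Training checkpoints are often dicts of metadata ('epoch', 'optimizer', ...)
    # with the network itself stored under one entry (assumed here to be 'model').
    if isinstance(obj, Mapping) and key in obj:
        obj = obj[key]
    # A pickled module (or anything module-like) exposes state_dict().
    if hasattr(obj, "state_dict"):
        obj = obj.state_dict()
    if not isinstance(obj, Mapping):
        raise TypeError(f"cannot find a state dict in {type(checkpoint).__name__}")
    return dict(obj)


# Hypothetical usage with PyTorch (not executed here). Recent PyTorch versions
# may need weights_only=False to unpickle a whole model object.
#   ckpt = torch.load("best.pt", map_location="cpu")
#   state = extract_state_dict(ckpt)
#   print(list(state)[:5])   # inspect the key names before loading
```

Even after unwrapping, the checkpoint stores the full detector, so a standalone RepVGG would need only the backbone's entries, and the key names still have to match the model they are loaded into.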
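Send me the weights.
Link: https://pan.baidu.com/s/1Evl3vG_hmPOJU_G-g76S-w
Extraction code: qwer
Yesterday I compared the two weight files: best.pt was produced by training with the yolov5 framework, and the other is the weight file provided by the official RepVGG repo. Their structures differ. I tried changing the code in train.py just before best is saved, as follows: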
if (not nosave) or (final_epoch and not evolve):  # if save
    ckpt = {
        'epoch': epoch,
        'best_fitness': best_fitness,
        'model': deepcopy(de_parallel(model)),
        'ema': deepcopy(ema.ema),
        'updates': ema.updates,
        'optimizer': optimizer.state_dict(),
        'wandb_id': loggers.wandb.wandb_run.id if loggers.wandb else None,
        'date': datetime.now().isoformat()}
    # Save last, best and delete
    torch.save(ckpt, last)
    if best_fitness == fi:
        # Switch to deploy mode
        model.switch_to_pretrained()  # Add this line
        torch.save(ckpt, best)
    if opt.save_period > 0 and epoch % opt.save_period == 0:
        torch.save(ckpt, w / f'epoch{epoch}.pt')
    del ckpt
    callbacks.run('on_model_save', last, epoch, final_epoch, best_fitness, fi)
I call model.switch_to_pretrained() before saving to do the conversion. I am retraining now and am not sure whether it will work.
I can't access Baidu Cloud from our intranet; please use email instead: [email protected]
backbone = get_RepVGG_func_by_name('RepVGG-A0')()
pretrained_dict = torch.load('../best.pt')
backbone_state_dict = pretrained_dict['model'].backbone.state_dict()
print(backbone_state_dict)
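The code above reads the backbone parameters of the model stored in best.pt. If I convert only the model's backbone directly, will that break the structure of the whole model in best.pt?
What do you mean by "convert"?
Uh, it didn't work. Some bugs still showed up, so this approach doesn't seem viable.
By "convert" I mean turning the multi-branch structure into a single-path one.
Isn't the trained backbone a multi-branch structure? After saving, I want to convert just the backbone into single-path weights and then assign them back to the model's backbone.
My understanding is that this won't work; the state dicts most likely won't match.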
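Does the author have any suggestions or an approach?
I'm training a network now and will then try converting the structure.
OK, thanks for the trouble.
I just ran a training, and exporting the saved weights to ONNX in deploy mode works fine: first build the network with the branches, then load the weights, then call the re-parameterization interface. I'll update export_onnx.py.
One more thing: if you want to load someone else's pretrained weights, check how their state dict was saved, with branches or without. You need to build the model in the matching state before loading their pretrained weights.
OK, thanks. I'll give it a try and see how it goes.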
I ran it with deploy=True, but the code below doesn't seem to re-parameterize all the branches into 3x3 convolutions:
if deploy:
    for name, module in model.named_modules():
        if hasattr(module, 'switch_to_deploy'):
            module.switch_to_deploy()
The model is exported as a deploy model, but it is still multi-branch. RepVGG should be single-branch when exported for inference, so the code above had no effect.
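I found the reason: loading the pretrained weights fails because their keys are named differently from mine. Passing strict=False, as in load_state_dict(..., strict=False), turns off the name-matching check.
switch_to_deploy does not exist by default; I just added it. The default is switch_to_pretrain, which doesn't fit deployment naming, so I added a function.
Where should load_state_dict(..., strict=False) be added? Is it for loading best.pt separately and converting it?
I don't quite follow which part the author means; the code you provided has no load_state_dict call.
The brute-force way: at line 129 of train.py, add model.backbone.load_state_dict(torch.load('<downloaded pretrained weights>'), strict=False). As long as the weight shapes are fine, they load directly.
Loading a pretrained model does need a small change: the keys of the downloaded pretrained model have to be renamed to the ones this repo uses. I'll look into it later when I have time. @wjli-debug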
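Note that strict=False only stops load_state_dict from raising on missing or unexpected keys; entries whose names do not match are skipped rather than loaded, and a shape mismatch on a matching name still raises. So the key renaming mentioned above matters. A minimal sketch in plain Python, with hypothetical key names and prefixes (the real ones depend on the downloaded file and on this repo's backbone):

```python
def remap_keys(pretrained, target_keys, rename):
    """Rename checkpoint keys by prefix and report what still does not line up."""
    remapped = {}
    for name, value in pretrained.items():
        for old, new in rename.items():
            if name.startswith(old):
                name = new + name[len(old):]
                break
        remapped[name] = value
    target = set(target_keys)
    missing = sorted(target - set(remapped))      # expected by the model, absent in the file
    unexpected = sorted(set(remapped) - target)   # present in the file, unknown to the model
    return remapped, missing, unexpected


# Hypothetical names, for illustration only.
pretrained = {
    "stage0.conv.weight": "w0",
    "stage1.conv.weight": "w1",
    "linear.weight": "fc",
}
target_keys = ["layer0.conv.weight", "layer1.conv.weight"]
rename = {"stage0.": "layer0.", "stage1.": "layer1."}

remapped, missing, unexpected = remap_keys(pretrained, target_keys, rename)
print("missing:", missing)
print("unexpected:", unexpected)
```

With PyTorch, the value returned by load_state_dict(remapped, strict=False) exposes missing_keys and unexpected_keys, which is a quick way to confirm that the weights were actually picked up.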
OK, sounds good.