
deepvac's Introduction

DeepVAC

DeepVAC provides engineering standards for PyTorch-based AI projects.

The internal logic of most PyTorch AI projects is much the same. DeepVAC therefore factors the common logic out, making project code more correct, readable, and maintainable.

To make an AI project conform to the DeepVAC standard, read the DeepVAC standard carefully. To understand the design of the deepvac library, read the deepvac library design doc.

How to build your own PyTorch AI project on DeepVAC

1. Read the DeepVAC standard

A quick skim is fine at this stage, just to form a first impression.

2. Prepare the environment

DeepVAC's dependencies are:

  • Python3 (Python 2 is not supported; it has been deprecated);
  • packages: torch, torchvision, tensorboard, scipy, numpy, cv2, Pillow.

Install these dependencies yourself with the pip command (or git clone); the details are not covered here.

For most users, the most convenient and efficient way to use DeepVAC is with MLab HomePod, a prebuilt Docker image that spares users unnecessary environment-setup time. Inside the MLab organization, we also use MLab HomePod for day-to-day model training.

3. Install the deepvac library

Install with pip:
pip3 install deepvac
or
python3 -m pip install deepvac

If you need the latest deepvac code from GitHub, use developer mode instead:

Developer mode

  • Clone the project locally: git clone https://github.com/DeepVAC/deepvac
  • Add the following to your entry file:
import sys
#replace with your local deepvac directory
sys.path.insert(0,'/home/gemfield/github/deepvac')

Or set the PYTHONPATH environment variable:

export PYTHONPATH=/home/gemfield/github/deepvac

4. Create your own PyTorch project

  • initialize your project's git repository;
  • create the first research branch in the repository, e.g. LTS_b1_aug9_movie_video_plate_130w;
  • switch to that LTS_b1 branch and start working.

5. Write the configuration file

The configuration file is always named config.py and lives in your project's root directory. Add from deepvac import new, AttrDict at the top of the file; all user configuration lives in this file. The config module provides six predefined scopes: config.core, config.aug, config.cast, config.datasets, config.backbones, config.loss. Usage:

  • all trainer-related configuration (train, val, and test) goes in config.core.<my_train_class>;
  • all configuration for augmentation modules in deepvac.aug goes in config.aug.<my_aug_class>;
  • all model-conversion configuration goes in config.cast.<the_caster_class>;
  • all dataset-related configuration goes in config.datasets.<my_dataset_class>;
  • all loss-related configuration goes in config.loss.<my_loss_class>;
  • you can open your own scope, e.g. config.my_stuff = AttrDict(), then config.my_stuff.name = 'gemfield';
  • use new() to initialize a config instance and clone() to deep-copy a config entry.
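Under these conventions, a minimal config.py might look like the sketch below. To keep the snippet self-contained, AttrDict is stubbed inline; in a real project it (and new()) would come from `from deepvac import new, AttrDict`, and the scope names MyTrain and my_stuff are illustrative.

```python
# Inline stand-in for deepvac's AttrDict so the sketch is self-contained;
# a real config.py would instead start with: from deepvac import new, AttrDict
class AttrDict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

config = AttrDict()
# the six predefined scopes
for scope in ('core', 'aug', 'cast', 'datasets', 'backbones', 'loss'):
    setattr(config, scope, AttrDict())

# trainer-related settings live under config.core.<my_train_class>
config.core.MyTrain = AttrDict()
config.core.MyTrain.batch_size = 32
config.core.MyTrain.log_dir = './log'

# a user-defined scope
config.my_stuff = AttrDict()
config.my_stuff.name = 'gemfield'
```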

More configuration topics:

  • loading pretrained models;
  • loading checkpoints;
  • using tensorboard;
  • using TorchScript;
  • converting to ONNX;
  • converting to NCNN;
  • converting to CoreML;
  • converting to TensorRT;
  • converting to TNN;
  • converting to MNN;
  • enabling quantization;
  • enabling EMA;
  • enabling automatic mixed precision (AMP) training.

For these, and for a more detailed explanation of the configuration file, read the config documentation.

Reference config.py from train.py in the project root as follows:

from config import config as deepvac_config
from deepvac import DeepvacTrain

class MyTrain(DeepvacTrain):
    ...

my_train = MyTrain(deepvac_config)
my_train()

Reference config.py from test.py in the project root as follows:

from config import config as deepvac_config
from deepvac import Deepvac

class MyTest(Deepvac):
    ...

my_test = MyTest(deepvac_config)
my_test()

Afterwards, train.py/test.py read and write config.core entries as follows:

print(self.config.log_dir)
print(self.config.batch_size)
......

In addition, given config's central role, deepvac also provides the following APIs to make working with the config module easier:

  • AttrDict
  • new
  • interpret
  • fork
  • clone
from deepvac import AttrDict, new, interpret, fork

For how to use these APIs, see the config API documentation.

6. Write synthesis/synthesis.py (optional)

This file generates the project's dataset and automates checking and cleaning it. This step is optional; if you need it, the Synthesis2D project under the DeepVAC organization is a reference implementation.

7. Write aug/aug.py (optional)

This file implements the data augmentation strategy. The deepvac.aug module defines its own syntax for data augmentation and enables reuse at two levels: aug and composer. For example, to reuse SpeckleAug, which adds random speckles:

from deepvac.aug.base_aug import SpeckleAug

This reuses a low-level aug operator. We can also reuse a composer someone else has written, just as directly. For example, deepvac.aug provides RetinaAugComposer for face-detection data augmentation:

from deepvac.aug import RetinaAugComposer

The above is direct reuse, but projects more often need custom extensions, and in most cases also need to reuse torchvision's transform compose. What then? To clarify the terminology first: composer is a deepvac.aug concept, while compose is a torchvision transform concept; the similarity of the names is pure coincidence.

Extending your own composer is also simple. For example, I can define a custom composer (call it GemfieldComposer) that uses/reuses any of the following augmentation logic:

  • composes defined by torchvision transforms;
  • deepvac's built-in aug operators;
  • aug operators I wrote myself.

For more detailed steps, see the deepvac.aug module guide.
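The aug/composer reuse described above boils down to chaining callables: every aug maps a sample to a sample, and a composer applies a list of them in order. Below is a generic sketch of that pattern; it is not the real deepvac.aug API, and the operator names are made up for illustration.

```python
# Generic composer sketch (NOT the real deepvac.aug API): a composer
# simply chains aug callables, each mapping a sample to a sample.
class Composer:
    def __init__(self, augs):
        self.augs = list(augs)

    def __call__(self, sample):
        for aug in self.augs:
            sample = aug(sample)
        return sample

# Toy stand-ins for real aug operators (e.g. SpeckleAug) or torchvision
# transforms, which are likewise sample -> sample callables.
def double(sample):
    return sample * 2

def add_one(sample):
    return sample + 1

gemfield_composer = Composer([double, add_one])
```

Calling gemfield_composer(x) applies double and then add_one, the same pipeline shape a GemfieldComposer mixing torchvision composes and deepvac augs would have.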

8. Write the Dataset class

The code goes in data/dataloader.py. Inherit from the deepvac.datasets class hierarchy; for example, FileLineDataset wraps a train.txt of the following format:

#train.txt: first column is the image path, second column is the label
img0/1.jpg 0
img0/2.jpg 0
...
img1/0.jpg 1
...
img2/0.jpg 2
...

Sometimes the second column is a string, and you may also want to replace FileLineDataset's Pillow-based image reading with cv2. Both can be done by subclassing:

import cv2

from deepvac.datasets import FileLineDataset

class FileLineCvStrDataset(FileLineDataset):
    def _buildLabelFromLine(self, line):
        line = line.strip().split(" ")
        return [line[0], line[1]]

    def _buildSampleFromPath(self, abs_path):
        #override the default Pillow Image loader with cv2
        sample = cv2.imread(abs_path)
        sample = self.compose(sample)
        return sample

Incidentally, FileLineCvStrDataset is also already provided in deepvac.datasets.
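The parsing hook in the subclass above is easy to exercise in isolation. Here is a self-contained sketch of the same _buildLabelFromLine logic, with a stub base class standing in for deepvac.datasets.FileLineDataset:

```python
# Stub standing in for deepvac.datasets.FileLineDataset, for illustration only.
class FileLineDataset:
    pass

class FileLineCvStrDataset(FileLineDataset):
    def _buildLabelFromLine(self, line):
        # "img0/1.jpg cat" -> ["img0/1.jpg", "cat"]
        line = line.strip().split(" ")
        return [line[0], line[1]]

ds = FileLineCvStrDataset()
label = ds._buildLabelFromLine("img0/1.jpg cat\n")
```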

9. Write the training and validation script

In the DeepVAC standard, train.py embodies the training paradigm. Training code goes in train.py, inheriting from the DeepvacTrain class:

from deepvac import DeepvacTrain

class MyTrain(DeepvacTrain):
    pass

A subclass of DeepvacTrain may need to reimplement the following methods before training can start:

| Method (* means users generally must reimplement it) | Purpose | Notes |
|---|---|---|
| preEpoch | user hook before each epoch; DeepvacTrain itself does nothing | redefine if needed |
| preIter | user hook before each batch iteration; DeepvacTrain itself does nothing | redefine if needed |
| postIter | user hook after each batch iteration; DeepvacTrain itself does nothing | redefine if needed |
| postEpoch | user hook after each epoch; DeepvacTrain itself does nothing | redefine if needed |
| doFeedData2Device | DeepvacTrain moves the sample and target (label) from the dataloader to the device | redefine if needed |
| doForward | DeepvacTrain runs network inference and assigns the result to self.config.output | redefine if needed |
| doLoss | DeepvacTrain computes this iteration's loss from self.config.output and self.config.target | redefine if needed |
| doBackward | backpropagation; DeepvacTrain calls self.config.loss.backward() | redefine if needed |
| doOptimize | weight update; DeepvacTrain calls self.config.optimizer.step() | redefine if needed |
| doSchedule | learning-rate update; DeepvacTrain calls self.config.scheduler.step() | redefine if needed |
| * doValAcc | computes the model's accuracy in val mode; DeepvacTrain itself does nothing | users generally must redefine it; tensorboard logging depends on this |

A typical implementation:

class MyTrain(DeepvacTrain):
    ...
    #the base class cannot handle list-type labels, so override this method
    def doFeedData2Device(self):
        self.config.target = [anno.to(self.config.device) for anno in self.config.target]
        self.config.sample = self.config.sample.to(self.config.device)

    #initialize config.core.acc
    def doValAcc(self):
        self.config.acc = your_acc
        LOG.logI('Test accuracy: {:.4f}'.format(self.config.acc))


train = MyTrain(deepvac_config)
train()
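The hook order in the table above can be made concrete with a miniature trainer skeleton. This is a simplified sketch, not the real DeepvacTrain: it only records the sequence in which the hooks fire, and the exact placement of doSchedule relative to postEpoch is illustrative.

```python
# Miniature sketch of the DeepvacTrain-style hook lifecycle (not the real class).
class MiniTrainer:
    def __init__(self, batches_per_epoch):
        self.batches_per_epoch = batches_per_epoch
        self.calls = []  # records the order in which hooks fire

    def _hook(self, name):
        self.calls.append(name)

    def __call__(self, epochs=1):
        for _ in range(epochs):
            self._hook('preEpoch')
            for _ in range(self.batches_per_epoch):
                # per-iteration hooks, in the order the table describes
                for name in ('preIter', 'doFeedData2Device', 'doForward',
                             'doLoss', 'doBackward', 'doOptimize', 'postIter'):
                    self._hook(name)
            self._hook('doSchedule')   # placement here is illustrative
            self._hook('postEpoch')

trainer = MiniTrainer(batches_per_epoch=2)
trainer(epochs=1)
```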

10. Write the test script

In the DeepVAC standard, test.py embodies the testing paradigm. Test code goes in test.py, inheriting from the Deepvac class.

The essential differences from train/val in train.py:

  • the train/val context is discarded;
  • the network no longer runs inside the autograd context;
  • no loss, backward, or optimization computation is performed;
  • Deepvac's *Report modules measure accuracy and speed.

A subclass of Deepvac must (re)implement the following methods before testing can start:

| Method (* means it must be reimplemented) | Purpose | Notes |
|---|---|---|
| preIter | user hook before each batch iteration; Deepvac itself does nothing | redefine if needed |
| postIter | user hook after each batch iteration; Deepvac itself does nothing | redefine if needed |
| doFeedData2Device | Deepvac moves the sample and target (label) from the dataloader to the device | redefine if needed |
| doForward | Deepvac runs network inference and assigns the result to self.config.output | redefine if needed |
| doTest | fully user-defined test logic; results can be added with report.add(gt, pred) to generate a report | see the test logic below |

A typical implementation:

class MyTest(Deepvac):
    ...
    def doTest(self):
        ...

test = MyTest(deepvac_config)
test()
#test(input_tensor)

When test() is executed, the DeepVAC framework tests according to the following priority:

  • if the user passed an argument, e.g. test(input_tensor), doFeedData2Device + doForward run on that input_tensor and testing ends;
  • otherwise, if the user overrode doTest(), doTest() runs and testing ends;
  • otherwise, if the user configured config.my_test.test_loader, the loader is iterated, doFeedData2Device + doForward run on each sample, and testing ends;
  • if none of the above applies, an error is raised and the program exits.
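The priority list above amounts to a small dispatch, sketched here in plain Python (a simplification, not the real Deepvac internals; the function and argument names are made up):

```python
# Sketch of the test() dispatch priority (not the real Deepvac implementation).
def run_test(input_tensor=None, do_test=None, test_loader=None):
    # 1. an explicit input wins: doFeedData2Device + doForward on it, then stop
    if input_tensor is not None:
        return 'forward(input_tensor)'
    # 2. otherwise a user-defined doTest() runs, then stop
    if do_test is not None:
        return do_test()
    # 3. otherwise iterate config.my_test.test_loader, forwarding each sample
    if test_loader is not None:
        return ['forward(sample)' for _ in test_loader]
    # 4. nothing matched: error out
    raise RuntimeError('nothing to test: no input, no doTest, no test_loader')
```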

DeepVAC community products

| Product | Description | Current version | How to get / deployment form |
|---|---|---|---|
| DeepVAC | a distinctive PyTorch engineering standard | 0.6.0 | pip install deepvac |
| libdeepvac | a distinctive PyTorch model deployment framework | 1.9.0 | SDK, download & extract |
| MLab HomePod | the most advanced containerized PyTorch model training environment to date | 2.0 | docker run / k8s |
| MLab RookPod | the most advanced storage solution under 100,000 RMB to date | NA | hardware spec + k8s yaml |
| pyRBAC | a Python RBAC implementation based on Keycloak | NA | pip install (coming soon) |
| DeepVAC PyTorch | a PyTorch build customized for MLab HomePod pro | 1.9.0 | conda install -c gemfield pytorch |
| DeepVAC LibTorch | a LibTorch build customized for libdeepvac | 1.9.0 | archive, download & extract |


deepvac's Issues

[BUG] CoreML model conversion: bilinear-interpolation upsampling issue

Bug description
When converting to CoreML, if the code contains PyTorch bilinear-interpolation upsampling, the conversion errors out targeting iOS 13 but runs successfully targeting iOS 14.

How to reproduce
Steps to reproduce:

# -*- coding:utf-8 -*-
import torch
from torch.nn import functional as F


class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        ...

    def forward(self, x):
        return F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=True)


if __name__ == "__main__":
    import coremltools

    net = MyModule()
    sample = torch.randn(1, 3, 288, 288)
    net = torch.jit.trace(net, sample).eval()

    sample = coremltools.TensorType(name="input", shape=(1, 3, 288, 288))
    coreml_model = coremltools.convert(model=net, inputs=[sample], minimum_deployment_target=coremltools.target.iOS13)

Error message

WARNING:root:scikit-learn version 0.22.2.post1 is not supported. Minimum required version: 0.17. Maximum required version: 0.19.2. Disabling scikit-learn conversion API.
2021-08-11 16:39:44.386794: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:root:TensorFlow version 2.4.1 detected. Last version known to be fully compatible is 2.3.1 .
WARNING:root:Keras version 2.4.3 detected. Last version known to be fully compatible of Keras is 2.2.4 .
Converting Frontend ==> MIL Ops:  83%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                              | 5/6 [00:00<00:00, 4505.16 ops/s]
Running MIL optimization passes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 16908.73 passes/s]
Translating MIL ==> MLModel Ops: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 20213.51 ops/s]
Traceback (most recent call last):
  File "converter.py", line 23, in <module>
    coreml_model = coremltools.convert(model=net, inputs=[sample], minimum_deployment_target=coremltools.target.iOS13)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 189, in convert
    check_deployment_compatibility(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/_deployment_compatibility.py", line 140, in check_deployment_compatibility
    raise ValueError(msg)
ValueError: Provided minimum deployment target requires model to be of version 4 but converted model uses following features which are available from version 5 onwards.
    1. Upsample operation with Align Corners mode

Environment

  • Host cpu/ram/cuda device: [e.g. intel i5-9300H/8GB/GTX 1650]
  • Host OS/kernel/GPU driver: [e.g. ubuntu 20.04/5.4.0-80-generic/460.91.03]
  • coremltools == 4.1
  • torch == 1.8.1
  • python == 3.8.10

[BUG]Error Code 4: Internal Error (Internal error: plugin node ScatterND_12 requires 36 bytes of scratch space, but only 0 is available.

Bug description
When converting yolov5 to TensorRT, the error occurs if either of the following appears in the network:

  • an indexing operation on the left-hand side of an assignment;
  • a slicing operation on the left-hand side of an assignment.

How to reproduce
Steps to reproduce:

  1. minimal code
import torch

# init module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        ...

    def forward(self, x):
        x[0] += 1
        return x

torch_model = MyModule()

# torch.onnx.export
torch.onnx.export(torch_model,
    torch.randn(1, 3, 256, 416),
    "./tmp.onnx",
    input_names=["inputs"],
    output_names=["outputs"],
    opset_version=11)

# onnx simplify
import os
import onnx
from onnxsim import simplify

onnx_file = os.path.join(os.getcwd(), "tmp.onnx")
model_op, check_ok = simplify(onnx_file, 
    check_n=3, 
    perform_optimization=True, 
    skip_fuse_bn=True,  
    skip_shape_inference=False, 
    input_shapes={"inputs": (1, 3, 256, 416)}, 
    skipped_optimizers=None, 
    )
onnx.save(model_op, "./tmp.onnx")

# onnx -> tensorrt
# !!!
# you should build tensorrt first
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with trt.Builder(TRT_LOGGER) as builder, builder.create_network(1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
    with open(onnx_file, 'rb') as model:
        parser.parse(model.read())

    config = builder.create_builder_config()
    engine = builder.build_engine(network, config)

    with open("tmp.trt", "wb") as f:
        f.write(engine.serialize())
  2. and here is the error...
Checking 0/3...
Checking 1/3...
Checking 2/3...
[TensorRT] WARNING: onnx2trt_utils.cpp:320: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
mini_code.py:52: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: 4: [pluginV2Builder.cpp::makeRunner::680] Error Code 4: Internal Error (Internal error: plugin node ScatterND_12 requires 36 bytes of scratch space, but only 0 is available. Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
)
Traceback (most recent call last):
  File "mini_code.py", line 55, in <module>
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'

Expected result

Checking 0/3...
Checking 1/3...
Checking 2/3...
mini_code.py:53: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1

Screenshots

If you are using MLab HomePod, fill in:

  • Host cpu/ram/cuda device: [e.g. intel i5-9300H/8GB/GTX1650]
  • Host OS/kernel/GPU driver: [e.g. ubuntu 20.04/5.4.0-77-generic/460.80]
  • MLab HomePod version [e.g. 2.0-pro]

Additional context

# init module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        ...

    def forward(self, x):
        y = x[:1] + 1
        return y

This case does not error.
Tentative guess: TensorRT's onnxParser errors whenever an index or slice operation appears on the left-hand side of an assignment?

temporary: the only valid use of a module is looking up an attribute but found = prim::SetAttr[name="output"](%self, %x.1)

🐞Describe the bug

torch_model -> torch.jit.script -> coreml
I got this error while trying to setattr in forward.

Trace

Traceback (most recent call last):
  File "mini_code.py", line 21, in <module>
    model = ct.convert(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 175, in convert
    mlmodel = mil_convert(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
    proto = mil_convert_to_proto(model, convert_from, convert_to,
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
    return load(*args, **kwargs)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 70, in load
    converter = TorchConverter(torchscript, inputs, outputs, cut_at_symbols)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 145, in __init__
    raw_graph, params_dict = self._expand_and_optimize_ir(self.torchscript)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 262, in _expand_and_optimize_ir
    graph, params = _torch._C._jit_pass_lower_graph(
RuntimeError: 
temporary: the only valid use of a module is looking up an attribute but found  = prim::SetAttr[name="output"](%self, %x.1)
:

To Reproduce

import torch
import coremltools as ct

# init maxpool module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.output: torch.Tensor = torch.empty(1)

    def forward(self, x):
        self.output = x
        return 

torch_model = MyModule()

# script
script_model = torch.jit.script(torch_model)

# Convert to Core ML using the Unified Conversion API
model = ct.convert(
    script_model,
    inputs=[ct.ImageType(name="input", shape=(1, 3, 224, 224))],
)

System environment (please complete the following information):

  • coremltools version (e.g., 3.0b5): 4.1
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • pytorch version: 1.9.0
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2

[BUG] Converted TNN model runs slower on opencl (GPU) than on CPU on Huawei Kirin processors

Bug description
We converted an ESP network to a TNN model and deployed it on a Huawei phone and on a Snapdragon phone. On the Snapdragon phone, GPU/opencl inference is twice as fast as CPU inference; on the Huawei Kirin 980 phone, GPU/opencl inference is slower than CPU inference (dropping from 13 fps to 10 fps).

How to reproduce
Steps to reproduce:

  1. configure the ESP network in config.py and switch on TNN model conversion;
  2. run test.py to output the TNN model;
  3. integrate it into an Android project and install it on a Huawei Kirin 980 phone;
  4. measure the fps with camera input.

Expected result
On the Huawei Kirin 980 phone, GPU/opencl inference should be at least as fast as CPU inference.

Screenshots
Add screenshots if necessary.

If you are using MLab HomePod, fill in:

  • Host cpu/ram/cuda device: intel i9-9820X/32GB/RTX2080ti
  • Host OS/kernel/GPU driver: ubuntu 20.04/5.4.0-74-generic/460.80
  • MLab HomePod version: 2.0-pro

Issue titles should start with the [BUG] placeholder

Bug description
Describe the bug.

How to reproduce
Steps to reproduce:

  1. step 1 xxxx
  2. step 2 xxxx
  3. step 3 xxxx
  4. and here is the error...

Expected result
Describe what you originally expected to happen.

Screenshots
Add screenshots if necessary.

If you are using MLab HomePod, fill in:

  • Host cpu/ram/cuda device: [e.g. intel i9-9820X/32GB/RTX2080ti]
  • Host OS/kernel/GPU driver: [e.g. ubuntu 20.04/5.4.0-74-generic/460.80]
  • MLab HomePod version [e.g. 2.0-pro]

If you are not using MLab HomePod, fill in:

  • Host cpu/ram/cuda device: [e.g. intel i9-9820X/32GB/RTX2080ti]
  • Host OS/kernel/GPU driver: [e.g. ubuntu 20.04/5.4.0-74-generic/460.80]
  • every third-party package (and its version) on the call stack at the time of the error.

Additional context
Any context you think is necessary.

Issues introduced by upstream PyTorch

DeepVAC divides these into two classes:

  • blocking issues;
  • issues that can be worked around.

Blocking issues

  • In DDP mode, training tasks cannot also enable trace and script. Solution: wait for upstream PyTorch to add the feature;
  • Quantization-aware training (QAT) does not support graph mode, so the network must be modified by hand, as described in https://zhuanlan.zhihu.com/p/349019936. Solution: wait for upstream PyTorch to add the feature;
  • A quantized model produced with script_model_dir + static_quantize_dir errors at runtime (trace_model_dir + static_quantize_dir seems fine). Solution: wait for an upstream PyTorch fix;
  • In graph-mode quantization, the emit upsample problem.

Issues that can be worked around

  • static libraries are not installed into the install directory;
  • problems with the nccl_static and kineto libraries;
  • under static compilation, exported symbols cannot include the cuda shared libraries.

TypeError: 'Proxy' object cannot be interpreted as an integer

🐛 Bug

I get an error when:

  • forward contains a loop whose iteration count comes from the input x, and
  • torch.quantization.quantize_fx.prepare_fx is called.

To Reproduce

Steps to reproduce the behavior:

  1. code example
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

# init module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        ...

    def forward(self, x):
        for i in range(x.size(1)):
            x += 1
        return 

torch_model = MyModule().eval()

# fx
s_qconfig_dict = {'': get_default_qconfig("fbgemm")}
prepare_fx(torch_model, s_qconfig_dict)
  2. stack traces
Traceback (most recent call last):
  File "mini_code.py", line 22, in <module>
    prepare_fx(torch_model, s_qconfig_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/quantization/quantize_fx.py", line 392, in prepare_fx
    return _prepare_fx(model, qconfig_dict, prepare_custom_config_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/quantization/quantize_fx.py", line 174, in _prepare_fx
    graph_module = GraphModule(model, tracer.trace(model))
  File "/opt/conda/lib/python3.8/site-packages/torch/fx/symbolic_trace.py", line 571, in trace
    self.create_node('output', 'output', (self.create_arg(fn(*args)),), {},
  File "mini_code.py", line 14, in forward
    for i in range(x.size(1)):
TypeError: 'Proxy' object cannot be interpreted as an integer

Expected behavior

Environment

  • PyTorch Version: 1.9.0
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2

temporary: the only valid use of a module is looking up an attribute but found = prim::SetAttr[name="num_batches_tracked"](%self, %11)

🐞Describe the bug

I got this error while:
nn.BatchNorm2d -> torch.jit.script -> coreml

Trace

Traceback (most recent call last):
  File "mini_code.py", line 13, in <module>
    model = ct.convert(
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 175, in convert
    mlmodel = mil_convert(
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
    proto = mil_convert_to_proto(model, convert_from, convert_to,
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
    return load(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 70, in load
    converter = TorchConverter(torchscript, inputs, outputs, cut_at_symbols)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 145, in __init__
    raw_graph, params_dict = self._expand_and_optimize_ir(self.torchscript)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 262, in _expand_and_optimize_ir
    graph, params = _torch._C._jit_pass_lower_graph(
RuntimeError: 
temporary: the only valid use of a module is looking up an attribute but found  = prim::SetAttr[name="num_batches_tracked"](%self, %11)
:

To Reproduce

import torch
import coremltools as ct

# init torch module
torch_model = torch.nn.BatchNorm2d(3)

# script
script_model = torch.jit.script(torch_model)

# Convert to Core ML using the Unified Conversion API
model = ct.convert(
    script_model,
    inputs=[ct.ImageType(name="input", shape=(1, 3, 224, 224))],
)

System environment (please complete the following information):

  • coremltools version (e.g., 3.0b5): 4.1
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • pytorch version: 1.9.0
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2

Error Code 1: Cask (isConsistent)

question

I get this error while converting a module to tensorrt, where:

  • the module has 5 down-sampling stages;
  • it upsamples after the last down-sampling stage;
  • it uses torch.cat.

To Reproduce

Steps to reproduce the behavior:

  1. code example
import torch
import torch.nn.functional as F

class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        for i in range(1, 6):
            setattr(self, f"down{i}", torch.nn.Conv2d(3, 3, 3, 2, padding=1))

    def forward(self, x):
        x1 = self.down1(x)
        x2 = self.down2(x1)
        x3 = self.down3(x2)
        x4 = self.down4(x3)
        x5 = self.down5(x4)
        return torch.cat([x4, F.interpolate(x5, scale_factor=2)], 1)

torch_model = MyModule()

# torch.onnx.export
torch.onnx.export(torch_model,
    torch.randn(1, 3, 224, 224),
    "./tmp.onnx",
    input_names=["inputs"],
    output_names=["outputs"],
    dynamic_axes={"inputs": {0: "batch", 2: "height", 3: "width"}, "outputs": {0: "batch", 1: "class", 2: "height", 3: "width"}},
    opset_version=11,
    export_params=True)

import os
onnx_file = os.path.join(os.getcwd(), "tmp.onnx")

# onnx -> tensorrt
# !!!
# you should build tensorrt first
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with trt.Builder(TRT_LOGGER) as builder, builder.create_network(1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
    with open(onnx_file, 'rb') as model:
        parser.parse(model.read())

    config = builder.create_builder_config()

    profile = builder.create_optimization_profile()
    profile.set_shape("inputs", (1, 3, 1, 1), (1, 3, 224, 224), (1, 3, 2000, 2000))
    config.add_optimization_profile(profile)

    engine = builder.build_engine(network, config)
    with open("tmp.trt", "wb") as f:
        f.write(engine.serialize())
  2. stack traces
  • sometimes it fails with this:
mini_code.py:54: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: 1: [convolutionBuilder.cpp::createConvolution::184] Error Code 1: Cask (isConsistent)
Traceback (most recent call last):
  File "mini_code.py", line 56, in <module>
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'
  • sometimes it succeeds with this:
mini_code.py:54: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: Max value of this profile is not valid
[TensorRT] WARNING: Min value of this profile is not valid
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1

Expected behavior

Environment

  • TensorRT Version: 8.0.0.3
  • PyTorch Version: 1.9.0
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2

CoreML -- AttributeError: 'torch._C.Node' object has no attribute 'ival'

🐞Describe the bug

I got this error when converting to a CoreML model after torch.jit.freeze.

Trace

Traceback (most recent call last):
  File "mini_code.py", line 15, in <module>
    model = ct.convert(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 175, in convert
    mlmodel = mil_convert(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
    proto = mil_convert_to_proto(model, convert_from, convert_to,
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
    return load(*args, **kwargs)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 70, in load
    converter = TorchConverter(torchscript, inputs, outputs, cut_at_symbols)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 146, in __init__
    self.graph = InternalTorchIRGraph(
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/internal_graph.py", line 241, in __init__
    new_node = InternalTorchIRNode(raw_node, parent=self)
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/internal_graph.py", line 140, in __init__
    self.attr = {
  File "/home/liyang/.local/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/internal_graph.py", line 141, in <dictcomp>
    name: getattr(node, node.kindOf(name))(name)
AttributeError: 'torch._C.Node' object has no attribute 'ival'

To Reproduce

  • If a python script can reproduce the error, please paste the code snippet
import torch
import coremltools as ct

# init maxpool module
torch_model = torch.nn.Conv2d(3, 3, 1, 1)

# Trace with random data
example_input = torch.rand(1, 3, 224, 224) 
trace_model = torch.jit.trace(torch_model, example_input).eval()
freeze_model = torch.jit.freeze(trace_model)

# Convert to Core ML using the Unified Conversion API
model = ct.convert(
    freeze_model,
    inputs=[ct.ImageType(name="input", shape=example_input.shape)], 
)

System environment (please complete the following information):

  • coremltools version (e.g., 3.0b5): 4.1
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • pytorch version: 1.9.0
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2

torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow

🐛 Bug

I get an error when:

  • the forward method contains an if-else, and
  • torch.quantization.quantize_fx.prepare_fx is called.

To Reproduce

Steps to reproduce the behavior:

  1. code example
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

# init module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        ...

    def forward(self, x):
        if x.size(1) != 3:
            return 
        return 

torch_model = MyModule().eval()

# fx
s_qconfig_dict = {'': get_default_qconfig("fbgemm")}
prepare_fx(torch_model, s_qconfig_dict)
  2. stack traces
Traceback (most recent call last):
  File "mini_code.py", line 22, in <module>
    prepare_fx(torch_model, s_qconfig_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/quantization/quantize_fx.py", line 392, in prepare_fx
    return _prepare_fx(model, qconfig_dict, prepare_custom_config_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/quantization/quantize_fx.py", line 174, in _prepare_fx
    graph_module = GraphModule(model, tracer.trace(model))
  File "/opt/conda/lib/python3.8/site-packages/torch/fx/symbolic_trace.py", line 571, in trace
    self.create_node('output', 'output', (self.create_arg(fn(*args)),), {},
  File "mini_code.py", line 14, in forward
    if x.size(1) != 3:
  File "/opt/conda/lib/python3.8/site-packages/torch/fx/proxy.py", line 199, in __bool__
    return self.tracer.to_bool(self)
  File "/opt/conda/lib/python3.8/site-packages/torch/fx/proxy.py", line 129, in to_bool
    raise TraceError('symbolically traced variables cannot be used as inputs to control flow')
torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow

Expected behavior

Environment

  • PyTorch Version: 1.9.0
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2
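A common way to sidestep this (a sketch, not a confirmed upstream fix: `torch.fx` simply cannot trace data-dependent control flow, because during tracing `x` is a `Proxy` that cannot be coerced to bool) is to hoist the shape check out of `forward` and run it eagerly before calling the model. The helper name `run_checked` below is hypothetical:

```python
import torch
import torch.fx

class MyModule(torch.nn.Module):
    def forward(self, x):
        # no `if x.size(1) != 3` here: during symbolic tracing, x is a
        # Proxy, and a Proxy cannot be used as a branch condition
        return x * 2.0

def run_checked(model, x):
    # perform the data-dependent check eagerly, outside the traced graph
    if x.size(1) != 3:
        raise ValueError("expected 3 input channels, got %d" % x.size(1))
    return model(x)

traced = torch.fx.symbolic_trace(MyModule())  # traces without TraceError
```

With the control flow removed from `forward`, `prepare_fx` should be able to trace the module as well.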

void onnxruntime::BroadcastIterator::Init(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 3 by 224

🐛 Bug

I get an error when:

  • using register_buffer in a Module
  • running onnxruntime on an ONNX model produced by torch.onnx.export

To Reproduce

Steps to reproduce the behavior:

  1. code example
import torch

# init module
class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.register_buffer("rb", torch.randn(1, 1, 3, 1, 1)) 
        # you can run with the workaround
        # self.rb = torch.randn(1, 1, 3, 1, 1)

    def forward(self, x):
        x += self.rb[0]
        return x

torch_model = MyModule().eval()

# torch.onnx.export
torch.onnx.export(torch_model,
        torch.randn(1, 3, 224, 224),
        "./tmp.onnx",
        input_names=["inputs"],
        output_names=["outputs"],
        dynamic_axes={"inputs": {0: "batch", 2: "height", 3: "width"}, "outputs": {0: "batch", 1: "class", 2: "height", 3: "width"}},
        opset_version=11,
        export_params=True)

# onnxruntime
import os
import numpy as np
import onnxruntime
from onnxruntime.datasets import get_example

onnx_model = get_example(os.path.join(os.getcwd(), "tmp.onnx"))
sess = onnxruntime.InferenceSession(onnx_model)
inputs = np.random.randn(1, 3, 224, 224).astype(np.float32)
onnx_out = sess.run(None, {"inputs": inputs})
  2. stack traces
Warning: ONNX Preprocess - Removing mutation from node aten::add_ on block input: '0'. This changes graph semantics.
2021-07-07 20:06:52.126767616 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Add node. Name:'Add_0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:497 void onnxruntime::BroadcastIterator::Init(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 3 by 224

Traceback (most recent call last):
  File "mini_code.py", line 37, in <module>
    onnx_out = sess.run(None, {"inputs": inputs})
  File "/opt/conda/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node. Name:'Add_0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:497 void onnxruntime::BroadcastIterator::Init(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 3 by 224

Expected behavior

Environment

  • PyTorch Version: 1.9.0
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2
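The exporter's warning ("Removing mutation from node aten::add_ on block input") suggests that the in-place `+=` on the graph input is what interacts badly with the dynamic axes. Alongside the workaround already noted in the repro comments, here is a minimal out-of-place sketch (an assumption, not a confirmed root-cause fix):

```python
import torch

class MyModuleOutOfPlace(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("rb", torch.randn(1, 1, 3, 1, 1))

    def forward(self, x):
        # out-of-place add: rb[0] has shape (1, 3, 1, 1) and broadcasts
        # against (N, 3, H, W); no aten::add_ mutation on the graph input
        return x + self.rb[0]
```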

CoreML -- TypeError: Unsupported numpy type: float32

🐞Describe the bug

I got this error when converting a model to Core ML with numpy >= 1.20

Trace

Traceback (most recent call last):
  File "mini_code.py", line 14, in <module>
    model = ct.convert(
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 175, in convert
    mlmodel = mil_convert(
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
    proto = mil_convert_to_proto(model, convert_from, convert_to,
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
    return load(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 83, in load
    raise e
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 73, in load
    prog = converter.convert()
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 223, in convert
    const = mb.const(val=val, mode=mode, name=name)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/ops/registry.py", line 62, in add_op
    return cls._add_op(op_cls, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/builder.py", line 189, in _add_op
    new_op.type_value_inference()
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/operation.py", line 240, in type_value_inference
    output_types = self.type_inference()
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/ops/defs/control_flow.py", line 140, in type_inference
    builtin_type, _ = self._get_type_val(self.val.val)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/ops/defs/control_flow.py", line 180, in _get_type_val
    _, builtin_type = numpy_val_to_builtin_val(value)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/types/type_mapping.py", line 262, in numpy_val_to_builtin_val
    builtintype = numpy_type_to_builtin_type(npval.dtype)
  File "/opt/conda/lib/python3.8/site-packages/coremltools/converters/mil/mil/types/type_mapping.py", line 232, in numpy_type_to_builtin_type
    raise TypeError("Unsupported numpy type: %s" % (nptype))
TypeError: Unsupported numpy type: float32

To Reproduce

import torch
import coremltools as ct

# init maxpool module
torch_model = torch.nn.Conv2d(3, 3, 1, 1)

# Trace with random data
example_input = torch.rand(1, 3, 224, 224) 
trace_model = torch.jit.trace(torch_model, example_input).eval()

# Convert to Core ML using the Unified Conversion API
model = ct.convert(
    trace_model,
    inputs=[ct.ImageType(name="input", shape=example_input.shape)], 
)

System environment (please complete the following information):

  • coremltools version (e.g., 3.0b5): 4.1
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • pytorch version: 1.9.0
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2
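The traceback points at coremltools 4.1 failing to map a numpy dtype to a builtin type; the remedies commonly reported for this class of error are pinning numpy below 1.20 or upgrading coremltools (both are assumptions here, not an official fix). A small guard before calling `ct.convert` makes the mismatch explicit:

```python
import numpy as np

# coremltools 4.1 reportedly cannot map numpy >= 1.20 dtypes; warn before
# attempting the conversion (the version threshold is an assumption)
major, minor = (int(v) for v in np.__version__.split(".")[:2])
numpy_too_new = (major, minor) >= (1, 20)
if numpy_too_new:
    print("numpy %s may break coremltools 4.1; "
          "try `pip3 install 'numpy<1.20'` or upgrade coremltools"
          % np.__version__)
```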

[BUG] Where is the quantization-aware training code?

Bug description
Please describe the bug.

How to reproduce
Steps to reproduce:

  1. step 1 xxxx
  2. step 2 xxxx
  3. step 3 xxxx
  4. See, there is the error......

Expected result
Please describe what you originally expected to happen.

Screenshots
If necessary, please add screenshots.

If you are using MLab HomePod, please fill in

  • host cpu/ram/cuda device: [e.g. intel i9-9820X/32GB/RTX2080ti]
  • host OS/kernel version/GPU driver: [e.g. ubuntu 20.04/5.4.0-74-generic/460.80]
  • MLab HomePod version [e.g. 2.0-pro]

If you are not using MLab HomePod, please fill in

  • host cpu/ram/cuda device: [e.g. intel i9-9820X/32GB/RTX2080ti]
  • host OS/kernel version/GPU driver: [e.g. ubuntu 20.04/5.4.0-74-generic/460.80]
  • all third-party packages, with their versions, used in the call stack at the time of the error.

Additional context
Any other context you think is necessary.

Demo Projects Behind This Repo

Hi! I really appreciate this work being available online. I tried to run the DeepVAC/yolov5 project in your HomePod container. However, I found that the DeepVAC/yolov5 code seems outdated with respect to this repo (the module abstractions are not the same, and some classes' names and locations have changed). Do you have any plan to keep the demo projects in the DeepVAC space up to date so that beginners can start learning DeepVAC easily? Many thanks.

NCNN: Unsupported unsqueeze axes !

🐛 Bug

I get this error while converting the yolov5 Focus module to NCNN

To Reproduce

Steps to reproduce the behavior:

  1. code example
import torch

# init module
class Focus(torch.nn.Module):
    def __init__(self):
        super(Focus, self).__init__()
        ...

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
        return x

torch_model = Focus().eval()

# torch.onnx.export
torch.onnx.export(torch_model,
        torch.randn(1, 3, 224, 224),
        "./tmp.onnx",
        input_names=["inputs"],
        output_names=["outputs"],
        dynamic_axes={"inputs": {0: "batch", 2: "height", 3: "width"}, "outputs": {0: "batch", 1: "class", 2: "height", 3: "width"}},
        opset_version=11,
        export_params=True)

# onnx simplify
import os
import onnx
from onnxsim import simplify

onnx_file = os.path.join(os.getcwd(), "tmp.onnx")
model_op, check_ok = simplify(onnx_file, 
        check_n=3, 
        perform_optimization=True, 
        skip_fuse_bn=True,  
        skip_shape_inference=False, 
        input_shapes={"inputs": (1, 3, 224, 224)}, 
        skipped_optimizers=None, 
        dynamic_input_shape=True)
onnx.save(model_op, "./tmp.onnx")

# onnx -> ncnn
# !!!
# you should build onnx2ncnn binary file first
os.system("/bin/onnx2ncnn {} tmp.param tmp.bin".format(onnx_file))
  2. stack traces
Checking 0/3...
Checking 1/3...
Checking 2/3...
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !
Unsupported slice step !

Expected behavior

Environment

  • PyTorch Version: 1.9.0
  • OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
  • How you install python (anaconda, virtualenv, system): miniconda
  • python version (e.g. 3.7): 3.8.5
  • any other relevant information:
    • gpu: GeForce GTX 1650
    • driver: Driver Version: 460.80
    • CUDA: CUDA Version: 11.2
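`onnx2ncnn` rejects the strided slices that Focus uses ("Unsupported slice step"), so a common workaround is to express the same space-to-depth rearrangement with reshape/permute, which export cleanly. Note this is a sketch of the idea rather than a drop-in replacement: the resulting channel order differs from the `torch.cat` version, so the weights of the following convolution would need a matching channel permutation.

```python
import torch

class FocusReshape(torch.nn.Module):
    # x(b,c,h,w) -> y(b,4c,h/2,w/2) without strided slicing
    def forward(self, x):
        b, c, h, w = x.shape
        x = x.reshape(b, c, h // 2, 2, w // 2, 2)
        x = x.permute(0, 3, 5, 1, 2, 4)  # (b, 2, 2, c, h/2, w/2)
        return x.reshape(b, 4 * c, h // 2, w // 2)
```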
