openppl-public / ppq



PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

License: Apache License 2.0

Python 92.45% C++ 4.35% C 0.12% Cuda 3.04% CMake 0.05%

neural-network deep-learning quantization pytorch caffe onnx cuda open-source


ppq's Issues

module 'numpy' has no attribute 'asscalar'

File "/usr/local/lib/python3.8/dist-packages/ppq-0.6.4-py3.8.egg/ppq/parser/util.py", line 27, in convert_value
value = np.asscalar(value[0])
File "/usr/local/lib/python3.8/dist-packages/numpy/__init__.py", line 311, in __getattr__
raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'asscalar'

numpy version 1.23.0
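
A possible workaround, sketched under the assumption that `convert_value` only needs a plain Python scalar: `np.asscalar` was deprecated in NumPy 1.16 and is no longer available in 1.23.0, the version reported here, and `ndarray.item()` is its documented replacement. The helper name `to_python_scalar` is hypothetical and not part of PPQ.

```python
import numpy as np


def to_python_scalar(value):
    """Return a plain Python scalar for a NumPy scalar or one-element array.

    Hypothetical stand-in for the removed np.asscalar call.
    """
    # np.asarray also accepts plain Python numbers, so .item() is always available.
    return np.asarray(value).item()


values = np.array([7], dtype=np.int64)
print(to_python_scalar(values[0]))  # 7
```

Pinning NumPy below 1.23 should also avoid the error, at the cost of staying on an older release.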

How to export ONNX and save the quantized ONNX model?

Is there a runnable example?

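Can a standalone BN be quantized?

Hmm, sorry to bother you again. While testing a model, I ran into this computation graph:

...--->Add--->BN--->ReLU--->...

After quantization I checked the JSON file, and this BN does not appear to have been quantized. Can BN be quantized, or should a model not contain a standalone BN like this in the first place?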

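Clip operator is exported in the wrong format

The floating-point model being quantized uses opset 9. The model PPQ exports after quantization is opset 13, but its Clip operator does not follow the opset 13 operator format.

Model link:
Link: https://pan.baidu.com/s/1LarJMbi-d0K0JG0Lrs30Fw
Extraction code: 8888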

Graph output of export_ppq_graph is not quantized

Hello, great work!

I try to quantize some ONNX models with ppq. The ONNX model output from export_ppq_graph is not quantized at all, but I found some key parameters for quantization such as scales and zero points for different layers saved in JSON. Is this bug or feature of PPQ? How can I get the quantized ONNX model?

conv pads cause forward error

Model link: https://pan.baidu.com/s/1IU1KdcSc2Ssxs3zjXY1IHQ
Extraction code: 8888

In the ONNX model, "auto_pad" is SAME_LOWER. The corresponding ONNX pads value is [1,1,0,0], in the format [x1_begin, x2_begin, x1_end, x2_end], but this cannot be mapped to the conv operator of PyTorch.
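
The mismatch can be illustrated with a small sketch. The `padding` argument of `torch.nn.Conv2d` pads both sides of an axis equally, so asymmetric ONNX pads such as [1,1,0,0] usually have to be applied as an explicit padding step (for example `torch.nn.functional.pad`) before a convolution with zero padding. The helper below only reorders the values into the last-axis-first layout that `torch.nn.functional.pad` expects; its name is hypothetical and it is not taken from PPQ.

```python
from typing import List, Sequence


def onnx_pads_to_torch_pad(pads: Sequence[int]) -> List[int]:
    """Reorder ONNX pads [x1_begin, x2_begin, ..., x1_end, x2_end] into the
    (last_axis_begin, last_axis_end, ..., first_axis_begin, first_axis_end)
    layout used by torch.nn.functional.pad."""
    if len(pads) % 2 != 0:
        raise ValueError("ONNX pads must hold one begin and one end value per axis")
    half = len(pads) // 2
    begins, ends = pads[:half], pads[half:]
    torch_pad: List[int] = []
    # F.pad starts from the last spatial axis, so walk the axes in reverse.
    for begin, end in zip(reversed(begins), reversed(ends)):
        torch_pad.extend([begin, end])
    return torch_pad


print(onnx_pads_to_torch_pad([1, 1, 0, 0]))  # [1, 0, 1, 0]
```

With the reordered list, the input would be padded first and the convolution itself run with no padding.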

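Export problem after PPL_DSP_INT8 quantization

In GPU mode, when running RetinaFace (with a ResNet50 backbone), quantization completes successfully, but export fails with TypeError: Cannot convert Resize_133 to caffe op. Debugging shows this happens because the check at line 439 of ppq/parser/caffe/caffe_export_utils.py is not satisfied.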

'utf-8' codec can't decode byte 0xb4 in position 2833

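Hello, I tried this on my own computer with a GPU and got the error 'utf-8' codec can't decode byte 0xb4 in position 2833 at line 28 of ppq\core\ffi.py. I have been debugging for a long time without success, which is frustrating. What causes this?

I am using Visual Studio 2019.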
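
One plausible cause, offered as an assumption rather than a diagnosis: on a Chinese-locale Windows machine the compiler may print messages in GBK, and those bytes are then decoded as UTF-8. The byte 0xb4 is a typical first byte of a GBK-encoded Chinese character and is not a valid UTF-8 start byte. The sketch below shows the failure and a tolerant decoding fallback; `decode_tool_output` is a hypothetical helper, not a PPQ or PyTorch function.

```python
from typing import Sequence


def decode_tool_output(raw: bytes, encodings: Sequence[str] = ("utf-8", "gbk")) -> str:
    """Try each encoding in turn; fall back to lossy UTF-8 decoding if none fits."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")


# GBK bytes of a two-character Chinese word meaning "error"; the first byte is 0xb4.
sample = b"\xb4\xed\xce\xf3"
try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 failed:", exc.reason)
print(decode_tool_output(sample))
```

If this is the cause, switching the compiler or console output to English or UTF-8 would be another way around it.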

cos error

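Is mixed-precision quantization supported? Does it take actual on-board inference speed into account? Could you recommend some papers or methods that really work and are easy to use?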

10 : INVALID_GRAPH : Load model from quant_atom/Output/quantized(onnx).onnx.onnx failed:This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (onnx::Conv_322) of operator (QLinearConv) in node (Conv_17) is invalid.

It seems the graph exported by the new ORTExporter does not handle this case:

It goes straight from float32 into QLinearConv.

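How should TargetPlatform be understood?

First of all, thank you for open-sourcing your work on quantization!
Let me describe my understanding first: this variable should indicate the target platform I want to deploy to. Under that reading, I can see CUDA / NXP / DSP / SNPE as different hardware from different vendors, and even the data types that follow (FP32 / FP16 / INT8 / INT4) as different hardware sub-platforms. The only thing that confuses me is: why distinguish PPL_CUDA_INT8 from TRT_INT8? Don't both target GPUs? Is the implication that one uses PPQ's built-in quantization algorithm and directly produces an engine model for GPU deployment, while the other uses TensorRT's own quantization algorithm?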

Do the TargetPlatform enum values have any specific meaning?

For example:

ppq/ppq/core/quant.py

Line 99 in d7097a7

ACADEMIC_INT8 = 10081

Regarding the 10081 here: when adding a new platform, can I pick any value that has not been used before?

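Squeeze operator: forward is wrong when the axes parameter is a list

When the axes of Squeeze is a list, PPQ takes axes = axes[0], but the output shape then differs from that of the actual ONNX operator, which causes errors in downstream operators. The same problem may also exist in the Unsqueeze operator.

For now I work around it with a for loop.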

slice op error when axis = -1

The following error occurs when quantizing an ONNX model with PPQ:

"ppq/executor/op/torch/default.py" , line 959
    new_axes = [ x if x >= 0 else len(data.dim()) + x for x in axes]
TypeError: object of type 'int' has no len()

It seems to be triggered when the axis of Slice is -1.
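
The traceback points at `len(data.dim())`: `torch.Tensor.dim()` already returns an int, so calling `len()` on it raises the TypeError as soon as a negative axis reaches that branch. A minimal sketch of the intended normalization, written on plain integers so it runs without torch (the function name is hypothetical):

```python
from typing import List, Sequence


def normalize_axes(axes: Sequence[int], rank: int) -> List[int]:
    """Map negative axis indices to their non-negative equivalents."""
    normalized = []
    for axis in axes:
        if not -rank <= axis < rank:
            raise ValueError(f"axis {axis} is out of range for rank {rank}")
        # rank plays the role of data.dim(); no len() is needed.
        normalized.append(axis if axis >= 0 else rank + axis)
    return normalized


print(normalize_axes([-1], rank=4))  # [3]
```

In the quoted line this would correspond to replacing `len(data.dim()) + x` with `data.dim() + x`.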

RuntimeError: Error happens when dealing with operation Conv_0

RuntimeError: Error happens when dealing with operation Conv_0(TargetPlatform.NXP_INT8) - inputs:['input', '39', '40'], outputs:['38']

For errors like the one above, where should I start debugging?

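Upsample does not seem to support quantization; ConvTranspose does not seem to support BN folding

a. When running an ONNX test model, I get an error that the Upsample operator has no backend on the target platform. It seems PPQ does not yet support quantizing Upsample. Might it be supported in the future?

b. When running another ONNX test model, the model contains this computation graph:
...-->ConvTranspose-->BatchNorm-->ReLU-->...
and PPQ reports that the ConvTranspose operator cannot be folded with BN.

c. One more question: if the computation graph is
...-->BatchNorm-->Conv-->ReLU-->...
can it be folded?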

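With ONNX model input, when shape-changing operators such as reshape / flatten have fixed shape parameters, the calibration pass easily fails on batched input

This may be a fairly general problem: some ONNX models submitted for deployment have been through shape inference, simplification, or some other processing, so the shape parameter fed to operators such as reshape is fixed. As a result, during the calibration pass the torch executor fails when the input is a batch.

Take this model as an example: the shape of the Reshape_71 operator is fixed.

Unsurprisingly, with batch_size = 32, the 1*2*48*34*60*32 = 6266880 elements cannot be reshaped to [1,2,48,34,60].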

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ppq/executor/torch.py", line 359, in __forward
    outputs = operation_forward_func(operation, inputs, self._executing_contenxt)
  File "/usr/local/lib/python3.6/dist-packages/ppq/executor/op/torch/default.py", line 458, in Reshape_forward
    return data.reshape(shape)
RuntimeError: shape '[1, 2, 48, 34, 60]' is invalid for input of size 6266880

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ppq-entrance.py", line 66, in <module>
    device=DEVICE, verbose=0)
  File "/usr/local/lib/python3.6/dist-packages/ppq/core/defs.py", line 65, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/api/interface.py", line 267, in quantize_onnx_model
    collate_fn=collate_fn
  File "/usr/local/lib/python3.6/dist-packages/ppq/core/defs.py", line 65, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/quantization/quantizer/base.py", line 74, in quantize
    **kwargs
  File "/usr/local/lib/python3.6/dist-packages/ppq/quantization/optim/base.py", line 95, in optimize
    optimization_pass.apply(processer=processer, dataloader=dataloader, executor=executor, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/quantization/optim/base.py", line 30, in apply
    self.optimize(processer, dataloader=dataloader, executor=executor, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/core/defs.py", line 65, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/quantization/optim/calibration.py", line 117, in optimize
    executor=executor, hooks=hooks, output_names=None)
  File "/usr/local/lib/python3.6/dist-packages/ppq/quantization/optim/calibration.py", line 59, in calibrate
    output_names=output_names)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ppq/executor/torch.py", line 231, in forward
    hooks=hooks
  File "/usr/local/lib/python3.6/dist-packages/ppq/executor/torch.py", line 387, in __forward
    raise RuntimeError(f'Error happens when dealing with operation {str(operation)}')
RuntimeError: Error happens when dealing with operation Reshape_71(TargetPlatform.FP32) - inputs:['708', '1894'], outputs:['720']
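
The arithmetic behind this failure can be checked with a short sketch. It also shows one commonly used workaround, which is an assumption here and not something PPQ is known to do: letting the leading dimension of the target shape be inferred with -1. Both function names are hypothetical.

```python
import math
from typing import List, Sequence


def reshape_fits(numel: int, shape: Sequence[int]) -> bool:
    """Return True if a tensor with `numel` elements can take `shape`.
    A single -1 entry is treated as a dimension to be inferred."""
    known = math.prod(d for d in shape if d != -1)
    if list(shape).count(-1) == 1:
        return known > 0 and numel % known == 0
    return numel == known


def relax_batch_dim(shape: Sequence[int]) -> List[int]:
    """Hypothetical workaround: let the leading (batch) dimension be inferred."""
    return [-1, *shape[1:]]


fixed_shape = [1, 2, 48, 34, 60]
numel = 32 * math.prod(fixed_shape)                       # 6266880 elements for a batch of 32
print(reshape_fits(numel, fixed_shape))                   # False
print(reshape_fits(numel, relax_batch_dim(fixed_shape)))  # True
```

Calibrating with batch size 1, or re-exporting the model with a dynamic batch axis, would avoid the mismatch without touching the graph.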

Error: __CUDA_EXTENTION__

NameError: name '__CUDA_EXTENTION__' is not defined
How can I solve this?

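Is manipulating the batch dimension supported?

Hello, I have a segmentation model with input [15,3,512,512]. Running ProgramEntrance.py reports "Error happens when dealing with operation Transpose_93(TargetPlatform.UNSPECIFIED)". This Transpose swaps dimension 0 and dimension 1. Is such an operation supported?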

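How do I use quantization-aware training (QAT)?

Hello, thank you very much for this excellent open-source project. I have been wanting to study it recently. I went through the documentation but could not find anything on quantization-aware training (I may have overlooked it). If I missed it, could you point me to where it is documented? If it really does not exist, I would like to hear your thoughts on it. Thanks.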

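Discussion of the QuantizationStates.PASSIVE_INIT quantization configuration

I would like to ask what this setting is for: base_quant_config.input_quantization_config[-1].state = QuantizationStates.PASSIVE_INIT

When this statement is executed, that is, when PASSIVE_INIT takes effect, the quantization noise introduced in the end is very severe; without it, there is almost no quantization noise. Why is that?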

code implementation error

Hi, I found an error in https://github.com/openppl-public/ppq/blob/df53c934748cb0ded7bdd0089398f3053265dcc5/ppq/quantization/algorithm/equalization.py#L328

It should be torch.mean(torch.square(params), axis=aggerate_axis).

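Storage of shared weights in RetinaNet

In the head of the RetinaNet model, the conv layers share weights. PPQ stores them separately on export, so the exported model is 100M larger than the original.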

RuntimeError: Error building extension 'PPQ_Cuda_Impls': [1/6] :/usr/local/cuda-10.2/bin/nvcc

                           !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (c++) is not compatible with the compiler Pytorch was
built with for this platform, which is g++ on linux. Please
use g++ to to compile your extension. Alternatively, you may
compile PyTorch from source using c++, and then you can also use
c++ to compile your extension.

See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
with compiling PyTorch from source.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                          !! WARNING !!

warnings.warn(WRONG_COMPILER_WARNING.format(
Traceback (most recent call last):
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
subprocess.run(
File "/root/anaconda3/envs/pytorch/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/core/ffi.py", line 16, in <module>
__CUDA_EXTENTION__ = load(
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1124, in load
return jit_compile(
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1337, in jit_compile
write_ninja_file_and_build_library(
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1449, in write_ninja_file_and_build_library
run_ninja_build(
File "/root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1733, in run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'PPQ_Cuda_Impls': [1/6] :/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/linear.cu -o linear.cuda.o
FAILED: linear.cuda.o
:/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/linear.cu -o linear.cuda.o
/bin/sh: :/usr/local/cuda-10.2/bin/nvcc: No such file or directory
[2/6] :/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/sort.cu -o sort.cuda.o
FAILED: sort.cuda.o
:/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/sort.cu -o sort.cuda.o
/bin/sh: :/usr/local/cuda-10.2/bin/nvcc: No such file or directory
[3/6] :/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/train.cu -o train.cuda.o
FAILED: train.cuda.o
:/usr/local/cuda-10.2/bin/nvcc -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++14 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cuda/train.cu -o train.cuda.o
/bin/sh: :/usr/local/cuda-10.2/bin/nvcc: No such file or directory
[4/6] c++ -MMD -MF export.o.d -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/export.cc -o export.o
FAILED: export.o
c++ -MMD -MF export.o.d -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/export.cc -o export.o
c++: error: unrecognized command line option ‘-std=c++14’
[5/6] c++ -MMD -MF hist_mse.o.d -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cpu/hist_mse.cc -o hist_mse.o
FAILED: hist_mse.o
c++ -MMD -MF hist_mse.o.d -DTORCH_EXTENSION_NAME=PPQ_Cuda_Impls -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/TH -isystem /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/include/THC -isystem :/usr/local/cuda-10.2/include -isystem /root/anaconda3/envs/pytorch/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /root/anaconda3/envs/pytorch/lib/python3.8/site-packages/ppq-0.6.5-py3.8.egg/ppq/csrc/cpu/hist_mse.cc -o hist_mse.o
c++: error: unrecognized command line option ‘-std=c++14’
ninja: build stopped: subcommand failed.
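
The log above shows two separate failures. The nvcc path is printed as `:/usr/local/cuda-10.2/bin/nvcc`, with a stray leading colon, and the host compiler `c++` rejects `-std=c++14`, which suggests it is too old to know that flag. A likely source of the first problem, stated here as an assumption, is a CUDA location variable such as CUDA_HOME that was set like a PATH entry. The sketch below is a hypothetical pre-flight check, not part of PPQ or PyTorch.

```python
from typing import List, Mapping


def cuda_env_problems(env: Mapping[str, str]) -> List[str]:
    """Report CUDA location variables whose value looks like a malformed path."""
    problems = []
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(name)
        if value is None:
            continue
        # A single directory is expected; a leading or trailing list separator
        # ends up inside the nvcc path, as seen in the log above.
        if value != value.strip() or value.startswith((":", ";")) or value.endswith((":", ";")):
            problems.append(f"{name}={value!r} should be a single directory")
    return problems


print(cuda_env_problems({"CUDA_HOME": ":/usr/local/cuda-10.2"}))
```

In practice the function would be called with `os.environ`. The second failure would need a newer g++ to be the compiler that `c++` resolves to.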

Problem when loading data

ValueError: cannot reshape array of size 256269 into shape (1,3,480,480)
The images were prepared at 800*1280, and this error appears when loading the data. How can I solve it?

A bug when exporting a file

When I export with platform=TargetPlatform.ONNXRUNTIME I get the error below, but platform=TargetPlatform.ONNX works fine.

Network quantization finished, generating target files:
Traceback (most recent call last):
File "/tmp/pycharm_project_280/ppq/programeetrance.py", line 188, in <module>
config_save_to=os.path.join(WORKING_DIRECTORY, 'quant_cfg.json'))
File "/home/wyy/anaconda3/envs/py_37/lib/python3.7/site-packages/ppq/api/interface.py", line 610, in export_ppq_graph
exporter.export(file_path=graph_save_to, config_path=config_save_to, graph=graph, **kwargs)
File "/home/wyy/anaconda3/envs/py_37/lib/python3.7/site-packages/ppq/parser/onnxruntime_exporter.py", line 366, in export
graph = self.prepare_graph(graph)
File "/home/wyy/anaconda3/envs/py_37/lib/python3.7/site-packages/ppq/parser/onnxruntime_exporter.py", line 361, in prepare_graph
quant_param_to_int=quant_parameter_to_int)
File "/home/wyy/anaconda3/envs/py_37/lib/python3.7/site-packages/ppq/parser/onnxruntime_exporter.py", line 315, in convert_operation
graph=graph, var=var, config=config, related_op=op, meta=meta)
File "/home/wyy/anaconda3/envs/py_37/lib/python3.7/site-packages/ppq/parser/onnxruntime_exporter.py", line 74, in insert_quant_on_variable
scale = convert_any_to_torch_tensor(config.scale.clone(), dtype=torch.float32)
AttributeError: 'NoneType' object has no attribute 'clone'

Resize forward has a problem

In Resize_forward in ppq/executor/op/torch/default.py, if len(values) == 2 then scales = None, so the check if scales.numel() == 1 raises an error.

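TF graph support

Hello, a question: PPQ currently targets torch and the corresponding ONNX support. To quantize a TensorFlow model (pb), do I need to convert it to ONNX, or do you have other recommendations?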

RuntimeError of Shape op during Calibration dataset progress and finetune progress

Configuration:

TARGET_PLATFORM = TargetPlatform.NXP_INT8 # choose your target platform
MODEL_TYPE = NetworkFramework.ONNX # or NetworkFramework.CAFFE
INPUT_LAYOUT = 'chw' # input data layout, chw or hwc
NETWORK_INPUTSHAPE = [16, 1, 40, 61] # input shape of your network
CALIBRATION_BATCHSIZE = 16 # batchsize of calibration dataset
EXECUTING_DEVICE = 'cuda' # 'cuda' or 'cpu'.
REQUIRE_ANALYSE = True
DUMP_RESULT = False

SETTING = UnbelievableUserFriendlyQuantizationSetting(
    platform = TARGET_PLATFORM, finetune_steps = 2500,
    finetune_lr = 1e-3, calibration = 'percentile',
    equalization = True, non_quantable_op = None)
dataloader = DataLoader(
    dataset=calibration_dataset,
    batch_size=32, shuffle=True)
quantized = quantize(
    working_directory=WORKING_DIRECTORY, setting=SETTING,
    model_type=MODEL_TYPE, executing_device=EXECUTING_DEVICE,
    input_shape=NETWORK_INPUTSHAPE, target_platform=TARGET_PLATFORM,
    dataloader=dataloader, calib_steps=250)

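Problem description:

At iteration 213 the Shape operator reported the error above. After checking, I found that this iteration had batch size = 19. I printed logs inside the dataloader iterator and confirmed that this finetune batch really contained only 19 samples. It turned out that the dataset had been traversed exactly once at iteration 213.
I then set both the finetune steps and calib_steps to 100 and adjusted the calibration dataset to 32*100 samples, after which it ran normally.
The model file is below: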
model.zip
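
The short final batch described here is what a loader produces when the dataset size is not a multiple of the batch size. `torch.utils.data.DataLoader` has a `drop_last=True` option that skips such a batch. The sketch below reproduces the effect in plain Python; the dataset size of 32 * 212 + 19 samples is an assumption chosen to match the reported numbers, and `iter_batches` is a hypothetical helper.

```python
from typing import Iterator, List, Sequence


def iter_batches(samples: Sequence, batch_size: int, drop_last: bool) -> Iterator[List]:
    """Yield consecutive batches; optionally skip a short final batch."""
    for start in range(0, len(samples), batch_size):
        batch = list(samples[start:start + batch_size])
        if drop_last and len(batch) < batch_size:
            return
        yield batch


samples = list(range(32 * 212 + 19))  # assumed size: 212 full batches plus 19 leftovers
sizes = [len(b) for b in iter_batches(samples, 32, drop_last=False)]
print(len(sizes), sizes[-1])          # 213 19
sizes = [len(b) for b in iter_batches(samples, 32, drop_last=True)]
print(len(sizes), sizes[-1])          # 212 32
```

Dropping the last batch keeps every batch at the size the graph was traced with, which is the same effect the reporter achieved by trimming the dataset to 32*100 samples.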

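Quantized model converted to a TensorRT INT8 engine: inference results do not match

Hello,
I am trying to use PPQ quantization to obtain a TensorRT INT8 model. I found that when the model is fairly large, converting the QDQ ONNX model to TRT INT8 seems to have a problem (the outputs do not align). Specifically, a small model such as mnist aligns (error on the order of 1e-7), while a somewhat larger model such as resnet50 shows a large error.
I am not sure whether I did something wrong. My current analysis leans toward the TensorRT conversion introducing the error, so I filed an issue in the TensorRT repo, see NVIDIA/TensorRT#2103.
Have you run into a similar problem? Thanks!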

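SNPE quantization question

As long as the conversion to ONNX succeeds, can PPQ quantize the model normally regardless of whether the original model was Caffe or PyTorch?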

The Slice operator is not supported for opset 9 and below.

https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Slice-1

https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Slice-10

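About inference of quantized operators

I would like to know how the forward pass of a quantized operator is computed, and where the code for it is. For example, the uint8 conv forward? I did not see a corresponding implementation in conv_forward() in ppq/executor/op/torch/default.py.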

PPQ can not complie cuda extensions, please check your compiler and system environment, PPQ will disable CUDA KERNEL for now.

RTX2080Ti
Python 3.8.13
ninja 1.5.1
ppq 0.6.4
PyTorch 1.12.0
tensorrt 8.4.1.5
export PATH=/usr/local/cuda-11.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH

This message is printed when I import ppq. Could you please give some advice? @zchrissirhcz @ouonline

Bug Report

Checking out 08dc0f8b10ecc8f41e52d7a0d4e7b5dc89a92f66 produces an error.

2022-06-05 18:02:30,982 - mmdeploy - ERROR - name 'NCNNRequantizePass' is not defined
2022-06-05 18:02:30,982 - mmdeploy - ERROR - onnx2ncnn_quant_table failed.

Checking out 54c0e3f6f7f469a1a184f54c8c565d93777c6e74 works fine. Please add more CI.

Does PPQ currently support ONNX models with dynamic inputs?

Model inference time increases after INT8 quantization

My device is an i7-8750H.

Start Benchmark with openvino (Batchsize = 1)
Time span (FP32 MODE): 68.0568 sec
Time span (INT8 MODE): 85.6443 sec

I don't know what happened. How can I get a speedup after INT8 quantization in OpenVINO?

Here are the download links for my ONNX files:

Link: https://pan.baidu.com/s/1QUhs5wY1fsOVlzsCbsx7aw Extraction code: tna9

Link: https://pan.baidu.com/s/1DHXTRxBGcPXpOAkPqZo0BQ Extraction code: tc2j

Thank you!

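Bug at line 125 of scheduler/dispatcher.py

Great project! However, I get an error when running the efficientnet-lite4-11.onnx model from the official ONNX model zoo. The error is at line 125 of scheduler/dispatcher.py. My analysis of the cause is as follows:

1. The model's graph contains this flow: ···-->Conv-->BN-->Clip-->···. PPQ fuses ConvBN by default, but the operation produced by the fusion is appended to the end of graph.operations.
2. When a platform is bound to Clip, the statement at line 125 of scheduler/dispatcher.py is executed.

Combining 1 and 2, the dispatching_table has no information about the ConvBN operation at that point, which causes the error. It is an ordering problem; I will leave it to the author to decide how best to fix it.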

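In the torch executor, the Resize_forward implementation seems wrong when sizes is not specified and scales with more than one element is used

As the code snippet below shows, this concerns a resize operation that specifies its output size through scales rather than sizes. If the scales passed in has only one element there is no problem, but when scales.numel() > 1, line 1089 takes only scales[-2] and passes it to torch.nn.functional.interpolate, so the scale_factor for the other dimension is never used. As a result, subsequent concat-like operations easily fail because the dimensions are wrong.

Line 1088 already checks scales.numel() % 2 == 0, so I suspect this is actually a scales[-2:] -> scales[-2] typo?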

ppq/ppq/executor/op/torch/default.py

Lines 1083 to 1089 in 0fdea7d

    
    if sizes is None or len(sizes) == 0:
        sizes = None
        if scales.numel() == 1:
            scales = scales.item()
        else:
            assert scales.numel() % 2 == 0
            scales = scales[-2].cpu().numpy().tolist()
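
A minimal sketch of the selection logic the reporter seems to expect, written on plain Python lists so it runs without torch. It assumes a 4-D NCHW input whose last two scale entries are the spatial factors; `select_scale_factor` is a hypothetical name and this is not the PPQ implementation. It also guards against scales being None, so a missing input produces a clear error instead of an attribute error.

```python
from typing import List, Optional, Sequence, Union


def select_scale_factor(scales: Optional[Sequence[float]]) -> Union[float, List[float]]:
    """Pick the value to pass as scale_factor when sizes is not given."""
    if scales is None or len(scales) == 0:
        raise ValueError("Resize needs either sizes or scales")
    if len(scales) == 1:
        return float(scales[0])
    if len(scales) % 2 != 0:
        raise ValueError("expected an even number of scale entries")
    # Keep both spatial factors (scales[-2:]), not only the first of them.
    return [float(s) for s in scales[-2:]]


print(select_scale_factor([2.0]))                 # 2.0
print(select_scale_factor([1.0, 1.0, 2.0, 3.0]))  # [2.0, 3.0]
```

`torch.nn.functional.interpolate` accepts either a single float or one factor per spatial dimension for `scale_factor`, so both return shapes can be passed on directly.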

Which TargetPlatform should PPQ use for domestic (Chinese) GPU platforms?

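Is converting quantization-aware-trained models for MNN deployment supported?

When I use MNN for QAT, the model does not converge. Can your framework support quantization-aware training with conversion to MNN?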

ValueError: Input at [1] of Operation [ScatterND_270] deploy with incorrect device cuda:0

Compling CUDA Kernels. Please wait...
Traceback (most recent call last):
File "C:\Users\admin\AppData\Roaming\Python\Python36\site-packages\ppq\executor\torch.py", line 366, in __forward
outputs = operation_forward_func(operation, inputs, self._executing_context)
File "C:\Users\admin\AppData\Roaming\Python\Python36\site-packages\ppq\executor\op\torch\default.py", line 1421, in ScatterND_forward
ASSERT_ALL_TENSORS_AT_CPU(op=op, values=[None, values[1], None])
File "C:\Users\admin\AppData\Roaming\Python\Python36\site-packages\ppq\executor\op\torch\base.py", line 37, in ASSERT_ALL_TENSORS_AT_CPU
f'Input at [{idx}] of Operation [{op.name}] deploy with incorrect device {tensor.device}, '
ValueError: Input at [1] of Operation [ScatterND_270] deploy with incorrect device cuda:0, which is not supposed to happen in PPQ execution system. This is a critical system failure, you can set ppq.core.config.force_convert as True to force convert those values, which might be able to continue executing your graph. YOU ARE RECOMMEND TO REPORT THIS FAILURE TO US.

How do I check the PPQ version in a Python environment? import ppq, ppq.version does not work.

As in the title.
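
One way that does not depend on the package exposing a version attribute is to ask the installed package metadata, using the standard library `importlib.metadata` (Python 3.8 and later). The distribution name "ppq" is taken from the egg paths in the tracebacks above; the helper name is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("ppq"))
```

From a shell, `pip show ppq` reports the same metadata.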

how to load onnx/json into executor??

how to load quantized onnx and json into TorchExecutor?

Make the code compatible with flake8 style

AssertionError: Torch not compiled with CUDA enabled

Why does running PPQ locally on CPU report CUDA-related compilation errors? How can I fix this?

Seems that Squeeze node of ONNX input model must have axes attribute, is it a bug or just feature?

I am using an ONNX model which contains a Squeeze operator without the axes attribute as quantize_onnx_model's input. It failed, the error messages are:

Squeeze_126(TargetPlatform.FP32) - inputs:['501'], outputs:['502']
Traceback (most recent call last):
  File "/Users/wusongchao/code/ppq/ppq/executor/torch.py", line 366, in __forward
    outputs = operation_forward_func(operation, inputs, self._executing_context)
  File "/Users/wusongchao/code/ppq/ppq/executor/op/torch/default.py", line 683, in Squeeze_forward
    [squeezing_tensor], axes = values, GET_ATTRIBUTE_FROM_OPERATION(op=op, attribute='axes', compulsive=True)
  File "/Users/wusongchao/code/ppq/ppq/executor/op/torch/base.py", line 78, in GET_ATTRIBUTE_FROM_OPERATION
    'However this value is missing from currecnt operation.')
KeyError: ('Operation Squeeze_126 is supposed to have a value of attribute axes. ', 'However this value is missing from currecnt operation.')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ppq-entrance.py", line 67, in <module>
    device=DEVICE, verbose=0)
  File "/Users/wusongchao/code/ppq/ppq/core/defs.py", line 54, in _wrapper
    return func(*args, **kwargs)
  File "/Users/wusongchao/code/ppq/ppq/api/interface.py", line 274, in quantize_onnx_model
    collate_fn=collate_fn
  File "/Users/wusongchao/code/ppq/ppq/core/defs.py", line 54, in _wrapper
    return func(*args, **kwargs)
  File "/Users/wusongchao/code/ppq/ppq/quantization/quantizer/base.py", line 61, in quantize
    executor.tracing_operation_meta(inputs=inputs)
  File "/Users/wusongchao/.pyenv/versions/3.7.11/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/Users/wusongchao/code/ppq/ppq/core/defs.py", line 54, in _wrapper
    return func(*args, **kwargs)
  File "/Users/wusongchao/code/ppq/ppq/executor/torch.py", line 433, in tracing_operation_meta
    hooks=hooks)
  File "/Users/wusongchao/code/ppq/ppq/executor/torch.py", line 394, in __forward
    raise RuntimeError(f'Error happens when dealing with operation {str(operation)}')
RuntimeError: Error happens when dealing with operation Squeeze_126(TargetPlatform.FP32) - inputs:['501'], outputs:['502']

So I was guided to the definition of Squeeze_forward. The documentation there states that axes is an optional field (which matches the ONNX IR doc).

ppq/ppq/executor/op/torch/default.py

Lines 655 to 679 in f1cdb6d

    
    def Squeeze_forward(op: Operation, values: List[torch.Tensor], ctx: TorchBackendContext = None, **kwargs) -> torch.Tensor:
        """Remove single-dimensional entries from the shape of a tensor. Takes an
        input axes with a list of axes to squeeze. If axes is not provided, all the
        single dimensions will be removed from the shape. If an axis is selected
        with shape entry not equal to one, an error is raised.
        Inputs (1 - 2)
            data (differentiable) : T
            Tensors with at least max(dims) dimensions.
            axes (optional, non-differentiable) : tensor(int64)
            List of integers indicating the dimensions to squeeze.
            Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data).
        Outputs
            squeezed (differentiable) : T
            Reshaped tensor with same data as input.
        Args:
            op (Operation): [description]
            input_values (List[torch.Tensor]): [description]
        Returns:
            torch.Tensor: [description]
        """

However, the implementation calls GET_ATTRIBUTE_FROM_OPERATION with compulsive=True. Since the Squeeze operator in my model does not contain the axes attribute, it throws the exception that I pasted at the beginning.

ppq/ppq/executor/op/torch/default.py

Lines 680 to 690 in f1cdb6d

    
        ASSERT_ALL_TENSORS_AT_SAME_DEVICE(op=op, values=values)
        ASSERT_NUM_OF_INPUT(op=op, values=values, min_num_of_input=1, max_num_of_input=2)
        [squeezing_tensor], axes = values, GET_ATTRIBUTE_FROM_OPERATION(op=op, attribute='axes', compulsive=True)
        if isinstance(axes, list):
            for squeezing_dim in sorted(axes, reverse=True):
                squeezing_tensor = torch.squeeze(squeezing_tensor, squeezing_dim)
        elif isinstance(axes, int):
            squeezing_tensor = torch.squeeze(squeezing_tensor, axes)
        else: raise TypeError(f'Parameter axes of operation {op.name} misunderstood, '
                              f'expect int value of list of int, while {type(axes)} was given.')
        return squeezing_tensor

So I wonder: is requiring the axes field on the Squeeze operator an intended feature, or just a bug?
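For reference, here is a minimal sketch of the behaviour the docstring describes when axes is optional. It is not PPQ code: squeezed_shape is a hypothetical helper that works on plain shape lists instead of tensors, so it runs without torch. It assumes the ONNX semantics quoted above: a missing axes removes every size-1 dimension, negative axes count from the back, and selecting a dimension whose size is not 1 is an error.

```python
from typing import List, Optional, Sequence


def squeezed_shape(shape: Sequence[int],
                   axes: Optional[Sequence[int]] = None) -> List[int]:
    """Return the output shape of an ONNX-style Squeeze (hypothetical helper)."""
    rank = len(shape)

    # axes omitted: remove every dimension whose size is 1.
    if axes is None:
        return [dim for dim in shape if dim != 1]

    # Normalize negative axes into [0, rank - 1] and validate the range.
    normalized = set()
    for axis in axes:
        if not -rank <= axis < rank:
            raise ValueError(f'axis {axis} is out of range for rank {rank}')
        normalized.add(axis % rank)

    # Every selected dimension must have size 1.
    for axis in sorted(normalized):
        if shape[axis] != 1:
            raise ValueError(
                f'cannot squeeze axis {axis} with size {shape[axis]}')

    return [dim for index, dim in enumerate(shape) if index not in normalized]


if __name__ == '__main__':
    print(squeezed_shape([1, 3, 1, 5]))        # axes omitted
    print(squeezed_shape([1, 3, 1, 5], [0]))   # explicit axis
    print(squeezed_shape([1, 3, 1, 5], [-2]))  # negative axis
```

In tensor terms, the axes-omitted branch corresponds to calling torch.squeeze(tensor) without a dim argument. Whether Squeeze_forward should fall back to that when the attribute is missing is for the maintainers to decide; the sketch only shows what the documented behaviour would look like.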

NotImplementError: Graph op LSTM has no backend implementation on target platform TargetPlatform.UNSPECIFIED

Emmmm, I'm back yet again... (please don't get tired of me, hhh). I ran into a NotImplementError; is it that the LSTM operator is not supported for now?

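Error when executing on CPU

I'm back again. I tried running the efficientnet-lite4-11.onnx model from the official ONNX model zoo on CPU and got an error. When the calibration policy is kl or mse, an assert at line 582 of quantization/optim/refine.py is triggered, saying that some operator was not quantized correctly.
With the minmax policy this problem does not occur. All of the above was done on CPU (I have no GPU available here, hhhh).

Can I resolve this error by changing some code, or am I limited to the minmax policy on CPU for now?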

    if sizes is None or len(sizes) == 0:
        sizes = None
        if scales.numel() == 1:
            scales = scales.item()
        else:
            assert scales.numel() % 2 == 0
            scales = scales[-2].cpu().numpy().tolist()



Exporting with platform=TargetPlatform.ONNXRUNTIME raises the following error, but platform=TargetPlatform.ONNX does not

Configuration:

Problem description:
