Comments (4)
If you need static quantization (QAT or PTQ), follow the sample code under examples/quantization (passing a floating-point model directly to TFLiteConverter will not give you a statically quantized model). For dynamic quantization, see examples/converter/dynamic.py.
from tinyneuralnetwork.
I am using exactly the code from examples/converter/dynamic.py, but swapped in a classification model, hoping to get an int8-quantized tflite. I changed the dummy_input that gets passed in to dummy_input = torch.ones((1, 3, 64, 64)), but it raises an error whether I set dummy_input to int8 or float32.
There is a further set of parameters below that point in dynamic.py; you didn't copy all of them.
Also, the following two parameters are for full quantization; do not add them for dynamic quantization:
quantize_input_output_type = 'int8',
fuse_quant_dequant = True,
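For intuition (an illustrative sketch of the underlying math, not tinynn's actual implementation): dynamic quantization stores only the weights as int8 with a per-tensor scale and dequantizes them at run time, so there are no quantized model inputs/outputs to configure, which is why the two parameters above do not apply. In plain Python:

```python
def quantize_weights_int8(w):
    """Symmetric per-tensor int8 weight quantization, as used in
    weight-only (dynamic) quantization: only a scale is stored,
    no zero point, and the largest |w| maps to +/-127."""
    scale = max(abs(x) for x in w) / 127.0
    q = [max(-127, min(127, round(x / scale))) for x in w]
    return q, scale

def dequantize(q, scale):
    # Weights are turned back into floats before each matmul at run time.
    return [x * scale for x in q]

w = [0.8, -1.27, 0.003, 0.5]          # example float weights
q, scale = quantize_weights_int8(w)   # q = [80, -127, 0, 50], scale = 0.01
w_hat = dequantize(q, scale)
err = max(abs(a - b) for a, b in zip(w, w_hat))  # rounding error <= scale / 2
```

Activations stay in float throughout, so the model's inputs and outputs remain float32.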
If by int8 quantization you mean a fully quantized model, you need qat.py or post.py under examples/quantization.
For a comparison of the different quantization techniques, see the TFLite introduction:
https://www.tensorflow.org/lite/performance/post_training_quantization?hl=zh-cn
In short, dynamic quantization is weight-only quantization,
while static quantization quantizes both weights and activations.
Full quantization is faster and runs on NPUs and other chips.
Alternatively, install netron, run our sample code, and open the generated models; you will see the difference between the two quantization modes.
Update: I re-read your first post. Since you want int8 quantized input and output, what you need is definitely full quantization. Please follow the full-quantization code samples. Thanks.
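For intuition, here is an illustrative sketch (not tinynn's actual code) of what full quantization does to activations: each tensor gets a scale and a zero point from a calibrated range, so the model can consume int8 input and emit int8 output directly:

```python
def affine_qparams(rmin, rmax):
    """Compute scale and zero point mapping the observed float range
    [rmin, rmax] onto the int8 range [-128, 127] (asymmetric/affine
    quantization, as used for activations in fully quantized models)."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)  # range must contain 0
    scale = (rmax - rmin) / 255.0
    zero_point = round(-128 - rmin / scale)
    return scale, zero_point

def quantize(x, scale, zp):
    return max(-128, min(127, round(x / scale) + zp))

def dequantize(q, scale, zp):
    return (q - zp) * scale

# e.g. an activation observed in [0.0, 2.55] during calibration
scale, zp = affine_qparams(0.0, 2.55)  # scale ~= 0.01, zp = -128
q = quantize(1.0, scale, zp)           # the int8 value actually stored
```

Calibrating these ranges needs representative data (PTQ) or training (QAT), which is why a plain TFLiteConverter call on a float model cannot produce this by itself.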
Related Issues (20)
- Meet Detailed error: Tensor-likes are not close! using TFLiteConverter HOT 2
- [Converter] Need transpose optimization HOT 2
- Float model failed to convert to TFLite
- [converter] map gather(+reshape) ops with seperate consecutive indices to split(unpack) ops
- tinynn.converter module not found! HOT 2
- [CI] several tests for modifier failed
- Whether to support pytorch to keras HOT 1
- TransposeConv wrong shape? HOT 15
- change input to INT8 after converting to tflite HOT 2
- [converter] implement torch's `aten::scaled_dot_product_attention` operator HOT 2
- Request: clamp would be more efficient to go to Bounded Relu than Maximum + Minimum HOT 3
- Do not support PReLU module? HOT 5
- torch.max not working HOT 2
- OneShotChannelPruner results in the miss of some operators HOT 4
- KeyError when executing quantization HOT 5
- Does tinynn support following int16 quantization? HOT 1
- jit.trace succeed but tinynn tracer failed HOT 1
- It became larger after converting to tflite model HOT 4
- how to do Post-training integer quantization with int16 activation HOT 4