nnieqat-pytorch

nnieqat is a quantization-aware training (QAT) package for the Neural Network Inference Engine (NNIE) on PyTorch. It uses the HiSilicon quantization library to quantize each module's weights and activations into a fake-quantized fp32 format.

Installation
Usage
Code Examples
Results
Todo
Reference

Installation

Supported Platforms: Linux
Accelerators and GPUs: NVIDIA GPUs via CUDA driver 10.1 or 10.2.
Dependencies:
- python >= 3.5, < 4
- llvmlite >= 0.31.0
- pytorch >= 1.5
- numba >= 0.42.0
- numpy >= 1.18.1
Install nnieqat via PyPI:
```
$ pip install nnieqat
```
Install nnieqat in Docker (an easy way to avoid environment problems):
```
$ cd docker
$ docker build -t nnieqat-image .
```

Install nnieqat from the repository:

$ git clone https://github.com/aovoc/nnieqat-pytorch
$ cd nnieqat-pytorch
$ make install

Usage

Add the quantization hook

The hook quantizes and dequantizes weights and data with the HiSVP GFPQ library during the forward() pass.

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
  register_quantization_hook(model)
...
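
To make "quantize and dequantize" concrete, here is a minimal sketch of fake quantization in plain Python. It uses a symmetric uniform 8-bit grid as a stand-in. The GFPQ algorithm itself is not described in this README, so this is not the scheme the HiSVP library applies, and `fake_quantize` is a hypothetical helper, not part of nnieqat.

```python
def fake_quantize(values, num_bits=8):
    """Quantize then dequantize: floats in, floats on an integer grid out.

    Assumption: symmetric uniform quantization, used only for illustration.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]          # nothing to scale
    scale = max_abs / qmax                    # width of one quantization step
    result = []
    for v in values:
        q = round(v / scale)                  # quantize to an integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        result.append(q * scale)              # dequantize back to float
    return result


weights = [0.5, -1.0, 0.25, 0.003]
quantized = fake_quantize(weights)
```

The output is still a list of floats, which is why the format is called "fake": the values carry quantization error but remain usable in ordinary floating-point training.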

Merge BN weights into conv and freeze BN

Fine-tuning from a well-trained model is recommended; in that case, call merge_freeze_bn at the beginning. Otherwise, call it after a few epochs of training.

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.train()
    model = merge_freeze_bn(model)  # switches BN layers to eval() mode during training
...
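
The arithmetic behind merging BN into the preceding layer can be sketched for a single channel. This assumes merge_freeze_bn performs the standard batch-norm folding; the README does not spell out the formula, and `fold_bn` and `conv_then_bn` are hypothetical names used only for this illustration.

```python
import math


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold one channel's BN parameters into the preceding weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Reference path: a 1x1 single-channel 'conv' followed by BN in eval mode."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


params = dict(weight=0.8, bias=0.1, gamma=1.5, beta=-0.2, mean=0.3, var=4.0)
folded_w, folded_b = fold_bn(**params)

x = 2.0
reference = conv_then_bn(x, **params)
folded = folded_w * x + folded_b   # one multiply-add replaces conv + BN
```

Because the folded layer reproduces conv followed by BN only while the BN statistics stay fixed, freezing BN after the merge keeps the two paths equivalent.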

Unquantize weights before updating them

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.apply(unquant_weight)  # using original weight while updating
    optimizer.step()
...
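
A toy example shows why the original weights are restored before the optimizer step. It assumes a single scalar weight and a coarse grid of step 0.1; `ToyWeight`, `snap`, and `train` are hypothetical stand-ins, not nnieqat APIs.

```python
def snap(value, step=0.1):
    """Round a value to the nearest point on a grid of the given step."""
    return round(value / step) * step


class ToyWeight:
    """Scalar weight that can be swapped to a quantized copy and back."""

    def __init__(self, value):
        self.value = value
        self._original = None

    def quant_dequant(self):
        self._original = self.value        # remember the full-precision value
        self.value = snap(self.value)      # forward pass sees the grid value

    def unquant(self):
        if self._original is not None:
            self.value = self._original    # restore full precision
            self._original = None


def train(weight, update, steps, restore_before_update):
    for _ in range(steps):
        weight.quant_dequant()             # stand-in for the forward pass
        if restore_before_update:
            weight.unquant()               # stand-in for model.apply(unquant_weight)
        weight.value -= update             # stand-in for optimizer.step()
    return weight.value


restored = train(ToyWeight(0.3), update=0.01, steps=10, restore_before_update=True)
stuck = train(ToyWeight(0.3), update=0.01, steps=10, restore_before_update=False)
```

With restoration, ten small updates accumulate and the weight moves from 0.3 to about 0.2. Without it, each update is applied to the grid value and rounded away on the next pass, so the weight never gets past about 0.29.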

Dump the weight-optimized model

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.apply(quant_dequant_weight)
    save_checkpoint(...)
    model.apply(unquant_weight)
...
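
The quantize, save, restore sequence can be wrapped in a context manager so the original weights come back even if saving fails. This is only a sketch: `quantized_weights` is a hypothetical helper, and it assumes nothing about the model beyond an `apply(fn)` method like the one used above.

```python
from contextlib import contextmanager


@contextmanager
def quantized_weights(model, quantize_fn, restore_fn):
    """Apply quantize_fn on entry and always apply restore_fn on exit."""
    model.apply(quantize_fn)
    try:
        yield model
    finally:
        model.apply(restore_fn)            # runs even if the body raises


class DummyModel:
    """Tiny stand-in exposing apply(fn), used only to exercise the helper."""

    def __init__(self):
        self.state = "original"

    def apply(self, fn):
        fn(self)
        return self


def mark_quantized(module):
    module.state = "quantized"


def mark_original(module):
    module.state = "original"


dummy = DummyModel()
with quantized_weights(dummy, mark_quantized, mark_original) as m:
    state_inside = m.state                 # the checkpoint would be saved here
state_after = dummy.state
```

With the real package, the same pattern would read `with quantized_weights(model, quant_dequant_weight, unquant_weight): save_checkpoint(...)`.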

Code Examples

CIFAR-10 quantization-aware training example (adds nnieqat to pytorch_cifar10_tutorial)

python test/test_cifar10.py

ImageNet quantization fine-tuning example (adds nnieqat to pytorh_imagenet_main.py)

python test/test_imagenet.py --pretrained path_to_imagenet_dataset

Results

ImageNet

python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.001 --pretrained --epoch 10   # nnie_lr_e-3_ft
python pytorh_imagenet_main.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # lr_e-4_ft
python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # nnie_lr_e-4_ft

Fine-tuning results:

| | trt_fp32 | trt_int8 | nnie |
|---|---|---|---|
| torchvision | 0.56992 | 0.56424 | 0.56026 |
| nnie_lr_e-3_ft | 0.56600 | 0.56328 | 0.56612 |
| lr_e-4_ft | 0.57884 | 0.57502 | 0.57542 |
| nnie_lr_e-4_ft | 0.57834 | 0.57524 | 0.57730 |
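
One way to read the table is to compare each model's trt_fp32 score with its nnie score. The sketch below does that subtraction using only the numbers reported above; `nnie_gap` is a hypothetical helper name.

```python
results = {
    "torchvision":    {"trt_fp32": 0.56992, "trt_int8": 0.56424, "nnie": 0.56026},
    "nnie_lr_e-3_ft": {"trt_fp32": 0.56600, "trt_int8": 0.56328, "nnie": 0.56612},
    "lr_e-4_ft":      {"trt_fp32": 0.57884, "trt_int8": 0.57502, "nnie": 0.57542},
    "nnie_lr_e-4_ft": {"trt_fp32": 0.57834, "trt_int8": 0.57524, "nnie": 0.57730},
}


def nnie_gap(row):
    """Score lost when moving from TensorRT fp32 to NNIE (negative means a gain)."""
    return row["trt_fp32"] - row["nnie"]


gaps = {name: round(nnie_gap(row), 5) for name, row in results.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.5f}")
```

On these numbers, the fp32-to-NNIE gap is 0.00966 for the torchvision baseline and 0.00104 for nnie_lr_e-4_ft, while lr_e-4_ft (fine-tuned without nnieqat) sits at 0.00342.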

Todo

Generate the quantized model directly.

Reference

HiSVP Quantization Library User Guide

Quantizing deep convolutional networks for efficient inference: A whitepaper

8-bit Inference with TensorRT

Distilling the Knowledge in a Neural Network
