mazzzystar / aimet (forked from quic/aimet)

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Home Page: https://quic.github.io/aimet-pages/index.html

License: Other



AI Model Efficiency Toolkit (AIMET)

AIMET is a library that provides advanced model quantization and compression techniques for trained neural network models. Its features have been proven to improve the run-time performance of deep learning models, lowering compute and memory requirements with minimal impact on task accuracy.

How AIMET works

AIMET is designed to work with PyTorch and TensorFlow models.


Why AIMET?

Benefits of AIMET

  • Supports advanced quantization techniques: Inference with integer runtimes is significantly faster than with floating-point runtimes. For example, models run 5x-15x faster on the Qualcomm Hexagon DSP than on the Qualcomm Kryo CPU, and 8-bit precision models have a 4x smaller memory footprint than 32-bit precision models. However, maintaining model accuracy when quantizing ML models is often challenging. AIMET addresses this with novel techniques such as Data-Free Quantization, which provides state-of-the-art INT8 results on several popular models.
  • Supports advanced model compression techniques that enable models to run faster at inference-time and require less memory
  • AIMET is designed to automate the optimization of neural networks, avoiding time-consuming and tedious manual tweaking. It also provides user-friendly APIs that can be called directly from TensorFlow or PyTorch pipelines.

Please visit the AIMET GitHub Pages site for more details.

Supported Features

Quantization

  • Cross-Layer Equalization: Equalize weight tensors to reduce amplitude variation across channels
  • Bias Correction: Corrects the shift in layer outputs introduced by quantization
  • Quantization Simulation: Simulate on-target quantized inference accuracy
  • Fine-tuning: Use quantization simulation to train the model further to improve accuracy
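The core of quantization simulation can be sketched in a few lines of plain Python (an illustrative sketch, not the AIMET API): round-trip values through an INT8 grid so the rounding error that on-target inference would incur can be measured before deployment.

```python
def fake_quantize(values, num_bits=8):
    """Quantize-dequantize: map floats onto a 2**num_bits integer grid and
    back, mimicking the rounding error of on-target fixed-point inference."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.51, -0.02, 0.0, 0.27, 1.03]
simulated = fake_quantize(weights)
# The worst-case round-trip error is bounded by half the quantization step.
max_err = max(abs(w, ) if False else abs(w - s) for w, s in zip(weights, simulated))
```

AIMET's quantization simulation applies this kind of quantize-dequantize modeling per layer, which is what makes subsequent fine-tuning against the simulated noise possible.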

Model Compression

  • Spatial SVD: Tensor decomposition technique to split a large layer into two smaller ones
  • Channel Pruning: Removes redundant input channels from a layer and reconstructs layer weights
  • Per-layer compression-ratio selection: Automatically selects how much to compress each layer in the model
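The idea behind SVD-based layer splitting can be sketched with NumPy (a minimal illustration of the technique, not AIMET's implementation, which additionally handles convolution kernel reshaping):

```python
import numpy as np

def svd_split(weight, rank):
    """Split one dense layer's (out, in) weight matrix into two factors of
    shapes (out, rank) and (rank, in), replacing one large layer with two
    smaller ones whose product approximates the original."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    first = u[:, :rank] * s[:rank]   # absorb singular values into the first factor
    second = vt[:rank, :]
    return first, second

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 128))
a, b = svd_split(w, rank=16)
# Two smaller layers: 64*16 + 16*128 = 3072 weights vs. the original 64*128 = 8192.
```

Choosing the rank trades accuracy for compression; AIMET's per-layer compression-ratio selection automates that choice across the whole model.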

Visualization

  • Weight ranges: Visually inspect whether a model is a candidate for Cross-Layer Equalization, and see the effect after applying the technique
  • Per-layer compression sensitivity: Visually get feedback about the sensitivity of any given layer in the model to compression

Results

AIMET can quantize an existing 32-bit floating-point model to an 8-bit fixed-point model without fine-tuning and without sacrificing much accuracy. For example, the DFQ method applied to several popular networks, such as MobileNet-v2 and ResNet-50, results in less than 0.9% accuracy loss at 8-bit quantization, in an automated way and without any training data.

Models               | FP32   | INT8 (simulated)
MobileNet v2 (top-1) | 71.72% | 71.08%
ResNet-50 (top-1)    | 76.05% | 75.45%
DeepLab v3 (mIOU)    | 72.65% | 71.91%

AIMET can also significantly compress models. For popular models such as ResNet-50 and ResNet-18, compression with spatial SVD plus channel pruning achieves a 50% MAC (multiply-accumulate) reduction while retaining accuracy within approximately 1% of the original uncompressed model.

Models            | Uncompressed | 50% Compressed
ResNet-18 (top-1) | 69.76%       | 68.56%
ResNet-50 (top-1) | 76.05%       | 75.75%
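The 50% MAC-reduction figure can be made concrete with back-of-the-envelope arithmetic (illustrative layer sizes, not measurements behind the table above): a convolution's MAC count is linear in its input channel count, so channel pruning that halves the input channels halves the MACs.

```python
def conv_macs(out_h, out_w, in_ch, out_ch, k):
    """Multiply-accumulates for one conv layer: every output pixel of every
    output channel performs k*k*in_ch multiplications."""
    return out_h * out_w * out_ch * k * k * in_ch

original = conv_macs(56, 56, 64, 64, 3)  # a ResNet-style 3x3 conv
pruned = conv_macs(56, 56, 32, 64, 3)    # half the input channels pruned
reduction = 1 - pruned / original        # 0.5, i.e. a 50% MAC reduction
```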

Getting Started

Contributions

Thanks for your interest in contributing to AIMET! Please read our Contributions Page for more information on contributing features or bug fixes. We look forward to your participation!

Team

AIMET aims to be a community-driven project maintained by Qualcomm Innovation Center, Inc.

License

AIMET is licensed under the BSD 3-clause “New” or “Revised” License. Check out the LICENSE for more details.

Contributors

quic-akhobare, quic-bharathr, quic-hitameht, quic-klhsieh, quic-mangal, quic-sendilk, quic-ssiddego, quic-sundarr
