Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

PyTorch implementation of the paper "Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers". It includes code and pretrained JSON files for non-linear operations in quantized models.

Pingcheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng
DAC 2024

Installation

Clone this repo with submodules:

git clone https://github.com/PingchengDong/GQA-LUT
cd GQA-LUT/

The code is tested with Python 3.7 and PyTorch 1.5. We recommend using Anaconda to make sure that all dependencies are in place. To create an Anaconda environment:

conda env create -f environment.yml
conda activate gqa-lut

Support List

├──Non-linear operations
    ├──GELU
    ├──HSwish
    ├──Sigmoid
    ├──Exponent
    ├──Reciprocal
    ├──Reciprocal of square root
    ├──...

Approximation

Example: to approximate GELU with 8 segpoints:

python gqa-lut.py --act_func 'gelu' --x_range -4 4 --sp_range -4.0 4.0 --num_splits 7 --decimal_bit_range 0 6 --total_iters 500 --mutate
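To make the idea of a segment-based approximation concrete, here is a minimal pure-Python sketch. It is not the GQA-LUT algorithm: it assumes that 7 split points divide the range [-4, 4] into 8 linear segments, fits each segment by interpolating GELU at its ends, and refines the split points with a simple mutate-and-keep-the-best loop instead of the paper's genetic search. Quantization of the parameters is not modeled. All function names (fit_segments, pwl_eval, mse, search) and the mutation step size are hypothetical and do not come from gqa-lut.py.

```python
import math
import random


def gelu(x):
    # Exact GELU via the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def fit_segments(func, breakpoints):
    # One (slope, intercept) pair per segment, interpolating func at the segment ends.
    segments = []
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        slope = (func(hi) - func(lo)) / (hi - lo)
        segments.append((slope, func(lo) - slope * lo))
    return segments


def pwl_eval(x, breakpoints, segments):
    # Count the interior breakpoints at or below x to select the segment.
    idx = sum(1 for bp in breakpoints[1:-1] if x >= bp)
    slope, intercept = segments[idx]
    return slope * x + intercept


def mse(func, splits, x_min, x_max, n=401):
    # Mean squared error of the piecewise-linear fit on a uniform grid.
    bps = [x_min] + sorted(splits) + [x_max]
    segs = fit_segments(func, bps)
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    return sum((func(x) - pwl_eval(x, bps, segs)) ** 2 for x in xs) / n


def is_valid(splits, x_min, x_max, min_gap=1e-3):
    # Reject candidates whose segments collapse or leave the range.
    bps = [x_min] + sorted(splits) + [x_max]
    return all(b - a >= min_gap for a, b in zip(bps[:-1], bps[1:]))


def search(func, x_min, x_max, num_splits, iters, seed=0):
    # Simplified stand-in for a genetic search: mutate the best candidate
    # with Gaussian noise and keep the mutation only if it lowers the error.
    rng = random.Random(seed)
    step = (x_max - x_min) / (num_splits + 1)
    best = [x_min + step * (i + 1) for i in range(num_splits)]
    best_err = mse(func, best, x_min, x_max)
    for _ in range(iters):
        cand = sorted(s + rng.gauss(0.0, 0.2) for s in best)
        if not is_valid(cand, x_min, x_max):
            continue
        err = mse(func, cand, x_min, x_max)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


uniform = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
uniform_err = mse(gelu, uniform, -4.0, 4.0)
splits, err = search(gelu, -4.0, 4.0, num_splits=7, iters=200)
print("uniform split MSE:", uniform_err)
print("searched split MSE:", err)
```

Because the search starts from the uniform split points and only accepts improvements, the searched error is never worse than the uniform one.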

The pretrained folder contains pretrained JSON files with 8 and 16 segpoints for several non-linear operations that are commonly used in neural networks.

To help you reproduce our results as accurately as possible, we provide a Makefile. It includes the parameter settings and commands for several of the supported non-linear functions in the GQA-LUT code mentioned above.

For example, to run the GQA-LUT approximation of the GELU function with 8 segpoints:

make gelu_8

GQA Hardware Python Model

We also provide a basic PyTorch-style Python model for hardware simulation, for those who are interested in applying GQA in a real implementation. The script gqa_hw_model.py simulates the hardware processing of LUT-based non-linear function approximation. The input and output are in INT8 format, and all intermediate results are restricted to the operation precision. You may modify the simulation code to match your hardware implementation.
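As a rough illustration of this kind of integer-only LUT evaluation, the following pure-Python sketch takes an INT8 input, selects a segment by comparing against integer thresholds, applies a fixed-point multiply-add, and returns an INT8 output. It does not reproduce gqa_hw_model.py: the power-of-two scale (SCALE_SHIFT), the number of fractional bits (FRAC_BITS), the uniform breakpoints, and the names build_lut and lut_forward are all assumptions made for this example.

```python
import math

FRAC_BITS = 6    # assumed fractional bits for slopes and intercepts
SCALE_SHIFT = 5  # assumed shared input/output scale: real value = q * 2**-5


def gelu(x):
    # Floating-point reference, used only to build the table.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_lut(func, breakpoints):
    # Quantize each segment's slope and intercept to integers.
    scale = 2.0 ** -SCALE_SHIFT
    lut = []
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        slope = (func(hi) - func(lo)) / (hi - lo)
        intercept = func(lo) - slope * lo
        k_q = round(slope * (1 << FRAC_BITS))
        # The intercept is expressed in output LSBs, then given FRAC_BITS extra bits.
        b_q = round(intercept / scale * (1 << FRAC_BITS))
        lut.append((k_q, b_q))
    # Interior breakpoints mapped into the INT8 input domain.
    thresholds = [round(bp / scale) for bp in breakpoints[1:-1]]
    return thresholds, lut


def lut_forward(q_x, thresholds, lut):
    # Integer-only path: comparators, one multiply-add, rounding shift, clamp.
    idx = sum(1 for t in thresholds if q_x >= t)
    k_q, b_q = lut[idx]
    acc = k_q * q_x + b_q
    q_y = (acc + (1 << (FRAC_BITS - 1))) >> FRAC_BITS  # round to nearest
    return max(-128, min(127, q_y))


breakpoints = [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
thresholds, lut = build_lut(gelu, breakpoints)

scale = 2.0 ** -SCALE_SHIFT
worst = max(
    abs(lut_forward(q, thresholds, lut) * scale - gelu(q * scale))
    for q in range(-128, 128)
)
print("max abs error over all INT8 inputs:", worst)
```

Using the same power-of-two scale for input and output keeps the multiply-add free of rescaling; a real design may use different scales and bit widths, in which case the shift amounts change accordingly.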

Citation

@inproceedings{dong2024gqalut,
  author    = {Dong, Pingcheng and Tan, Yonghao and Zhang, Dong and Ni, Tianwei and Liu, Xuejiao and Liu, Yu and Luo, Peng and Liang, Luhong and Liu, Shih-Yang and Huang, Xijie and Zhu, Huaiyu and Pan, Yun and An, Fengwei and Cheng, Kwang-Ting},
  title     = {Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers},
  booktitle = {Design Automation Conference (DAC)},
  year      = {2024}
}
