ziplab / qllm

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

Home Page: https://arxiv.org/abs/2310.08041

License: Apache License 2.0

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models (ICLR 2024)

This is the official PyTorch implementation of QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models.

By Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, and Bohan Zhuang.

We propose QLLM, an accurate and efficient low-bitwidth post-training quantization method designed for LLMs.

📰 News

  • [10-03-2024] Release the code! 🌟
  • [17-01-2024] QLLM is accepted by ICLR 2024!

🛠 Install

conda create -n qllm python=3.10 -y
conda activate qllm
git clone https://github.com/ModelTC/QLLM
cd QLLM
pip install --upgrade pip 
pip install -e .

โš™๏ธ Usage

We provide the training scripts in scripts folder. For example, to perform W4A8 quantization for LLaMA-7B, run

sh scripts/llama-7b/w4a4.sh

Remember to set the model path (model) and the output path (output_dir) in the script before running it.
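
For readers new to the notation, W4A4 means 4-bit weights and 4-bit activations. The snippet below is a minimal sketch of generic uniform min-max quantization, included only to show what rounding values to a 4-bit grid looks like. It is not the QLLM algorithm, and the function name fake_quantize and the sample values are hypothetical.

```python
def fake_quantize(values, n_bits):
    """Generic uniform asymmetric (min-max) quantize-dequantize.

    Illustrative only: this is NOT the QLLM method, just the basic idea
    of mapping floats onto a grid of 2**n_bits levels.
    """
    qmax = 2 ** n_bits - 1                  # largest integer code, 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax
    if scale == 0:                          # constant input: avoid dividing by zero
        scale = 1.0
    # Integer codes in [0, qmax]; this is what would be stored.
    codes = [round((v - lo) / scale) for v in values]
    # De-quantized values; this is what the model effectively computes with.
    dequant = [lo + c * scale for c in codes]
    return codes, dequant


# Hypothetical sample "weights", chosen only for illustration.
weights = [-0.8, -0.1, 0.0, 0.3, 0.7]
codes, approx = fake_quantize(weights, n_bits=4)
max_err = max(abs(w - a) for w, a in zip(weights, approx))
```

With this scheme the rounding error of each value is at most half of one quantization step, which is why fewer bits (a coarser grid) generally cost more accuracy.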

📋 Results

  • QLLM achieves state-of-the-art (SoTA) performance in weight-activation quantization.

[Figures: weight-activation quantization results for LLaMA-1 and LLaMA-2]

๐Ÿ“ Citation

If you find our QLLM useful in your research, please consider to cite the following related papers:

@inproceedings{liu2024qllm,
  title = {{QLLM}: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models},
  author = {Liu, Jing and Gong, Ruihao and Wei, Xiuying and Dong, Zhiwei and Cai, Jianfei and Zhuang, Bohan},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year = {2024},
}

🧾 License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

๐Ÿ™ Acknowledgement

This repository is built upon OmniQuant. We thank the authors for their open-sourced code.

qllm's Issues

Does QLLM need to assemble and disassemble at inference?

I noticed that QLLM has a 4% additional cost in inference throughput compared with normal W4A4.
Does that mean we need to assemble and disassemble the parameters during inference? If not, why is there a 4% additional cost?
