aakashapoorv / ao

This project forked from pytorch/ao

torchao: PyTorch Architecture Optimization (AO). Performant kernels that work with PyTorch.

License: BSD 3-Clause "New" or "Revised" License

Shell 0.06% Python 99.94%

torchao: PyTorch Architecture Optimization

Note: This repository is currently under heavy development. If you have suggestions on the API or use cases you'd like to see covered, please open a GitHub issue.

Introduction

torchao is a PyTorch-native library for optimizing your models using lower precision dtypes, techniques like quantization and sparsity, and performant kernels.

Get Started

To try out our APIs, check out the API examples for quantization (including autoquant), sparsity, and dtypes.

Installation

Note: this library makes liberal use of several new features in PyTorch, so it's recommended to use it with the current nightly or the latest stable version of PyTorch.

  1. From PyPI:
pip install torchao
  2. From source:
git clone https://github.com/pytorch-labs/ao
cd ao
pip install -e .
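
After either install route, a quick way to confirm that the package is visible to the current Python environment is to look it up with the standard library. The sketch below is only an illustration: the helper name `installed_version` is hypothetical and not part of torchao, and it assumes the distribution is registered under the name `torchao`.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it cannot be found.

    Hypothetical helper for illustration; not part of torchao.
    """
    # find_spec returns None for a top-level module that is not importable.
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is registered under this name.
        return None


print(installed_version("torchao") or "torchao is not installed")
```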

Key Features

The library provides

  1. Support for lower precision dtypes such as nf4 and uint4 that are torch.compile friendly
  2. Quantization algorithms such as dynamic quantization, SmoothQuant, and GPTQ that run on CPU, GPU, and mobile:
    • Int8 dynamic activation quantization
    • Int8 and int4 weight-only quantization
    • Int8 dynamic activation quantization with int4 weight quantization
    • GPTQ and SmoothQuant
    • High-level autoquant API and kernel autotuner targeting SOTA performance across varying model shapes on consumer and enterprise GPUs
  3. Sparsity algorithms such as Wanda that help improve accuracy of sparse networks
  4. Integration with other PyTorch native libraries like torchtune and ExecuTorch
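
To make "int8 weight-only quantization" from the list above concrete, here is a minimal sketch of the underlying arithmetic in plain Python. It is not torchao's implementation or API: the function names are hypothetical, and it assumes symmetric quantization with one scale per output row, which is only one of several possible schemes.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_int8_per_row(weight: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Symmetric int8 quantization with one scale per output row (assumed scheme)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Map the largest magnitude in the row to 127; avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-128, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize(q_rows: List[List[int]], scales: List[float]) -> Matrix:
    """Recover approximate float weights from int8 values and per-row scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


def linear_weight_only(x: List[float], q_rows: List[List[int]], scales: List[float]) -> List[float]:
    """Compute y = W x from int8 weights; activations stay in floating point."""
    return [s * sum(q * xi for q, xi in zip(row, x)) for row, s in zip(q_rows, scales)]


weight = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
q_weight, scales = quantize_int8_per_row(weight)
print(q_weight, scales)
print(linear_weight_only([1.0, 2.0, 3.0], q_weight, scales))
```

Only the weights are stored in low precision here, which is what distinguishes weight-only quantization from the dynamic activation variants, where the input is quantized as well.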

Our Goals

torchao embodies PyTorch's design philosophy, especially "usability over everything else". Our vision for this repository is the following:

  • Composability: Native solutions for optimization techniques that compose with both torch.compile and FSDP
    • For example, support for QLoRA and for new dtypes
  • Interoperability: Work with the rest of the PyTorch ecosystem such as torchtune, gpt-fast and ExecuTorch
  • Transparent Benchmarks: Regularly run performance benchmarking of our APIs across a suite of Torchbench models and across hardware backends
  • Heterogeneous Hardware: Efficient kernels that can run on CPU/GPU-based servers (with torch.compile) and mobile backends (with ExecuTorch).
  • Infrastructure Support: Release packaging solution for kernels and a CI/CD setup that runs these kernels on different backends.

Interoperability with PyTorch Libraries

torchao has been integrated with other repositories to ease usage:

  • torchtune is integrated with 8 and 4 bit weight-only quantization techniques with and without GPTQ.
  • ExecuTorch is integrated with GPTQ for both 8da4w (int8 dynamic activation with int4 weight) and int4 weight-only quantization.
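
The "8da4w" naming can be illustrated with a small plain-Python sketch: weights are quantized to int4 ahead of time, and the activation is quantized to int8 at call time from the values actually seen. This is a simplified sketch under stated assumptions, not the torchao or ExecuTorch implementation: all function names are hypothetical, both tensors use symmetric scales without zero points, weights use one scale per group of `group_size` values, and GPTQ itself is not shown.

```python
from typing import List, Tuple

QuantGroup = Tuple[List[int], float]  # (integer values, scale)


def quantize_symmetric(values: List[float], qmin: int, qmax: int) -> QuantGroup:
    """Map floats to integers in [qmin, qmax] using a single symmetric scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(qmin, min(qmax, round(v / scale))) for v in values], scale


def quantize_weight_int4_grouped(row: List[float], group_size: int) -> List[QuantGroup]:
    """Quantize one weight row to int4 with one scale per group (done ahead of time)."""
    return [
        quantize_symmetric(row[start:start + group_size], -8, 7)
        for start in range(0, len(row), group_size)
    ]


def linear_8da4w(x: List[float], q_weight: List[List[QuantGroup]], group_size: int) -> List[float]:
    """Compute y = W x with int8 dynamic activations and int4 grouped weights."""
    # Dynamic step: the activation scale is derived from this input at call time.
    q_x, x_scale = quantize_symmetric(x, -128, 127)
    out = []
    for groups in q_weight:
        acc = 0.0
        for g, (q_w, w_scale) in enumerate(groups):
            start = g * group_size
            # Integer dot product within the group, then rescale to float.
            int_dot = sum(qw * qx for qw, qx in zip(q_w, q_x[start:start + group_size]))
            acc += int_dot * w_scale * x_scale
        out.append(acc)
    return out


weight = [[0.1, -0.2, 0.4, 0.8], [1.0, 0.5, -0.25, 0.05]]
q_weight = [quantize_weight_int4_grouped(row, group_size=2) for row in weight]
print(linear_8da4w([1.0, -2.0, 0.5, 0.25], q_weight, group_size=2))
```

With only 16 levels per weight the result is a coarse approximation of the float product, which is the gap that calibration methods such as GPTQ aim to reduce.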

Success stories

Our kernels have been used to achieve SOTA inference performance on:

  1. Image segmentation models with sam-fast
  2. Language models with gpt-fast
  3. Diffusion models with sd-fast

License

torchao is released under the BSD 3-Clause license.

Contributors

andrewor14, cpuhrsch, dependabot[bot], drisspg, hdcharles, jcaip, jeromeku, jerryzh168, larryliu0820, leslie-fang-intel, manuelcandales, msaroufim, rohan-varma, supriyar, svekars, weifengpy, xia-weiwen
