torch-int's Introduction

torch-int

This repository contains integer operators on GPUs for PyTorch.

Dependencies

  • CUTLASS
  • PyTorch with CUDA 11.3
  • NVIDIA-Toolkit 11.3
  • CUDA Driver 11.3
  • gcc/g++ 9.4.0
  • cmake >= 3.12

Installation

git clone --recurse-submodules https://github.com/Guangxuan-Xiao/torch-int.git
conda create -n int python=3.8
conda activate int
conda install -c anaconda gxx_linux-64=9
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
source environment.sh
bash build_cutlass.sh
python setup.py install

Test

python tests/test_linear_modules.py
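If the build succeeded, a quick sanity check before running the full test is to import one of the quantized layers directly (W8A16Linear is exported from torch_int.nn.linear):

```shell
# Verify the package and its CUDA extension import cleanly before running the tests.
python -c "from torch_int.nn.linear import W8A16Linear; print('torch_int OK')"
```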

torch-int's People

Contributors

guangxuan-xiao, merrymercy, mickaelseznec


torch-int's Issues

ModuleNotFoundError: No module named 'torch_int._CUDA'

The error happens as shown below:

>>> import torch_int
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxx/jiang/torch-int/torch_int/__init__.py", line 1, in <module>
    from . import nn
  File "/home/xxx/jiang/torch-int/torch_int/nn/__init__.py", line 1, in <module>
    from .linear import W8A16Linear, W8FakeA8Linear
  File "/home/xxx/jiang/torch-int/torch_int/nn/linear.py", line 2, in <module>
    from .._CUDA import (linear_a8_w8_b32_o32,
ModuleNotFoundError: No module named 'torch_int._CUDA'

Where is the _CUDA module?
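This usually means the C++/CUDA extension was never built or installed into the environment. A hedged way to check (module layout taken from the traceback above):

```shell
# Print where torch_int was imported from, then look for the compiled extension there.
python -c "import torch_int, os; print(os.path.dirname(torch_int.__file__))"
# If that directory contains no _CUDA*.so file, re-run the build from the repo root:
# source environment.sh && bash build_cutlass.sh && python setup.py install
```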

is it possible to install torch-int on CUDA version 12.3

My CUDA version is 12.3 and it is nontrivial for me to downgrade to 11.3. When I run python setup.py install I get a RuntimeError:
The detected CUDA version (12.3) mismatches the version that was used to compile
PyTorch (11.3). Please make sure to use the same CUDA versions.

Is it possible to install it on 12.3 CUDA?
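If downgrading the system CUDA is not an option, one workaround is installing a CUDA 11.3 toolkit side-by-side and pointing the build at it. A sketch, assuming the toolkit lands at /usr/local/cuda-11.3 (the path is an assumption):

```shell
# Assumption: a CUDA 11.3 toolkit is installed alongside 12.3 at /usr/local/cuda-11.3.
export CUDA_HOME=/usr/local/cuda-11.3
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
nvcc --version   # should now report 11.3 before re-running python setup.py install
```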

fatal error: crypt.h: No such file or directory

python setup.py install
running install
/opt/conda/envs/smoothquant/lib/python3.8/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

    ********************************************************************************
    Please avoid running ``setup.py`` directly.
    Instead, use pypa/build, pypa/installer or other
    standards-based tools.

    See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
    ********************************************************************************

!!
self.initialize_options()
/opt/conda/envs/smoothquant/lib/python3.8/site-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

    ********************************************************************************
    Please avoid running ``setup.py`` and ``easy_install``.
    Instead, use pypa/build, pypa/installer or other
    standards-based tools.

    See https://github.com/pypa/setuptools/issues/917 for details.
    ********************************************************************************

!!
self.initialize_options()
running bdist_egg
running egg_info
writing torch_int.egg-info/PKG-INFO
writing dependency_links to torch_int.egg-info/dependency_links.txt
writing top-level names to torch_int.egg-info/top_level.txt
reading manifest file 'torch_int.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'torch_int.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying torch_int/__init__.py -> build/lib.linux-x86_64-cpython-38/torch_int
copying torch_int/nn/fused.py -> build/lib.linux-x86_64-cpython-38/torch_int/nn
copying torch_int/nn/linear.py -> build/lib.linux-x86_64-cpython-38/torch_int/nn
copying torch_int/nn/bmm.py -> build/lib.linux-x86_64-cpython-38/torch_int/nn
copying torch_int/nn/__init__.py -> build/lib.linux-x86_64-cpython-38/torch_int/nn
copying torch_int/utils/__init__.py -> build/lib.linux-x86_64-cpython-38/torch_int/utils
copying torch_int/functional/fused.py -> build/lib.linux-x86_64-cpython-38/torch_int/functional
copying torch_int/functional/bmm.py -> build/lib.linux-x86_64-cpython-38/torch_int/functional
copying torch_int/functional/quantization.py -> build/lib.linux-x86_64-cpython-38/torch_int/functional
copying torch_int/functional/__init__.py -> build/lib.linux-x86_64-cpython-38/torch_int/functional
copying torch_int/models/opt.py -> build/lib.linux-x86_64-cpython-38/torch_int/models
copying torch_int/models/__init__.py -> build/lib.linux-x86_64-cpython-38/torch_int/models
running build_ext
/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/utils/cpp_extension.py:813: UserWarning: The detected CUDA version (11.8) has a minor version mismatch with the version that was used to compile PyTorch (11.3). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/utils/cpp_extension.py:820: UserWarning: There are no /opt/conda/envs/smoothquant/bin/x86_64-conda-linux-gnu-c++ version bounds defined for CUDA version 11.8
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'torch_int._CUDA' extension
/opt/conda/envs/smoothquant/bin/x86_64-conda-linux-gnu-cc -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/smoothquant/include -fPIC -O2 -isystem /opt/conda/envs/smoothquant/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/conda/envs/smoothquant/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /opt/conda/envs/smoothquant/include -fPIC -Itorch_int/kernels/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-11.8/include -I/opt/conda/envs/smoothquant/include/python3.8 -c torch_int/kernels/bindings.cpp -o build/temp.linux-x86_64-cpython-38/torch_int/kernels/bindings.o -std=c++14 -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_CUDA -D_GLIBCXX_USE_CXX11_ABI=0
/usr/local/cuda-11.8/bin/nvcc -Itorch_int/kernels/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-11.8/include -I/opt/conda/envs/smoothquant/include/python3.8 -c torch_int/kernels/bmm.cu -o build/temp.linux-x86_64-cpython-38/torch_int/kernels/bmm.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DCUDA_ARCH=800 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_CUDA -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -ccbin /opt/conda/envs/smoothquant/bin/x86_64-conda-linux-gnu-cc
/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/c10/core/SymInt.h(84): warning #68-D: integer conversion resulted in a change of sign

/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/c10/core/SymInt.h(84): warning #68-D: integer conversion resulted in a change of sign

/usr/local/cuda-11.8/bin/nvcc -Itorch_int/kernels/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/TH -I/opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-11.8/include -I/opt/conda/envs/smoothquant/include/python3.8 -c torch_int/kernels/fused.cu -o build/temp.linux-x86_64-cpython-38/torch_int/kernels/fused.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DCUDA_ARCH=800 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_CUDA -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 -ccbin /opt/conda/envs/smoothquant/bin/x86_64-conda-linux-gnu-cc
In file included from /opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:10,
from /opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:3,
from /opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
from /opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
from /opt/conda/envs/smoothquant/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/torch.h:6,
from torch_int/kernels/include/common.h:8,
from torch_int/kernels/fused.cu:2:
/opt/conda/envs/smoothquant/include/python3.8/Python.h:44:10: fatal error: crypt.h: No such file or directory
44 | #include <crypt.h>
| ^~~~~~~~~
compilation terminated.
error: command '/usr/local/cuda-11.8/bin/nvcc' failed with exit code 1
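The conda Python headers reference crypt.h, which newer distros no longer ship by default. Two commonly suggested workarounds; both are assumptions, not project-documented fixes:

```shell
# Workaround 1: copy the system header into the environment's Python include dir
# (assumes /usr/include/crypt.h exists on the host).
cp /usr/include/crypt.h "$CONDA_PREFIX/include/python3.8/"

# Workaround 2: install libxcrypt into the environment so it provides the header.
# conda install -c conda-forge libxcrypt
```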

Test `test_linear_shape.py` fails

Thanks for this wonderful repo - it's a pleasure to work with it.

While playing around with the repo, I realized that the Linear layer accommodates only some input sizes. In fact, when running the test_linear_shape.py test, I get the following output:

test_quant_linear_a8_w8_bfp32_ofp32
Traceback (most recent call last):
  File "/notebooks/torch-int/tests/test_linear_shape.py", line 42, in <module>
    test_quant_linear_a8_w8_bfp32_ofp32()
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/notebooks/torch-int/tests/test_linear_shape.py", line 18, in test_quant_linear_a8_w8_bfp32_ofp32
    y = linear_a8_w8_bfp32_ofp32(
RuntimeError: cutlass cannot implement

Are there shape restrictions on B, N, M that need to be satisfied for working with torch-int?

your environment can never be created

I followed the instructions but always encounter errors such as:
/opt/conda/envs/int/compiler_compat/ld: cannot find -lcublas_static: No such file or directory

Even when using the NVIDIA docker image, the error is still there.

Undefined symbol when running test_linear_modules.py

I followed all instructions, but when I ran test_linear_modules.py, the following error happened:

ImportError: /home/ubuntu/anaconda3/envs/int/lib/python3.8/site-packages/torch_int/_CUDA.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

How can I fix it?
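The undefined symbol is a mangled libtorch function, which usually means the extension was compiled against a different PyTorch build (or C++ ABI) than the one installed. Demangling it shows what the extension expects; the rebuild suggestion at the end is an assumption:

```shell
# Demangle the missing symbol (requires binutils' c++filt).
echo '_ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE' | c++filt
# The demangled name is a c10::detail assertion helper from libtorch; reinstalling
# the pinned torch==1.12.1+cu113 and rebuilding torch-int should make them match.
```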

what is ellm.tools.quantize_int?

I really enjoy reading this repository.

I am wondering about this line:

/benchmark/bench_model.py
Line 3: from ellm.tools.quantize_int import quantize_model_int

what is ellm.tools.quantize_int?

Could not find compiler set in environment variable CUDACXX

When I run "bash build_cutlass.sh", an error happens:

-- CMake Version: 3.18.2
-- The CXX compiler identification is GNU 8.4.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:25 (message):
Could not find compiler set in environment variable CUDACXX:

/usr/local/cuda/bin/nvcc.

Call Stack (most recent call first):
CUDA.cmake:46 (enable_language)
CMakeLists.txt:42 (include)

CMake Error: CMAKE_CUDA_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
See also "/home/lizhangming/Project/torch-int-main/submodules/cutlass/build/CMakeFiles/CMakeOutput.log".
make: *** No targets specified and no makefile found. Stop.

what should I do before "bash build_cutlass.sh"?
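The CMake message means CUDACXX is set to /usr/local/cuda/bin/nvcc but no nvcc exists at that path. A sketch of the usual fix before re-running build_cutlass.sh, assuming nvcc is somewhere on PATH (otherwise install the CUDA toolkit first):

```shell
# Find a real nvcc and point CUDACXX at it (assumes nvcc is on PATH).
command -v nvcc
export CUDACXX="$(command -v nvcc)"
```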

Cannot read from remote repo when cloning './torch-int/submodules/cutlass'

The error message is as follows:

Submodule 'submodules/cutlass' (git@github.com:NVIDIA/cutlass.git) registered for path 'submodules/cutlass'
Cloning into '/opt/conda/bin/torch-int/submodules/cutlass'...
kex_exchange_identification: Connection closed by remote host
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:NVIDIA/cutlass.git' into submodule path '/opt/conda/bin/torch-int/submodules/cutlass' failed
Failed to clone 'submodules/cutlass'. Retry scheduled
Cloning into '/opt/conda/bin/torch-int/submodules/cutlass'...
kex_exchange_identification: Connection closed by remote host
fatal: Could not read from remote repository.

Why can't I clone from the NVIDIA repo? Has the name or path changed?
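The submodule is registered with an SSH URL, and the kex_exchange_identification failure suggests SSH to github.com is blocked on this machine. A common workaround (not a project-documented fix) is to rewrite SSH GitHub URLs to HTTPS:

```shell
# Rewrite SSH GitHub URLs to HTTPS for this user, then retry the submodule clone.
git config --global url."https://github.com/".insteadOf "git@github.com:"
git submodule update --init --recursive
```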

error: "auto" is not allowed here

I always run into this kind of problem when compiling:

type_traits.hpp(43): error: namespace "std" has no member "conjunction"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(44): error: namespace "std" has no member "conjunction_v"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(46): error: namespace "std" has no member "disjunction"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(47): error: namespace "std" has no member "disjunction_v"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(49): error: namespace "std" has no member "negation"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(50): error: namespace "std" has no member "negation_v"
/torch-int/submodules/cutlass/include/cute/util/type_traits.hpp(52): error: namespace "std" has no member "void_t"
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(78): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(80): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(82): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(84): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(86): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/numeric/integral_constant.hpp(88): error: "auto" is not allowed here
/torch-int/submodules/cutlass/include/cute/underscore.hpp(67): error: disjunction is not a template
/torch-int/submodules/cutlass/include/cute/underscore.hpp(79): error: conjunction is not a template
/torch-int/submodules/cutlass/include/cutlass/gemm/gemm.h(562): error: namespace "std" has no member "void_t"
/torch-int/submodules/cutlass/include/cutlass/gemm/gemm.h(562): error: expected a ">"
/torch-int/submodules/cutlass/include/cutlass/gemm/gemm.h(562): error: expected a ";"

1) You need to change c++14 to c++17 in setup.py.
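As the note above suggests, the cute/ headers in newer CUTLASS need C++17 features (std::conjunction, std::void_t, etc.) while setup.py builds with -std=c++14. A minimal sketch of the change:

```shell
# Switch the C++ standard flags in setup.py from c++14 to c++17, then rebuild.
sed -i 's/c++14/c++17/g' setup.py
python setup.py install
```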

cannot install via setup.py because ld: cannot find -lcublas_static

Hi,
I think this repo is great and tried it using the pre-baked docker image for NVIDIA. After following the instructions in Readme.md, I got these errors:
/root/anaconda3/envs/int/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lcublas_static
/root/anaconda3/envs/int/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lcublasLt_static

I guess the cublas location is different, but I cannot figure out how to fix it. Would you mind helping me out? Thanks!

I also cannot find cublasLt_static or cublas_static using find. The results were:
(int) root@bafd706ba0bf:/workspace/torch-int# find / -name "*cublasLt*" -print
/root/anaconda3/lib/python3.10/site-packages/torch/lib/libcublasLt.so.11
/root/anaconda3/envs/int/lib/python3.8/site-packages/torch/lib/libcublasLt.so.11
/usr/local/cuda-11.3/targets/x86_64-linux/lib/stubs/libcublasLt.so
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcublasLt.so
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcublasLt.so.11
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcublasLt.so.11.5.1.109
/usr/local/cuda-11.3/targets/x86_64-linux/include/cublasLt.h

BTW, if I link against the non-static cublas, the compilation finishes, but it then fails the README test, python tests/test_linear_modules.py.
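The find output above only shows shared libcublasLt.so files; the static archives (libcublas_static.a, libcublasLt_static.a) ship with a full CUDA toolkit install, not with the pip/conda torch packages. A hedged check (the paths are assumptions):

```shell
# Look for the static cuBLAS archives under any local CUDA toolkit install.
find /usr/local/cuda* \( -name 'libcublas_static.a' -o -name 'libcublasLt_static.a' \) -print 2>/dev/null
# If nothing turns up, install the full CUDA 11.3 toolkit, or make the archives
# visible to the linker, e.g.: export LIBRARY_PATH=/usr/local/cuda-11.3/lib64:$LIBRARY_PATH
```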

Compilation errors, what could be the cause?

/usr/bin/ld: warning: /home/xxx/anaconda3090/envs/zach_glm/lib/libstdc++.so: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ld: warning: /home/xxx/anaconda3090/envs/zach_glm/lib/libstdc++.so: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ld: warning: /home/xxx/anaconda3090/envs/zach_glm/lib/libgcc_s.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ld: warning: /home/xxx/anaconda3090/envs/zach_glm/lib/libgcc_s.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ld: warning: /home/xxx/anaconda3090/envs/zach_glm/lib/libgcc_s.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ld: warning: /home/xx/anaconda3090/envs/zach_glm/lib/libgcc_s.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
