Comments (7)
Hi @RanchiZhao yes, please see #36, which has a benchmark script and the subclasses. It would be a good idea to add a beginner tutorial as well.
from ao.
It should work for LLMs, but the speedup depends on the matmul shapes, so you will see more or less speedup depending on what you are trying to do. The model also needs to be torch.compile traceable, as we use the torchao quantization workflow.
Do you have any reference papers for PEFT + sparsity? I am interested in that space as well, but have not been following it actively.
It's impossible to say for sure without knowing the exact approach, but theoretically I believe some version of this should be possible, although the accuracy loss is likely prohibitive. In terms of implementation, though, this is not something directly supported by our APIs. You may be able to hack something together, but we do not plan to add this functionality at the moment. We may consider it down the line, so anyone reading who is interested, please react / +1 this comment.
Hi @jcaip, thanks. I saw this before but didn't get a chance to look it over carefully. I'll do it now.
Oh, another thing: does this method work for LLMs like LLaMA?
I'd also like to do this with Hugging Face's transformers, which may be tough.
Thanks a lot. Another interesting thing: I want to do PEFT (like LoRA) on the sparse & quantized model.
Once we have the trained LoRA modules, we can merge them into the original model (the bf16 one) and apply sparsity and quantization to it again.
That gives us a "sparse & quant aware training" model that we can use for inference.
Is this possible?
We should make sure that:
- we can put LoRA on the sparse & quantized model
- merging the LoRA modules into the bf16 model (rather than the int8 one, because of the dtype conflict) works
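The merge-then-recompress step described above can be sketched in plain Python so the arithmetic is visible. This is only an illustration under assumptions the thread does not state: the sparsity pattern is taken to be 2:4 (keep the two largest-magnitude values in each group of four), quantization is taken to be symmetric per-row int8, and every function name here is hypothetical rather than a torchao or PEFT API. Training the adapter on top of the compressed model is not shown.

```python
# Framework-free sketch; all names are hypothetical, not torchao or PEFT APIs.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_a, lora_b, scale):
    """Return W + scale * (B @ A): the dense weight with the adapter folded in."""
    delta = matmul(lora_b, lora_a)
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def prune_2_4(w):
    """Assumed pattern: keep the 2 largest-magnitude values per group of 4 in a row."""
    pruned = []
    for row in w:
        out = []
        for i in range(0, len(row), 4):
            group = row[i:i + 4]
            order = sorted(range(len(group)), key=lambda j: abs(group[j]), reverse=True)
            keep = set(order[:2])
            out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
        pruned.append(out)
    return pruned

def quantize_int8(w):
    """Assumed scheme: symmetric per-row int8. Returns (integer rows, per-row scales)."""
    q_rows, scales = [], []
    for row in w:
        max_abs = max(abs(v) for v in row)
        s = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(v / s))) for v in row])
        scales.append(s)
    return q_rows, scales

# Toy dense weight (2 x 4) standing in for the original bf16 weight.
w = [[0.5, -0.1, 0.3, 0.05],
     [0.2, 0.4, -0.6, 0.1]]
# Toy rank-1 adapter: A is (rank x in_features), B is (out_features x rank).
lora_a = [[0.1, 0.0, -0.2, 0.3]]
lora_b = [[1.0], [-1.0]]

merged = merge_lora(w, lora_a, lora_b, scale=1.0)  # merge into the dense weight
sparse = prune_2_4(merged)                         # re-apply sparsity
q, scales = quantize_int8(sparse)                  # re-apply quantization
```

Note that the merged weight is pruned again from scratch, so the set of surviving positions can differ from the one the adapter was trained against. That mismatch is one reason the accuracy of this workflow would need to be measured rather than assumed.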
No, not as far as I know. I'll keep looking.