Comments (3)
Sorry, here are the load times: 13B (653 s), 30B (1672 s), and 65B (4242 s). The weights are stored on an external hard drive, so loading may take a little longer than usual. I also found that when I load the two weight files of 13B and then the four weight files of 30B, loading jumps directly to the third weight file before starting; the first two weight files are loaded directly. Loading 30B needs about 96 GB of memory and about 37 GB of video memory. After increasing the swap to 120 GB, 30B can be loaded without being killed. Loading 65B needs about 96 GB of memory plus 50 GB of swap, and about 70 GB of video memory.
from llama-int8.
That seems likely. Have you tried increasing the size of your swapfile?
I'm also running into insufficient CPU RAM. Would you mind clarifying how much CPU RAM is required when using the int8 version of LLaMA for 13B, 33B, and 65B? I want to adjust my hardware spec plan based on your advice. Thanks!
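For hardware planning, the figures reported earlier in this thread can be collected into a small check. This is only a rough sketch: the numbers are one user's approximate observations (30B: about 96 GB host memory and 37 GB video memory; 65B: about 96 GB RAM plus 50 GB swap and 70 GB video memory), no memory figures were reported for 13B, and the names `REPORTED` and `can_load` are hypothetical helpers, not part of llama-int8.

```python
from typing import NamedTuple


class Reported(NamedTuple):
    host_gb: int  # approximate peak host memory while loading (RAM plus swap)
    vram_gb: int  # approximate video memory once loaded


# Approximate figures from one user's report in this thread.
# 13B is omitted because no memory figures were reported for it.
REPORTED = {
    "30B": Reported(host_gb=96, vram_gb=37),
    "65B": Reported(host_gb=96 + 50, vram_gb=70),
}


def can_load(model: str, ram_gb: float, swap_gb: float, vram_gb: float) -> bool:
    """Return True if RAM plus swap and video memory meet the reported figures."""
    need = REPORTED[model]
    return ram_gb + swap_gb >= need.host_gb and vram_gb >= need.vram_gb


if __name__ == "__main__":
    # Example plan: 64 GB RAM, 120 GB swap, 40 GB of video memory.
    print(can_load("30B", ram_gb=64, swap_gb=120, vram_gb=40))
    print(can_load("65B", ram_gb=96, swap_gb=0, vram_gb=80))
```

Treating RAM and swap as interchangeable is a simplification: a load that depends heavily on swap may succeed but will be much slower, which fits the long load times reported above.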