Comments (4)
Hello shuther,
Thank you for reporting this issue. I apologize for the inconvenience you're experiencing with the LocalAI container image. Based on the information you've provided, it seems this is a memory allocation issue on your VM.
Could you please provide the output of the following command to help us gather more details about your GPU and system capabilities?
nvidia-smi
This command will give us information about the GPU usage and memory details.
Additionally, please try to run the command you provided but with the following modification:
CUDA_LAUNCH_BLOCKING=0
This will disable CUDA launch blocking, which might help to prevent the GPU memory error. You should run the command like this:
CUDA_LAUNCH_BLOCKING=0 curl http://linuxmain.local:8445/embeddings \
-X POST -H "Content-Type: application/json" \
-d '{"input": "Your text string goes here", "model": "text-embedding-ada-002"}'
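For reference, the same embeddings request can be sketched in Python using only the standard library (a hedged sketch: the host, port, and model name are taken from the curl command above and may need adjusting):

```python
import json
import urllib.request

def build_embedding_request(text: str, model: str = "text-embedding-ada-002") -> bytes:
    # Build the JSON body for the /embeddings endpoint.
    return json.dumps({"input": text, "model": model}).encode("utf-8")

def post_embeddings(base_url: str, text: str) -> dict:
    # POST the request to the LocalAI /embeddings endpoint and return
    # the decoded JSON response.
    req = urllib.request.Request(
        base_url + "/embeddings",
        data=build_embedding_request(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (assumes the server from the issue is reachable):
# post_embeddings("http://linuxmain.local:8445", "Your text string goes here")
```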
I understand that this issue is crucial for your project, and I assure you that we'll do our best to investigate this problem and provide a solution as soon as possible.
Best regards,
The Github Bot (Experiment of @mudler)
from localai.
It works fine if I stop the Docker container and start it again; we likely need a better eviction approach when we switch models?
extra logs
nvidia-smi # when I launch the docker (initial load)
Thu Apr 25 11:14:44 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2060 Off | 00000000:13:00.0 On | N/A |
| 38% 38C P8 16W / 160W | 258MiB / 6144MiB | 22% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2158 G /usr/lib/xorg/Xorg 131MiB |
| 0 N/A N/A 2686 G /usr/bin/gnome-shell 67MiB |
| 0 N/A N/A 3376 G /usr/bin/nextcloud 3MiB |
| 0 N/A N/A 24782 G ...30092458,1701102826035513081,262144 50MiB |
+---------------------------------------------------------------------------------------+
I also spotted this error:
localai-docker-api-1 | 9:15AM INF Trying to load the model '5c7cd056ecf9a4bb5b527410b97f48cb' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper, /build/backend/python/vall-e-x/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/diffusers/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/vllm/run.sh, /build/backend/python/exllama2/run.sh, /build/backend/python/bark/run.sh, /build/backend/python/transformers/run.sh, /build/backend/python/autogptq/run.sh, /build/backend/python/coqui/run.sh, /build/backend/python/mamba/run.sh, /build/backend/python/transformers-musicgen/run.sh, /build/backend/python/petals/run.sh, /build/backend/python/exllama/run.sh
localai-docker-api-1 | 9:15AM INF [llama-cpp] Attempting to load
localai-docker-api-1 | 9:15AM INF Loading model '5c7cd056ecf9a4bb5b527410b97f48cb' with backend llama-cpp
localai-docker-api-1 | 9:15AM DBG Loading model in memory from file: /build/models/5c7cd056ecf9a4bb5b527410b97f48cb
localai-docker-api-1 | 9:15AM DBG Loading Model 5c7cd056ecf9a4bb5b527410b97f48cb with gRPC (file: /build/models/5c7cd056ecf9a4bb5b527410b97f48cb) (backend: llama-cpp): {backendString:llama-cpp model:5c7cd056ecf9a4bb5b527410b97f48cb threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0000bae00 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
localai-docker-api-1 | 9:15AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
localai-docker-api-1 | 9:15AM DBG GRPC Service for 5c7cd056ecf9a4bb5b527410b97f48cb will be running at: '127.0.0.1:44089'
localai-docker-api-1 | 9:15AM INF [llama-cpp] Fails: fork/exec /tmp/localai/backend_data/backend-assets/grpc/llama-cpp: permission denied
localai-docker-api-1 | 9:15AM INF [llama-ggml] Attempting to load
localai-docker-api-1 | 9:15AM DBG GRPC Service for 5c7cd056ecf9a4bb5b527410b97f48cb will be running at: '127.0.0.1:44789'
localai-docker-api-1 | 9:15AM INF [rwkv] Fails: fork/exec /tmp/localai/backend_data/backend-assets/grpc/rwkv: permission denied
...
localai-docker-api-1 | 9:15AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/whisper
localai-docker-api-1 | 9:15AM DBG GRPC Service for 5c7cd056ecf9a4bb5b527410b97f48cb will be running at: '127.0.0.1:42503'
localai-docker-api-1 | 9:15AM INF [whisper] Fails: fork/exec /tmp/localai/backend_data/backend-assets/grpc/whisper: permission denied
localai-docker-api-1 | 9:15AM INF [stablediffusion] Attempting to load
...
localai-docker-api-1 | 9:15AM INF [/build/backend/python/vall-e-x/run.sh] Fails: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
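The repeated "fork/exec ...: permission denied" failures above suggest the extracted backend binaries lack the execute bit, or that the directory they are extracted to (here /tmp) is mounted noexec. A minimal shell sketch reproducing that failure mode with a throwaway script (the paths are illustrative, not LocalAI's):

```shell
# Create a small script *without* the execute bit, mimicking a backend
# binary extracted with wrong permissions.
tmpdir=$(mktemp -d)
printf '#!/bin/sh\necho ok\n' > "$tmpdir/backend"

# Running it now fails the same way LocalAI's fork/exec does:
# "permission denied".
"$tmpdir/backend" 2>/dev/null || echo "permission denied, as in the logs"

# Restoring the execute bit (the likely fix for the extracted grpc
# assets, or remounting /tmp without noexec) makes it runnable.
chmod +x "$tmpdir/backend"
"$tmpdir/backend"   # prints "ok"

rm -rf "$tmpdir"
```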
Now, with LOCALAI_SINGLE_ACTIVE_BACKEND=true, the embedding works.
I would recommend changing the docker-compose YAML file to load the .env file by default (and updating the documentation, since this seems to be a crucial parameter).
Still, shouldn't eviction be attempted when a GPU memory error occurs?
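The docker-compose change being suggested could look like the following sketch (the service name and image tag are assumptions, not taken from the repository):

```yaml
# Hypothetical excerpt of docker-compose.yaml.
services:
  api:
    image: localai/localai:latest-gpu-nvidia-cuda-12
    # Load .env by default so settings such as
    # LOCALAI_SINGLE_ACTIVE_BACKEND=true take effect without
    # editing the compose file itself.
    env_file:
      - .env
```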
nvidia-smi
Thu Apr 25 11:19:50 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2060 Off | 00000000:13:00.0 On | N/A |
| 38% 39C P8 13W / 160W | 4422MiB / 6144MiB | 20% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2158 G /usr/lib/xorg/Xorg 131MiB |
| 0 N/A N/A 2686 G /usr/bin/gnome-shell 67MiB |
| 0 N/A N/A 3376 G /usr/bin/nextcloud 3MiB |
| 0 N/A N/A 24782 G ...30092458,1701102826035513081,262144 50MiB |
| 0 N/A N/A 1647486 C python 0MiB |
| 0 N/A N/A 1647698 C python 0MiB |
+---------------------------------------------------------------------------------------+
I believe the eviction process is being assessed at the moment; it may be related to #2047 and #2102.