Comments (9)
The documentation is getting a revamp, and @mkellerman built a nice integration with chatgpt-web: an end-to-end docker-compose file would be just great!
from localai.
Models are not bundled in the image due to licensing: models like gpt4all, alpaca, and vicuna are based on LLaMA from Facebook, whose license prohibits modification, alteration, and redistribution of the weights in any form. See for instance nomic-ai/gpt4all#75.
Sadly, until there is a model with a free license that allows redistribution, we can't embed one in the image, or we risk yet another DMCA takedown. You need to obtain the model yourself and specify it as described in https://github.com/go-skynet/llama-cli#using-other-models
from localai.
I get this error despite mounting. Here's my command: sudo docker run -v ~/llama_models/gpt4-x-alpaca-13b-native-4bit-128g/gpt4-x-alpaca-13b-ggml-q4_1-from-gptq-4bit-128g/:/models -p 8080:14004 -ti --rm quay.io/go-skynet/llama-cli:v0.4 api --context-size 700 --threads 12 --alpaca true --model /models/model.bin
I have a model.bin file inside the ~/llama_models/gpt4-x-alpaca-13b-native-4bit-128g/gpt4-x-alpaca-13b-ggml-q4_1-from-gptq-4bit-128g folder
from localai.
Can you try using the MODEL_PATH env var instead?
sudo docker run -e MODEL_PATH=/models/model.bin -v ~/llama_models/gpt4-x-alpaca-13b-native-4bit-128g/gpt4-x-alpaca-13b-ggml-q4_1-from-gptq-4bit-128g/:/models -p 8080:14004 -ti --rm quay.io/go-skynet/llama-cli:v0.4 api --context-size 700 --threads 12 --alpaca true
Just noticed MODEL_PATH is being set in the main container image; a fix is landing in master! (bf85a31)
from localai.
@regstuff now the master image is fixed; you can also try the same command but using quay.io/go-skynet/llama-cli:latest instead
from localai.
@mudler It's not working for me either....
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --gpt4all=true --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk --model ./models/ggml-alpaca-7b-q4.bin
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --alpaca true --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open '/model.bin'
llama_bootstrap: failed to load model from '/model.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --alpaca "true" --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open '/model.bin'
llama_bootstrap: failed to load model from '/model.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin --alpaca true
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti -e MODEL_PATH=/models/ggml-alpaca-7b-q4.bin --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin --alpaca true
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti -e MODEL_PATH=/models/ggml-alpaca-7b-q4.bin --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000
llama_model_load: failed to open '/models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from '/models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti -e MODEL_PATH=/models/ggml-alpaca-7b-q4.bin --rm quay.io/go-skynet/llama-cli:v0.4 --instruction "What's an alpaca?" --topk 10000
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin --alpaca true
Unable to find image 'quay.io/go-skynet/llama-cli:latest' locally
latest: Pulling from go-skynet/llama-cli
3e440a704568: Already exists
68a71c865a2c: Already exists
670730c27c2e: Already exists
5a7a2c95f0f8: Already exists
db119aaf144b: Already exists
92ac76a462cb: Pull complete
5997e4205ef7: Pull complete
33d4a96cf7d6: Pull complete
c8a35e5c3705: Pull complete
abacb88fc6dd: Pull complete
756caf9df70c: Pull complete
0a7f01cc46c5: Pull complete
92ed784c8873: Pull complete
Digest: sha256:3698dea8ece687b23903afe347cee47b37d6883053533eacfab26619b55b97c7
Status: Downloaded newer image for quay.io/go-skynet/llama-cli:latest
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --alpaca rue --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open ''
llama_bootstrap: failed to load model from ''
Loading the model failed: failed loading model
➜ llama-cli
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --alpaca true --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open ''
llama_bootstrap: failed to load model from ''
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --alpaca true --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open ''
llama_bootstrap: failed to load model from ''
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin --alpaca true
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
➜ llama-cli docker run -ti --rm quay.io/go-skynet/llama-cli:latest --instruction "What's an alpaca?" --topk 10000 --model ./models/ggml-alpaca-7b-q4.bin
llama_model_load: failed to open './models/ggml-alpaca-7b-q4.bin'
llama_bootstrap: failed to load model from './models/ggml-alpaca-7b-q4.bin'
Loading the model failed: failed loading model
Even with the latest image.
from localai.
The project is great, but I'd recommend refactoring the documentation to make it clearer.
It's kind of confusing to understand what to do.
I'm also preparing a docker-compose.yml file which I can share when it's done.
from localai.
Hi @jonit-dev ,
You need to specify a volume with -v so Docker mounts a host-local path inside the container; see the instructions here:
https://github.com/go-skynet/llama-cli#using-other-models
For a docker compose file, have a look at #10
On the other hand I do completely agree; I will rework the documentation as soon as possible. There are many gaps, and other new features being added that need to be documented too.
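To make the volume point concrete, here is a sketch of what a corrected invocation could look like. The host directory and model filename below are placeholders, not paths from this thread; adjust them to wherever your ggml model actually lives.

```shell
# Sketch only: MODEL_DIR and the model filename are placeholders; point
# MODEL_DIR at whatever host directory actually holds your ggml model.
MODEL_DIR="$HOME/models"

# -v maps the host directory onto /models inside the container, so the
# --model flag must reference the container-side path, not the host one.
CMD="docker run -ti --rm -v ${MODEL_DIR}:/models -p 8080:8080 quay.io/go-skynet/llama-cli:v0.4 api --model /models/ggml-alpaca-7b-q4.bin --context-size 700 --threads 12 --alpaca true"

# Printed rather than executed, so the command can be inspected first.
echo "$CMD"
```

The key detail is that the path passed to --model is the mount target inside the container (/models/...), which is why commands like --model ./models/... fail when no -v mount is given.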
from localai.
Instructions updated to run with docker-compose, with multi-model support too: https://github.com/go-skynet/llama-cli#usage
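For reference, a minimal compose file in that spirit might look like the following. This is a sketch, not the file from the repo's instructions: the service name, host model directory, port mapping, and model filename are all assumptions.

```yaml
version: "3.6"
services:
  llama:
    image: quay.io/go-skynet/llama-cli:latest
    volumes:
      # Host ./models directory mounted into the container at /models
      - ./models:/models
    ports:
      - "8080:8080"
    # --model uses the container-side path under the mount point
    command: api --model /models/ggml-alpaca-7b-q4.bin --context-size 700 --threads 4 --alpaca true
```

With a file like this, `docker compose up` replaces the long `docker run` invocations from earlier in the thread.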
I'll close this issue for now; if you are still facing issues, just re-open it!
from localai.