Comments (8)
What's the error? Can you use `0.5.0` instead of `latest`?
I'm unable to reproduce on my end.
from text-generation-inference.
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a `tokenizers` library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
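The error above means the fast tokenizer could not be built because the `sentencepiece` package is missing inside the environment. A minimal sketch (illustrative only, not TGI's actual startup code) that probes for the dependency before a slow-to-fast tokenizer conversion would be attempted:

```python
# Sketch: detect whether the `sentencepiece` package is importable,
# since converting a slow (SentencePiece-based) tokenizer to a fast
# one requires it. This is illustrative, not text-generation-inference code.
import importlib.util

def has_sentencepiece() -> bool:
    """Return True if `sentencepiece` can be imported."""
    return importlib.util.find_spec("sentencepiece") is not None

if not has_sentencepiece():
    print("Install it first: pip install sentencepiece")
```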
docker run --gpus all --shm-size 1g -p 8080:80 -v /home/mohamedr/cache:/data ghcr.io/huggingface/text-generation-inference:0.5.0 --model-id Tribbiani/vicuna-7b --num-shard 1 --max-total-tokens 2048
@OlivierDehaene
Still the same error
Can you clear your model cache (in /home/mohamedr/cache) and re-try?
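The cache clear can be sketched as a one-liner. Assumption: the cache follows the standard Hugging Face Hub layout (`models--<org>--<name>`); adjust the directory name if your cache is structured differently.

```shell
# Sketch: remove the cached copy of Tribbiani/vicuna-7b so the next
# `docker run` re-downloads it. The models--* directory layout is the
# standard Hugging Face Hub cache convention; adjust if yours differs.
CACHE_DIR="${CACHE_DIR:-/home/mohamedr/cache}"
rm -rf "$CACHE_DIR/models--Tribbiani--vicuna-7b"
```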
@OlivierDehaene
Still the same problem
@OlivierDehaene
It now gives me this error:
ModuleNotFoundError: No module named 'einops'
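`einops` is an extra Python dependency imported by some model implementations. A quick hedged check (illustrative only, not part of TGI) that reports whether the module named in the traceback is present:

```python
# Sketch: probe for the `einops` module named in the traceback above
# and suggest the fix if it is missing. Illustrative only.
try:
    import einops  # noqa: F401
    status = "einops is installed"
except ModuleNotFoundError:
    status = "einops missing: pip install einops (inside the container)"
print(status)
```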
It now works in the newer versions, somehow.