Comments (3)
I was unable to reproduce your issue using SECURE cloud and Llama 2. I can get TGI (latest) to work with quantize=bitsandbytes:
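import os
import runpod

# runpod.api_key must be set before calling create_pod.
# Assumption: the Hugging Face token is read from the environment; the original snippet left it undefined.
HUGGING_FACE_HUB_TOKEN = os.environ["HUGGING_FACE_HUB_TOKEN"]

# Arguments passed to the TGI launcher inside the container.
docker_args = "--model-id meta-llama/Llama-2-7b-hf --num-shard 1 --quantize bitsandbytes"

pod = runpod.create_pod(
    name="Llama-2-7b-tgi",
    image_name="ghcr.io/huggingface/text-generation-inference",
    gpu_type_id="NVIDIA RTX A4000",
    cloud_type="SECURE",
    docker_args=docker_args,  # defined above
    gpu_count=1,
    volume_in_gb=30,
    container_disk_in_gb=5,
    ports="80/http",
    volume_mount_path="/data",
    env={
        "HUGGING_FACE_HUB_TOKEN": HUGGING_FACE_HUB_TOKEN
    }
)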
❯ curl https://xxxxxxxxxxx-80.proxy.runpod.net/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":200}}' \
-H 'Content-Type: application/json'
{"generated_text":"\nWhat is Deep Learning? Deep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are algorithms that are inspired by the structure and function of the human brain. Deep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are algorithms that are inspired by the structure and function of the human brain.\nDeep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are algorithms that are inspired by the structure and function of the human brain. Deep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are algorithms that are inspired by the structure and function of the human brain.\nDeep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are algorithms that are inspired by the structure and function of the human brain. Deep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are"}%
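For anyone who would rather test the endpoint from Python than from curl, here is a minimal sketch of the same request using only the standard library. The helper names `build_generate_request` and `generate` are hypothetical (they are not part of runpod-python or TGI), and the pod URL is the same placeholder as in the curl command above. It assumes the pod exposes TGI's `/generate` route on port 80 as configured in `create_pod`.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_new_tokens=200):
    """Build (but do not send) a POST request mirroring the curl call."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, prompt, max_new_tokens=200, timeout=120):
    """Send the request and return the "generated_text" field of the reply."""
    request = build_generate_request(base_url, prompt, max_new_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["generated_text"]


# Usage (placeholder host, substitute your own pod URL):
# print(generate("https://xxxxxxxxxxx-80.proxy.runpod.net", "What is Deep Learning?"))
```

Building the request separately from sending it makes it easy to inspect the payload before any network traffic happens.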
from runpod-python.
Thank you for the input @chris-aeviator
@solarslurpi I am not sure this is the best location for this issue unless you think there is a fix within this repo that would resolve things for you.
Closing for now, not related to this repo.