
๐Ÿ | Python library for RunPod API and serverless worker SDK.

Home Page: https://pypi.org/project/runpod/

License: MIT License


runpod-python's Introduction

RunPod | Python Library

PyPI Package · Downloads

CI | End-to-End RunPod Python Tests

CI | Code Quality · CI | Unit Tests · CI | CodeQL

Welcome to the official Python library for RunPod API & SDK.


💻 | Installation

# Install the latest release version
pip install runpod

# or

# Install the latest development version (main branch)
pip install git+https://github.com/runpod/runpod-python.git

Python 3.8 or higher is required to use the latest version of this package.

⚡ | Serverless Worker (SDK)

This Python package can also be used to create a serverless worker that can be deployed to RunPod as a custom endpoint API.

Quick Start

Create a Python script in your project that contains your model definition and the RunPod worker start code. Run this script as your container's default start command:

# my_worker.py

import runpod

def is_even(job):

    job_input = job["input"]
    the_number = job_input["number"]

    if not isinstance(the_number, int):
        return {"error": "Silly human, you need to pass an integer."}

    if the_number % 2 == 0:
        return True

    return False

runpod.serverless.start({"handler": is_even})

Make sure that this file is run when your container starts. This can be accomplished by calling it in the Docker command when you set up a template at runpod.io/console/serverless/user/templates or by setting it as the default command in your Dockerfile.

See our blog post for creating a basic Serverless API, or view the detailed docs for more information.

Local Test Worker

You can also test your worker locally before deploying it to RunPod. This is useful for debugging and testing.

python my_worker.py --rp_serve_api
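
This starts a local test server you can send requests to. A minimal sketch of exercising it with the is_even worker above, assuming the test server's default port (8000) and a /runsync route (both may vary by SDK version; check your version's docs):

# local_test.py -- assumes the worker was started with --rp_serve_api
import requests

response = requests.post(
    "http://localhost:8000/runsync",  # default port and route are assumptions
    json={"input": {"number": 2}},
)
print(response.json())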

📚 | API Language Library (GraphQL Wrapper)

You can use this library to make requests to the RunPod API.

import runpod

runpod.api_key = "your_runpod_api_key_found_under_settings"

Endpoints

You can interact with RunPod endpoints via a run or run_sync method.

endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run(
    {"your_model_input_key": "your_model_input_value"}
)

# Check the status of the endpoint run request
print(run_request.status())

# Get the output of the endpoint run request, blocking until the endpoint run is complete.
print(run_request.output())
You can also run an endpoint synchronously, waiting for the result:

endpoint = runpod.Endpoint("ENDPOINT_ID")

run_request = endpoint.run_sync(
    {"your_model_input_key": "your_model_input_value"}
)

# Returns the job results if completed within 90 seconds, otherwise, returns the job status.
print(run_request)

GPU Cloud (Pods)

import runpod

runpod.api_key = "your_runpod_api_key_found_under_settings"

# Get all my pods
pods = runpod.get_pods()

# Get a specific pod by ID
pod = runpod.get_pod("POD_ID")

# Create a pod
pod = runpod.create_pod("test", "runpod/stack", "NVIDIA GeForce RTX 3070")

# Stop the pod
runpod.stop_pod(pod.id)

# Resume the pod
runpod.resume_pod(pod.id)

# Terminate the pod
runpod.terminate_pod(pod.id)

๐Ÿ“ | Directory

.
├── docs               # Documentation
├── examples           # Examples
├── runpod             # Package source code
│   ├── api_wrapper    # Language library - API (GraphQL)
│   ├── cli            # Command Line Interface Functions
│   ├── endpoint       # Language library - Endpoints
│   └── serverless     # SDK - Serverless Worker
└── tests              # Package tests

๐Ÿค | Community and Contributing

We welcome both pull requests and issues on GitHub. Bug fixes and new features are encouraged, but please read our contributing guide first.



runpod-python's Issues

Logging Review

Logs will become persistent and available to users even when a worker has been terminated. The following changes need to occur:

  • Fewer Info Logs
  • Move Info to Debug
  • Do not print out job info input or output
  • No multiline logs
  • Try to capture standard output better.

API Wrapper Error Handling

When no GPU is available the following is returned:

Curl returns the error "{"errors":[{"message":"There are no longer any instances available with the requested specifications. Please refresh and try again.","path":["podFindAndDeployOnDemand"],"extensions":{"code":"RUNPOD"}}],"data":{"podFindAndDeployOnDemand":null}}"

This needs to be handled by the language library and returned as a structured error.
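
A minimal sketch of what handled behavior could look like from the caller's side, assuming the library raises runpod.error.QueryError for GraphQL errors (as tracebacks elsewhere in this list show):

import runpod
from runpod import error  # assumed import path, per the tracebacks below

runpod.api_key = "your_runpod_api_key_found_under_settings"

try:
    pod = runpod.create_pod("test", "runpod/stack", "NVIDIA GeForce RTX 3070")
except error.QueryError as err:
    # Surface the GraphQL error message instead of a raw JSON blob.
    print(f"Pod creation failed: {err}")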

Create a pod with my Network Volume

Is your feature request related to a problem? Please describe.
My feature request is prompted by an issue. Currently, while deploying a pod through the web interface, I have the option to select the storage. However, using the CLI or Python library, I do not have the ability to choose the storage.

Describe the solution you'd like
I would like to have the capability to create pods using the storage Network Volume. This feature would enable me to conveniently attach and detach network-attached storage to my pods, simplifying the management and scalability of my applications that rely on persistent data storage.

Describe alternatives you've considered
I can do it only on the web interface, nothing else.

Additional context
I would like to create my pod using a script.

Uninformative error message

Thanks for making runpod! It's great!

I'm getting:

import runpod

endpoint = runpod.Endpoint("pixi-inpaint-api")

run_request = endpoint.run_sync(
    {
        "image_url": "https://i.imgur.com/FxcvsYN.png",
        "prompt": "A corgi wearing cool sunglasses",
        "mask_url": "https://i.imgur.com/wl3vekS.png",
    }
)

➜  test git:(runpod) ✗ python run.py
Initialized endpoint: pixi-inpaint-api
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

and I don't know how to debug it.

I'm getting the runpod.error.QueryError error

Describe the bug
When trying to run:

runpod.create_pod("tortoise", "dillonbishop/ai-tts:latest", "NVIDIA GeForce RTX 3070",docker_args="bash -c 'apt update;DEBIAN_FRONTEND=noninteractive apt-get install openssh-server -y;mkdir -p ~/.ssh;cd $_;chmod 700 ~/.ssh;echo \"$PUBLIC_KEY\" >> authorized_keys;chmod 700 authorized_keys;service ssh start;docker pull dillonbishop/ai-tts:latest'")

I'm getting this error:

runpod.error.QueryError: Something went wrong. Please try again later or contact support.

Here's the full traceback:

Traceback (most recent call last):
  File "D:\PycharmProjects\aibigcontentcreation\main.py", line 599, in <module>
    wholething()
  File "D:\PycharmProjects\aibigcontentcreation\main.py", line 578, in wholething
    runpod.create_pod("tortoise", "dillonbishop/ai-tts:latest", "NVIDIA GeForce RTX 3070",docker_args="bash -c 'apt update;DEBIAN_FRONTEND=noninteractive apt-get install openssh-server -y;mkdir -p ~/.ssh;cd $_;chmod 700 ~/.ssh;echo "$PUBLIC_KEY" >> authorized_keys;chmod 700 authorized_keys;service ssh start;docker pull dillonbishop/ai-tts:latest'")
  File "C:\Users\Dillon's PC\AppData\Local\Programs\Python\Python310\lib\site-packages\runpod\api\ctl_commands.py", line 92, in create_pod
    raw_response = run_graphql_query(
  File "C:\Users\Dillon's PC\AppData\Local\Programs\Python\Python310\lib\site-packages\runpod\api\graphql.py", line 30, in run_graphql_query
    raise error.QueryError(response.json()["errors"][0]["message"])
runpod.error.QueryError: Something went wrong. Please try again later or contact support.

Is it possible to query a running serverless task for intermediate output?

For example, let's say I'm using RunPod serverless for training a GAN on MNIST digits.
Often during training like this, one would want to retrieve some visual data regarding the process, maybe a base64 image and other JSON data for performance metrics.
My actual use case involves potentially long-running inference with several epochs of refinement, but is something like this possible?
Best,
Tristan
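
A hedged sketch of what intermediate output could look like in a handler, assuming a progress-update helper like runpod.serverless.progress_update (present in newer SDK versions; check your version's docs):

import runpod

def train_handler(job):
    epochs = job["input"]["epochs"]
    for epoch in range(epochs):
        metrics = {"epoch": epoch, "loss": 0.0}  # placeholder for real training metrics
        # Push intermediate state; it can then be polled via the job's status.
        runpod.serverless.progress_update(job, f"epoch {epoch}: {metrics}")
    return {"status": "done"}

runpod.serverless.start({"handler": train_handler})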

API Worker

Implement API within a worker with exposed port to make fast job requests.

Support passing a template ID when starting a pod via createPod

Is your feature request related to a problem? Please describe.

I don't see a way to specify a custom template ID when starting a pod via createPod()

Describe the solution you'd like

Add a new parameter to createPod() which accepts a template ID and adds it to the graphql mutation.

input_fields.append(f'templateId: "{template_id}"')

Transparent Large Response Handling

Current behavior: if a JSON response is more than 2 MB, an error response is returned: "error":"json is invalid; max size is 2 MB"

Desired behavior: 2MB responses are handled seamlessly, either

  • server side change: allow >2MB files, or
  • runpod-python change: make transparent changes to runpod-python:
    • upload files larger than 2MB to s3 bucket on server side handler
    • provide json response indicating s3 bucket was used
    • on client-side, have output() download from s3 bucket.
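
A sketch of the server-side half of the runpod-python option, using boto3 and a hypothetical size check (this is not the library's actual behavior):

import json

import boto3

MAX_INLINE_BYTES = 2 * 1024 * 1024  # the 2 MB response limit

def wrap_large_output(output, bucket="my-bucket", key="results/output.json"):
    # Return the output inline if it fits; otherwise upload it to S3
    # and hand the client a presigned URL to download from.
    payload = json.dumps(output)
    if len(payload.encode()) <= MAX_INLINE_BYTES:
        return output
    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    url = s3.generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600
    )
    return {"s3_url": url}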

RunPod on Google Colab: it used up my Google Drive disk space.

  1. Followed the instructions in this blog: https://blog.runpod.io/how-to-connect-google-colab-to-runpod/
  2. Connected to the Jupyter Notebook.
  3. How do we create an SSH tunnel with port forwarding? Where are we supposed to create it: in a local cmd, the Jupyter terminal, or a different app altogether?
  4. How do I connect my RunPod to Google Colab so that it stops eating my Google Drive disk space, even after paying almost $4/hr to run my LLM models?

S3 image ContentType differs from image format

I am not sure whether this is a bug. If I understand correctly, you are saving the image in an arbitrary format, e.g. JPEG, PNG, etc.
But you are setting ContentType to a fixed image/png. Shouldn't the ContentType match the format saved?
If this is not a bug, can you tell me the logic behind this?

with Image.open(image_location) as img:
    output = BytesIO()
    img.save(output, format=img.format)
    output.seek(0)
    bucket = time.strftime('%m-%y')
    boto_client.put_object(
        Bucket=f'{bucket}',
        Key=f'{job_id}/{image_name}.png',
        Body=output.getvalue(),
        ContentType="image/png"
    )
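
For reference, a sketch of a format-aware variant (a hypothetical fix, reusing image_location, boto_client, job_id, and image_name from the snippet above):

from io import BytesIO
import time

from PIL import Image

# Map Pillow format names to MIME types and file extensions.
FORMAT_TO_MIME = {
    "JPEG": ("image/jpeg", "jpg"),
    "PNG": ("image/png", "png"),
}

with Image.open(image_location) as img:
    output = BytesIO()
    img.save(output, format=img.format)
    output.seek(0)
    bucket = time.strftime('%m-%y')
    content_type, ext = FORMAT_TO_MIME.get(img.format, ("application/octet-stream", "bin"))
    boto_client.put_object(
        Bucket=f'{bucket}',
        Key=f'{job_id}/{image_name}.{ext}',  # extension now matches the saved format
        Body=output.getvalue(),
        ContentType=content_type,
    )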

Handle 404 Gracefully on job done

Users will now be able to set the TTL for their jobs, so a situation could arise where a job times out while still in progress. When this occurs and the job is returned, the returned status will be 404.

Currently, this raises an exception; instead, it should be handled gracefully with a logged error and a note as to what occurred.
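
A hypothetical sketch of what graceful handling could look like, using a requests-style response object (not the SDK's actual code):

import logging

log = logging.getLogger("runpod")

def handle_job_return(response):
    # Treat a 404 on job return as "the job hit its TTL and expired".
    if response.status_code == 404:
        log.error("Job return received a 404; the job likely exceeded its TTL and expired.")
        return None
    response.raise_for_status()
    return response.json()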

Any support for resuming a spot-instance workload


create pods template features not supported

Describe the bug
templateId is supported in runpodctl: https://github.com/runpod/runpodctl/blob/main/doc/runpodctl_create_pod.md

To Reproduce
Steps to reproduce the behavior:

  1. pod = runpod.create_pod(name=myName,cloud_type="community",gpu_type_id="A100 80GB",gpu_count=1, templateId="2rg1vx16js")

Expected behavior
The API supports the same functions as the CLI.


More Debug Output to Higher Level

Describe the bug

When using rp_debugger, it currently appends the debug output to the job's output as a keyed dictionary item. This only works if the output from the job is a dict-type object. It will not work for outputs that are a single str, int, or bool.

A temporary fix will be to only add the debug output when the job output is a dict-type object; otherwise an error will be raised.
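
A small sketch of that temporary fix (illustrative only, not the library's code):

def attach_debug(job_result, debug_info):
    # Only attach the debugger output when the result is a dict;
    # plain str/int/bool outputs are returned untouched.
    if isinstance(job_result, dict):
        return {**job_result, "_debug": debug_info}
    return job_result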

CLI Feature Suite

Provide a better user experience through improved UI, allowing users to deploy endpoints with minimal configurations/clicks.

Since this focuses on the users' experience, we would like to start off at the same point users will: the docs. The first step will be to draft the CLI docs, providing examples and descriptions of how a user would make use of this (to-be-created) CLI functionality.

How to use a private Docker image? How to specify `container registry credentials` from the SDK

Describe the bug

When creating a pod using the SDK with a private Docker image, the pod is created but fails to pull the image. I get the following log and the pod retries infinitely. The web client shows that no container registry credentials are selected for that pod. When I manually edit the container registry credentials, the image pulls successfully. I expect some way to select this, or for the default option to be selected just as when using the web client to create a pod. Looking at the code for 5 minutes, this isn't obvious (maybe docker_args could do something, but I couldn't figure it out). Sorry if I missed something.

2023-10-10T02:23:26Z create 300GB volume
2023-10-10T02:23:26Z create container allganize/llm_train:cuda12_0.1
2023-10-10T02:23:27Z error pulling image: Error response from daemon: pull access denied for allganize/llm_train, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
2023-10-10T02:23:27Z error creating container: Error response from daemon: No such image: allganize/llm_train:cuda12_0.1
2023-10-10T02:23:42Z create container allganize/llm_train:cuda12_0.1
2023-10-10T02:23:43Z error pulling image: Error response from daemon: pull access denied for allganize/llm_train, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
2023-10-10T02:23:43Z error creating container: Error response from daemon: No such image: allganize/llm_train:cuda12_0.1
2023-10-10T02:23:58Z create container allganize/llm_train:cuda12_0.1
2023-10-10T02:23:59Z error pulling image: Error response from daemon: pull access denied for allganize/llm_train, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
...

Code for creation.

...
            pod = runpod.create_pod(
                name=args.name,
                image_name=args.image_name,
                gpu_type_id=args.gpu,
                gpu_count=args.gpu_count,
                cloud_type=args.cloud_type,
                volume_in_gb=args.volume_size,
                container_disk_in_gb=args.container_size,
                ports="80/http,29500/http",
                env={"HUGGING_FACE_HUB_TOKEN": hf_token},
            )


Report Version

Add a parameter to ping that will report the SDK version.

Generator support changes incorrect

Describe the bug
From the original PR I submitted, the code looked like:

else:
    job_result = run_job(config["handler"], job)

    # Check if job result is a generator
    if isinstance(job_result, types.GeneratorType):
        log.debug("Job result is a generator, streaming output.")
        for job_stream in job_result:
            await stream_result(session, job_stream, job)

        job_result = None

and now is https://github.com/runpod/runpod-python/blob/main/runpod/serverless/work_loop.py#L66-L72

Logically, this is incorrect: you can't test whether the handler function is a GeneratorType. You have to actually call the function, and only then can you determine whether the result of the call is a GeneratorType.
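
A small illustration of the distinction: the handler function object is never a GeneratorType; you can either inspect the function or call it and inspect the result.

import inspect
import types

def streaming_handler(job):
    yield "partial result"

# The function object itself is not a generator...
assert not isinstance(streaming_handler, types.GeneratorType)

# ...but inspect can tell that calling it would return one:
assert inspect.isgeneratorfunction(streaming_handler)

# ...and the result of actually calling it is one:
assert isinstance(streaming_handler({"input": {}}), types.GeneratorType)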


Progress Update

Add a function to allow status updates while a job is in progress.

Support OpenTelemetry

When RunPod handles a request, I want to trace it; OpenTelemetry support would be helpful.
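
A minimal sketch of tracing a handler with the OpenTelemetry Python API, assuming the opentelemetry-sdk package is installed (the RunPod SDK itself does not provide this today):

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("runpod.worker")

def traced_handler(job):
    # Each job becomes one span; attributes carry the job metadata.
    with tracer.start_as_current_span("handle_job") as span:
        span.set_attribute("job.id", job["id"])
        return {"ok": True}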

Minimal runpod python client

Hi,
I really like the service, and the Python package is slim and neat 🚀 Just regarding installation I have a request :)

Is your feature request related to a problem? Please describe.
The full runpod install requires Python 3.10 and multiple (large) dependencies. Both are problems in a constrained environment (e.g. AWS Lambda). It seems the api_wrapper doesn't use any of these features and could be standalone and slim.

Describe the solution you'd like
An install process which gives one the api_wrapper with minimal dependencies.

Signed URLs being created in mixed/wrong regions from rp_upload

Describe the bug
Having followed the steps here:
https://docs.runpod.io/docs/using-s3-to-upload-files-with-serverless

I've got a serverless template that's uploading files successfully to my s3 bucket:
https://crypts-test.s3.eu-west-2.amazonaws.com/

Unfortunately, all the URLs created to access these files contain an error: they are mixed-region, such as the one below:
https://crypts-test.s3.eu-west-2.amazonaws.com/09-23/patches/98ad184c.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5LZJBQWDDB5D6F27%2F20230911%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230911T112129Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=9fc3ce06538c4fb1a88a721ac890c0f74397d0226ed57c4aad036eb985cece27
You can see that at the start the bucket URL is correct and has eu-west-2 in it, but halfway through it says %2Fus-east-1%2F, which is causing an error. If you follow that link, it complains about region mismatches.

A correct link should look like this:
https://crypts-test.s3.amazonaws.com/WSI/110-B3-HP.ndpi_ROI__mpp0.44_reg000_crop_sk00000_%280%2C0%2C15855%2C15855%29.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5LZJBQWDDB5D6F27%2F20230911%2Feu-west-2%2Fs3%2Faws4_request&X-Amz-Date=20230911T121426Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=8187a969fca6122d02a380d954c165fd8e1d49be2c34e59eb7c4213e53fb7473

To Reproduce
Steps to reproduce the behavior:

  1. create an s3 bucket in eu-west-2 region
  2. create endpoint as in https://docs.runpod.io/docs/using-s3-to-upload-files-with-serverless
  3. get output
  4. click on link to see error message or not

Expected behavior
A link should be created that leads to the image file on S3 directly.

Screenshots
[Screenshot of the Amazon region-mismatch error]

Desktop (please complete the following information):

  • OS: windows
  • Browser chrome
  • Version latest

Additional context
None of the above links contain anything confidential, but by their nature they will expire. The actual text of the link contains the error too, but please let me know if you need me to generate any new examples.
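
A hedged sketch of one possible fix on the uploader side: pin the boto3 client to the bucket's region (and SigV4) so the presigned URL is signed for eu-west-2 rather than the default us-east-1:

import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    region_name="eu-west-2",
    endpoint_url="https://s3.eu-west-2.amazonaws.com",
    config=Config(signature_version="s3v4"),
)
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "crypts-test", "Key": "09-23/patches/98ad184c.png"},
    ExpiresIn=3600,
)
print(url)  # the credential scope should now contain eu-west-2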

s3Config Re-Name

Add support for objectPath and storageUrl as part of s3Config.

This will be to reduce confusion when supplying this information to an endpoint since there are two types of host URL formats.
Backward compatibility should be maintained.

How to combine private GPU and RunPod GPU?

I would like to deploy queue functions like the RunPod endpoint on private GPUs as well, aiming to save costs by combining private GPUs and RunPod GPUs.
Can this function be deployed in the repository?

Text-Generation-Inference Docker container not working with quantize parameter

When spinning up the Docker container with this code that uses the TGI Docker image:

    pod = runpod.create_pod(
        name="tiiuae/falcon-7b-instruct",
        image_name="ghcr.io/huggingface/text-generation-inference:1.0.0",
        gpu_type_id="NVIDIA GeForce RTX 3090",
        cloud_type="COMMUNITY",
        docker_args=docker_args, # SEE BELOW
        gpu_count=1,
        volume_in_gb=30,
        container_disk_in_gb=5,
        ports="80/http",
        volume_mount_path="/data",
    )

If docker_args includes the quantize flag (quantize is True), e.g.:
docker_args = "--model-id tiiuae/falcon-7b-instruct --num-shard 1 "

if quantize:
    docker_args += " --quantize bitsandbytes"

The container log spits out:

2023-08-10T11:30:29.101220272-06:00 /opt/conda/lib/python3.9/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable. 

From HF support (Nicolas Patry): "This means for whatever reason the pod you're using cannot see the GPU. There are issues opened for that directly at runpod if I'm not mistakened. But we've seen people having issues with runpod before. Something about shm not being properly set or something."

From Nicolas/HF perspective, the bug is in how the pod is set up. He believes it has to do with not properly setting up shared memory?

To Reproduce

  • spin up a pod with the above create_pod settings and docker_args set to
    docker_args = "--model-id tiiuae/falcon-7b-instruct --num-shard 1 --quantize bitsandbytes"

if you spin up without --quantize:
docker_args = "--model-id tiiuae/falcon-7b-instruct --num-shard 1"

the container will load correctly. Yet the feature that minimizes memory use, which would reduce cost and inference time, is not being used.

Expected behavior
The container will use less memory and the inference times will be shorter.

Thank you.

Prepare for 1.0.0 Release

  • Consider alternative names for handler_fully_utilized
  • Check on the implementation of job_in_progress
  • Document new functionality (multi-job and streaming)
  • Close outstanding issues
  • Merge/Close PRs
  • Backwards compatibility down to 3.8

Test Validation:

  • Deploy dev versions of endpoints with new worker

Bonus:

  • Social Media Card Image

Listen to multiple requests when running locally

Is your feature request related to a problem? Please describe.
I want to test the serverless function locally from the web app I'm developing, and if I've read the docs correctly, this is not possible given how it works at the moment. The problem is that the only supported input is either a file or direct input from the CLI. I would like to be able to send one or more POST requests to the serverless function without it terminating.

There should really be support for listening to incoming HTTP requests indefinitely; otherwise it's not really possible to develop web applications locally in an efficient manner.

Additional context
Maybe this is possible to do somehow but then the docs are not very clear about it in my opinion.

runpod.start_pod(pod.id) doesn't work

After inserting the pod.id, the runpod.start_pod command doesn't work. There is an attribute error that says "AttributeError: module 'runpod' has no attribute 'start_pod'". There was no problem with runpod.stop_pod(pod.id) and runpod.terminate_pod(pod.id), just the start_pod function.
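
For reference, the README above documents resume_pod rather than start_pod:

import runpod

# resume_pod is the documented call; start_pod does not exist in the module.
runpod.resume_pod(pod.id)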

Developer Experience Improvements

Development template with a web server and SSH setup for users to create a worker pod.

  • Simulate the aiapi functionality.
  • Template repo.
  • POST/GET

Heartbeat Refactor

Rename for consistency: heartbeat.py → rp_ping.py

Proposed changes:

  • Move ping URL constructor completely out of worker_state and into rp_ping

We can leave worker_state as the source where jobs currently in progress will be stored for now.

Runpod limits for serverless

Hi,
I am currently unable to run a Docker image (that runs nicely here) on the RunPod serverless infrastructure... I keep getting a timeout error.
I am thinking I am hitting some limit, so my question: what are the RunPod limits for serverless runs? I.e. maximum "startup" time, maximum memory, maximum disk, etc.

Thanks!
Luigi

bug: KeyError 'input'

I've noticed recently that this error occurs within the library and thought I'd open an issue. It seems that input sometimes does not exist in job, specifically in old jobs that have been in the queue for a while. Therefore an exception occurs on this line:
https://github.com/runpod/runpod-python/blob/main/runpod/serverless/work_loop.py#L44

> if job["input"] is None:
>     log.error(f"Job {job['id']} has no input parameter provided. Skipping this job.")
>     continue

The reason I opened an issue before a pull request is because input should always exist, even with old jobs, so I would like your input on this issue.

Thanks,
Dan
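
For illustration, a one-line defensive variant of the quoted check (not the merged fix):

# Inside the job loop; .get() avoids the KeyError when "input" is absent.
if job.get("input") is None:
    log.error(f"Job {job['id']} has no input parameter provided. Skipping this job.")
    continue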

Add upload progress

import boto3
import os
import time

bucket_name = 'my-bucket'
file_path = '/path/to/my/file'

file_size = os.path.getsize(file_path)
bytes_uploaded = 0

start_time = time.time()

def print_progress(chunk_size):
    # boto3 calls the Callback with the number of bytes transferred in
    # each chunk, not a running total, so accumulate it here.
    global bytes_uploaded
    bytes_uploaded += chunk_size
    upload_speed = bytes_uploaded / (time.time() - start_time)
    print(f"Uploaded {bytes_uploaded} bytes out of {file_size}. Upload speed: {upload_speed:.2f} bytes/s")

s3 = boto3.client('s3')

s3.upload_file(file_path, bucket_name, 'my-object-key', Callback=print_progress)

The custom template container disk size is overwritten by python default code

Describe the bug
When a custom template ID is provided, the Python code will use the default value:

container_disk_in_gb: int = 5

instead of using the value in the custom template.

A workaround is to do a GraphQL query to get the template's containerDiskInGb field, then pass that to runpod.create_pod().

    def get_custom_template(self, template_name: str):
        # Call the "me" API and ask for the list of templates
        QUERY_POD = """
            query myTemplates {
                myself {
                    podTemplates {
                        id
                        name
                        imageName
                        containerDiskInGb
                    }
                }
            }
            """
        result = run_graphql_query(QUERY_POD)
        pod_templates = result["data"]["myself"]["podTemplates"]
        for pod_template in pod_templates:
            if pod_template["name"] == template_name:
                return {
                    "containerDiskInGb": pod_template["containerDiskInGb"],
                    "id": pod_template["id"],
                    "imageName": pod_template["imageName"],
                }

        raise Exception("Could not find pod template id for template name: " + template_name)

Expected behavior
When a template ID is passed, use the container disk size specified in the template.

worker-utils.md - document doesn't exist

Describe the bug
I'm using rp_upload, and in the logs I see this when the BUCKET_ENDPOINT_URL is not configured:

2023-10-11T13:43:04.926575928Z If this is a live endpoint, please reference the following:
2023-10-11T13:43:04.926590519Z https://github.com/runpod/runpod-python/blob/main/docs/serverless/worker-utils.md

To Reproduce
Steps to reproduce the behavior:

  1. Install runpod
  2. Use rp_upload in your handler
  3. Deploy the image to runpod
  4. Use the run endpoint and see the message in the logs

Expected behavior
Link to the correct document.

Expanded error reporting

Provide additional information to assist with errors and debugging.

{
    "error": {
        "message": "Bad stuff happened",
        "podId": 123,
        "hostName": "sad-server"
    }
}

[1.1.3] Docs outdated / not according to actual pip package

According to the docs these commands are supported:

Get all my pods

pods = runpod.get_pods()

Get a specific pod

pod = runpod.get_pod(pod.id)

However:

>>> runpod.get_pods()
{'errors': [{'message': 'Something went wrong. Please try again later or contact support.', 'locations': [{'line': 2, 'column': 15}], 'extensions': {'code': 'GRAPHQL_PARSE_FAILED'}}]}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/korny/micromamba/envs/autoscaler-consul/lib/python3.11/site-packages/runpod/api_wrapper/ctl_commands.py", line 37, in get_pods
    raw_return = run_graphql_query(pod_queries.QUERY_POD)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/korny/micromamba/envs/autoscaler-consul/lib/python3.11/site-packages/runpod/api_wrapper/graphql.py", line 32, in run_graphql_query
    raise error.QueryError(response.json()["errors"][0]["message"])
runpod.error.QueryError: Something went wrong. Please try again later or contact support.


>>> runpod.get_pod("myid")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'runpod' has no attribute 'get_pod'. Did you mean: 'get_pods'?
