License: Apache License 2.0

helm's Introduction

Substratus AI Helm Charts

A collection of exquisitely crafted Helm charts for LLMs:

vLLM
Text Generation Inference
Lingo

Enable Substratus Helm repo

helm repo add substratusai https://substratusai.github.io/helm
helm repo update

Usage guides

vLLM

Basic usage:

# Note: by default, the resource limit is set to 1 GPU
helm install mistral-7b-instruct substratusai/vllm \
  --set model=mistralai/Mistral-7B-Instruct-v0.1

For advanced usage, see the vLLM Chart Guide.


helm's Issues

Allow for multiple models

Example values.yaml:

models:
  llama-7b: # The "model" that is referenced in the OpenAI-like API
    enabled: true
    source:
      huggingface: meta-llama/Llama-2-7b
      # gcs: ... # future
    resources:
      nvidia.com/gpu: 2
  opt-350m:
    enabled: true
    source:
      huggingface: facebook/opt-350m

404 on getting repo

helm repo add substratusai https://substratusai.github.io/helm
helm repo update

...Unable to get an update from the "substratusai" chart repository (https://substratusai.github.io/helm):
	failed to fetch https://substratusai.github.io/helm/index.yaml : 404 Not Found
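
The error above shows that Helm resolves a repository by requesting index.yaml under the repo URL. A minimal sketch for checking that file directly, assuming curl is available; the helper name index_url is hypothetical and only used for illustration:

```shell
# Build the index URL that Helm requests for a chart repository.
# index_url is a hypothetical helper, not part of Helm.
index_url() {
  # Strip one trailing slash, if present, then append index.yaml
  printf '%s/index.yaml\n' "${1%/}"
}

url="$(index_url https://substratusai.github.io/helm)"
echo "$url"

# Print only the HTTP status code (needs network access).
# 200 suggests the index is published; 404 reproduces the error above.
# curl -s -o /dev/null -w '%{http_code}\n' "$url"
```

If the status is 404, the problem is likely on the publishing side (for example, the index has not been generated or deployed) rather than in the local Helm configuration.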

Publish the Helm charts to Artifact Hub

https://artifacthub.io/

vLLM chart not compatible with GKE Autopilot out of the box

Perhaps we should add instructions for installing on GKE Autopilot. The following install is rejected:

helm upgrade --install mistral-7b-instruct substratusai/vllm -f - << EOF
model: mistralai/Mistral-7B-Instruct-v0.1
replicaCount: 0
env:
- name: SERVED_MODEL_NAME
  value: mistral-7b-instruct-v0.1 # needs to be same as lingo model name
deploymentAnnotations:
  lingo.substratus.ai/models: mistral-7b-instruct-v0.1
  lingo.substratus.ai/min-replicas: "0" # needs to be string
  lingo.substratus.ai/max-replicas: "3" # needs to be string
EOF
Release "mistral-7b-instruct" does not exist. Installing it now.
W1118 19:41:03.659013   45860 warnings.go:70] autopilot-default-resources-mutator:Autopilot updated Deployment default/mistral-7b-instruct-vllm: adjusted resources to meet requirements for containers [vllm] (see http://g.co/gke/autopilot-resources)
Error: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-gpu-limitation]":["You must specify a GPU type with node selector 'cloud.google.com/gke-accelerator' when GPU is requested on Autopilot workloads; supported values are: [nvidia-a100-80gb, nvidia-tesla-a100, nvidia-tesla-t4]."]}
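
The violation message asks for a node selector on 'cloud.google.com/gke-accelerator'. One possible workaround, sketched below, is to pass that selector through the chart values. This assumes the vLLM chart exposes a standard top-level nodeSelector value, which this page does not confirm; the file name autopilot-values.yaml is arbitrary, and the accelerator shown is just one of the values listed in the error.

```shell
# Write Autopilot-specific overrides to a values file.
cat > autopilot-values.yaml << 'EOF'
model: mistralai/Mistral-7B-Instruct-v0.1
# ASSUMPTION: the chart supports a top-level `nodeSelector` value.
# Pick any accelerator type listed in the Warden error that fits the model.
nodeSelector:
  cloud.google.com/gke-accelerator: nvidia-tesla-a100
EOF

# Install with the overrides (requires access to the cluster):
# helm upgrade --install mistral-7b-instruct substratusai/vllm -f autopilot-values.yaml
```

Check the chart's values.yaml before relying on this; if nodeSelector is not supported, the chart itself would need to change.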
