helm's Introduction

Substratus AI Helm Charts

A collection of exquisitely crafted helm charts for LLMs:

  • vLLM
  • Text Generation Inference
  • Lingo

Enable Substratus Helm repo

helm repo add substratusai https://substratusai.github.io/helm
helm repo update

Usage guides

vLLM

Basic usage:

# Note: by default, the resource limit is set to 1 GPU.
helm install mistral-7b-instruct substratusai/vllm \
  --set model=mistralai/Mistral-7B-Instruct-v0.1

For advanced usage, see the vLLM Chart Guide.

helm's People

Contributors

samos123, nstogner

helm's Issues

Allow for multiple models

Example values.yaml:

models:
  llama-7b: # The "model" that is referenced in the OpenAI-like API
    enabled: true
    source:
      huggingface: meta-llama/Llama-2-7b
      # gcs: ... # future
    resources:
      nvidia.com/gpu: 2
  opt-350m:
    enabled: true
    source:
      huggingface: facebook/opt-350m
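
If this proposal were implemented, a client would presumably select a model by its key under `models:`. The sketch below only builds such a request body; the `API_URL` address is an assumption (for example a local port-forward), and `/v1/completions` is the completions route of an OpenAI-compatible server, not something this issue specifies.

```shell
# ASSUMPTION: hypothetical address of the OpenAI-like API (e.g. a port-forward).
API_URL="${API_URL:-http://localhost:8080}"

# The key under `models:` in the proposed values.yaml.
MODEL="llama-7b"

# Build the JSON request body that names the model.
payload=$(printf '{"model": "%s", "prompt": "Hello", "max_tokens": 16}' "$MODEL")
echo "$payload"

# Uncomment once a server is actually reachable at API_URL:
# curl -s "$API_URL/v1/completions" -H "Content-Type: application/json" -d "$payload"
```

Switching to the second model would then only mean setting `MODEL="opt-350m"`.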

404 on getting repo

helm repo add substratusai https://substratusai.github.io/helm
helm repo update
...Unable to get an update from the "substratusai" chart repository (https://substratusai.github.io/helm):
	failed to fetch https://substratusai.github.io/helm/index.yaml : 404 Not Found

vLLM chart not compatible with GKE autopilot out of the box

Perhaps we should add instructions for installing on GKE Autopilot. The install below is rejected by the Autopilot admission webhook:

helm upgrade --install mistral-7b-instruct substratusai/vllm -f - << EOF
model: mistralai/Mistral-7B-Instruct-v0.1
replicaCount: 0
env:
- name: SERVED_MODEL_NAME
  value: mistral-7b-instruct-v0.1 # needs to be same as lingo model name
deploymentAnnotations:
  lingo.substratus.ai/models: mistral-7b-instruct-v0.1
  lingo.substratus.ai/min-replicas: "0" # needs to be string
  lingo.substratus.ai/max-replicas: "3" # needs to be string
EOF
Release "mistral-7b-instruct" does not exist. Installing it now.
W1118 19:41:03.659013   45860 warnings.go:70] autopilot-default-resources-mutator:Autopilot updated Deployment default/mistral-7b-instruct-vllm: adjusted resources to meet requirements for containers [vllm] (see http://g.co/gke/autopilot-resources)
Error: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-gpu-limitation]":["You must specify a GPU type with node selector 'cloud.google.com/gke-accelerator' when GPU is requested on Autopilot workloads; supported values are: [nvidia-a100-80gb, nvidia-tesla-a100, nvidia-tesla-t4]."]}
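
The violation message asks for a `cloud.google.com/gke-accelerator` node selector. A possible workaround, sketched below, is an extra values file that sets one. This assumes the chart exposes a top-level `nodeSelector` value (not confirmed in this issue); the file name `autopilot-values.yaml` and the chosen GPU type are illustrative, with the GPU type taken from the supported values listed in the error.

```shell
# Write an extra values file that pins the GPU type for Autopilot.
# ASSUMPTION: the chart passes a top-level `nodeSelector` through to the pod spec.
cat > autopilot-values.yaml << 'EOF'
nodeSelector:
  # One of the supported values from the error message; choose one that fits the model.
  cloud.google.com/gke-accelerator: nvidia-tesla-a100
EOF

cat autopilot-values.yaml

# Then layer it on top of the original values (requires cluster access):
# helm upgrade --install mistral-7b-instruct substratusai/vllm \
#   --set model=mistralai/Mistral-7B-Instruct-v0.1 \
#   -f autopilot-values.yaml
```

Keeping the selector in its own file leaves the Lingo-related values above unchanged and makes the Autopilot-specific part easy to drop on other clusters.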
