ibl-ai-neural-startup

This project presents several ways to run large language models (LLMs) in production.

Note

OpenLLM with GPU support is the preferred way to run LLMs when high throughput is required.

OpenLLM

OpenLLM exposes an intuitive API on top of backends such as vLLM and ctranslate. It scales well and integrates with LangChain through LangChain's OpenLLM LLM class.

Three approaches to deploying with OpenLLM are presented in the openllm directory.

  1. deploy.sh: Runs OpenLLM directly on the host device. The script checks that python3 and the openllm package are installed. You must always pass a model's repo_id from Hugging Face as the input parameter.
  2. docker-deploy-no-gpu.py: Runs an OpenLLM model in a container without GPU support.
  3. docker-deploy-gpu.py: Runs OpenLLM in Docker with GPU support. The NVIDIA driver must be configured on the AWS instance (nvidia-smi can be used to verify it).
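
The checks described for deploy.sh can be sketched in Python as follows. This is only an illustration, not the contents of the script: the function names are hypothetical, and the `openllm start <repo_id>` command form is an assumption that matches some OpenLLM releases, so check the CLI of the installed version.

```python
import importlib.util
import shutil


def missing_requirements():
    """Return the names of prerequisites that are not available on this host."""
    missing = []
    # python3 must be on PATH.
    if shutil.which("python3") is None:
        missing.append("python3")
    # The openllm package must be importable.
    if importlib.util.find_spec("openllm") is None:
        missing.append("openllm")
    return missing


def build_start_command(repo_id):
    """Build the launch command for a Hugging Face repo_id.

    Assumption: the CLI accepts `openllm start <repo_id>`.
    """
    if not repo_id or not repo_id.strip():
        raise ValueError("a Hugging Face repo_id is required")
    return ["openllm", "start", repo_id.strip()]


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing prerequisites:", ", ".join(problems))
    else:
        print("Would run:", " ".join(build_start_command("facebook/opt-1.3b")))
```

The sketch only builds the command and reports what is missing; it does not start a server.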

api

The api module provides a small FastAPI app that exposes endpoints for inference. It has only minimal features.
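
As a rough illustration of how a client might call such an inference endpoint, the sketch below builds a JSON POST request with the standard library. The endpoint path `/generate`, the `prompt` field, and the function name are hypothetical and not taken from the app; check the routes defined in the api module for the real names.

```python
import json
import urllib.request


def build_inference_request(base_url, prompt, path="/generate"):
    """Build (but do not send) a JSON POST request for an inference endpoint.

    Assumption: the path and the payload shape are placeholders.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_inference_request("http://localhost:3000", "Hello")
    print(req.get_method(), req.full_url)
    # To send it against a running server:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```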

usage

You can either run the project directly

cd api
pip install -r requirements.txt
uvicorn app:app

Or run with docker using

cd api
docker build . -t api
docker run --rm -d --gpus all -p 3000:3000 api

You may run without GPUs, but at a performance penalty.
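
The difference between the GPU and non-GPU invocations comes down to the `--gpus all` flag. A minimal sketch, assuming the image name `api` and port 3000 from the commands above; the helper name and its parameters are hypothetical:

```python
def docker_run_args(image="api", port=3000, use_gpu=True):
    """Return the docker run argument list, with or without GPU access."""
    args = ["docker", "run", "--rm", "-d"]
    if use_gpu:
        # Expose all host GPUs to the container.
        args += ["--gpus", "all"]
    # Publish the container port on the same host port.
    args += ["-p", f"{port}:{port}", image]
    return args


if __name__ == "__main__":
    print(" ".join(docker_run_args()))
    print(" ".join(docker_run_args(use_gpu=False)))
```

The returned list can be passed to `subprocess.run` once the image has been built.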

Contributors

joetib
