pi314ever / tei-gaudi

This project forked from huggingface/tei-gaudi

A blazing fast inference solution for text embeddings models

Home Page: https://huggingface.co/docs/text-embeddings-inference/quick_tour

License: Apache License 2.0

Shell 0.55% JavaScript 1.92% Python 4.82% Rust 91.83% Makefile 0.20% Dockerfile 0.68%

tei-gaudi's Introduction

Text Embeddings Inference on Habana Gaudi

To use 🤗 text-embeddings-inference on Habana Gaudi/Gaudi2, follow these steps:

  1. Pull the official Docker image with:
    docker pull ghcr.io/huggingface/tei-gaudi:latest

Note

Alternatively, you can build the Docker image using Dockerfile-hpu located in this folder with:

docker build -f Dockerfile-hpu -t tei_gaudi .
  2. Launch a local server instance on 1 Gaudi card:
    model=BAAI/bge-large-en-v1.5
    volume="$PWD/data" # share a volume with the Docker container to avoid downloading weights every run
    
    docker run -p 8080:80 -v "$volume:/data" --runtime=habana \
        -e HABANA_VISIBLE_DEVICES=all \
        -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
        -e MAX_WARMUP_SEQUENCE_LENGTH=512 \
        --cap-add=sys_nice --ipc=host \
        ghcr.io/huggingface/tei-gaudi:latest --model-id "$model" --pooling cls
  3. You can then send a request:
     curl 127.0.0.1:8080/embed \
         -X POST \
         -d '{"inputs":"What is Deep Learning?"}' \
         -H 'Content-Type: application/json'
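
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server from the steps above is reachable at 127.0.0.1:8080. The helper names `build_embed_request` and `embed` are hypothetical and not part of TEI. The response is assumed to be a JSON list with one embedding vector per input, so check the TEI documentation for the exact schema.

```python
import json
import urllib.request

# Assumption: the container started above is listening on this address.
BASE_URL = "http://127.0.0.1:8080"


def build_embed_request(inputs, base_url=BASE_URL):
    """Build the same POST /embed request as the curl command (hypothetical helper)."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(inputs, base_url=BASE_URL, timeout=30.0):
    """Send the request and return the decoded JSON body (hypothetical helper)."""
    request = build_embed_request(inputs, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires the running server):
#   vectors = embed("What is Deep Learning?")
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and payload without a running Gaudi server.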

For more information and documentation about Text Embeddings Inference, check out the README of the original repo.

Not all features of TEI are currently supported as this is still a work in progress.

Validated Models

Architecture (Model Type): Models
BERT (Embedding):
  • BAAI/bge-large-en-v1.5
  • sentence-transformers/all-MiniLM-L6-v2
  • sentence-transformers/all-MiniLM-L12-v2
  • sentence-transformers/multi-qa-MiniLM-L6-cos-v1
  • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • sentence-transformers/paraphrase-MiniLM-L3-v2
MPNet (Embedding):
  • sentence-transformers/all-mpnet-base-v2
  • sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • sentence-transformers/multi-qa-mpnet-base-dot-v1
ALBERT (Embedding):
  • sentence-transformers/paraphrase-albert-small-v2
Mistral (Embedding):
  • intfloat/e5-mistral-7b-instruct
  • Salesforce/SFR-Embedding-2_R
GTE (Embedding):
  • Alibaba-NLP/gte-large-en-v1.5
JinaBERT (Embedding):
  • jinaai/jina-embeddings-v2-base-en

TEI on Habana Gaudi is covered by the TEI license: https://github.com/huggingface/text-embeddings-inference/blob/main/LICENSE

Please reach out to [email protected] if you have any questions.
