Coder Social home page Coder Social logo

rag-llm-gitops's Introduction

Document Generation Demo with LLM and RAG

Introduction

This deployment is based on validated pattern framework that uses GitOps to easily provision all operators and apps. It deploys a Chatbot application that leverages the power of Large Language Models (LLMs) in conjunction with the Retrieval-Augmented Generation (RAG) framework running on Red Hat OpenShift to generate a project proposal for a given Red Hat product.

Pre-requisites

  • Podman
  • Red Hat Openshift cluster
  • GPU Node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster.
  • Create a fork of the rag-llm-gitops git repository.

Demo Description & Architecture

The goal of this demo is to demonstrate a Chatbot LLM application augmented with data from Red Hat product documentation running on Red Hat OpenShift. It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM. The application generates a project proposal for a Red Hat product

Key Features

  • LLM Application augmented with content from Red Hat product documentation.
  • Multiple LLM providers (OpenAI, Hugging Face, NVIDIA)
  • Redis Vector Database to store embeddings of RedHat product documentation.
  • Monitoring dashboard to provide key metrics such as ratings
  • GitOps setup to deploy e2e demo (frontend / vector database / served models)

Diagram

Components deployed

  • Hugging Face Text Generation Inference Server: The pattern deploys a Hugging Face TGIS server. The server deploys meta-llama/Llama-2-7b-chat-hf model. The server will require a GPU node.
  • Redis Server: A Redis Server is deployed to store vector embeddings created from Red Hat product documentation.
  • Populate VectorDb Job: The job creates the embeddings and populates the vector database (Redis).
  • LLM Application: This is a Chatbot application that can generate a project proposal by augmenting the LLM with the Red Hat product documentation stored in vector db.
  • Prometheus: Deploys a prometheus instance to store the various metrics from the LLM application and TGIS server.
  • Grafana: Deploys Grafana application to visualize the metrics.

Deploying the demo

Following commands will take about 15-20 minutes

Validated pattern will be deployed

git clone https://github.com/<<your-username>>/rag-llm-gitops.git
cd rag-llm-gitops
oc login --token=<> --server=<> # login to Openshift cluster
podman machine start
# Copy values-secret.yaml.template to ~/values-secret-rag-llm-gitops.yaml.
# You should never check these files
# Add secrets to the values-secret.yaml that needs to be added to the vault.
cp values-secret.yaml.template ~/values-secret-rag-llm-gitops.yaml
./pattern.sh make install

1: Verify the installation

  • Login to the OpenShift web console.
  • Navigate to the Workloads --> Pods.
  • Select the rag-llm project from the drop down.
  • Following pods should be up and running.

Pods

Note: If the hf-text-generation-server is not running, make sure you have followed the steps to configure a node with GPU from the instructions provided above.

2: Launch the application

  • Navigate to routes, Networking --> Routes

    Routes

  • Click on the Location link and it should launch the application

    Application

3: Generate the proposal document

  • It will use the default provider and model configured as part of the application deployment. The default provider is a Hugging Face model server running in the OpenShift. The model server is deployed with this valdiated pattern and requires a node with GPU.

  • Enter any company name

  • Enter the product as RedHat OpenShift

  • Click the Generate button, a project proposal should be generated. The project proposal also contains the reference of the RAG content. The project proposal document can be Downloaded in the form of a PDF document.

    Routes

4: Add an OpenAI provider

You can optionally add additional providers. The application supports the following providers

  • Hugging Face Text Generation Inference Server
  • OpenAI
  • NVIDIA

Click on the Add Provider tab to add a new provider. Fill in the details and click Add Provider button. The provider should be added in the Providers dropdown uder Chatbot tab.

Routes

5: Generate the proposal document using OpenAI provider

Follow the instructions in step 3 to generate the proposal document using the OpenAI provider.

Routes

6: Rating the provider

You can provide rating to the model by clicking on the Rate the model radio button. The rating will be captured as part of the metrics and can help the company which model to deploy in prodcution.

7: Grafana Dashboard

By default, Grafana application is deployed in llm-monitoring namespace.To launch the Grafana Dashboard, follow the instructions below:

  • Grab the credentials of Grafana Application
    • Navigate to Workloads --> Secrets
    • Click on the grafana-admin-credentials and copy the GF_SECURITY_ADMIN_USER, GF_SECURITY_ADMIN_PASSWORD
  • Launch Grafana Dashboard
    • Navigate to Networking --> Routes in the llm-monitoring namespace.
    • Click on the Location link for grafana-route.
    • Enter the Grafana admin credentials.
    • Ratings are displayed for each model.

Routes

rag-llm-gitops's People

Contributors

mbaldessari avatar beekhof avatar mhjacks avatar claudiol avatar day0hero avatar sauagarwa avatar ruromero avatar dependabot[bot] avatar stocky37 avatar yukinchan avatar chadmf avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.