Coder Social home page Coder Social logo

networkninja / rag_api Goto Github PK

View Code? Open in Web Editor NEW

This project forked from danny-avila/rag_api

0.0 0.0 1.0 62 KB

ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector

Home Page: https://librechat.ai/

Python 97.49% Dockerfile 0.63% Shell 1.88%

rag_api's Introduction

ID-based RAG FastAPI

Overview

This project integrates Langchain with FastAPI in an Asynchronous, Scalable manner, providing a framework for document indexing and retrieval, using PostgreSQL/pgvector.

Files are organized into embeddings by file_id. The primary use case is for integration with LibreChat, but this simple API can be used for any ID-based use case.

The main reason to use the ID approach is to work with embeddings on a file-level. This makes for targeted queries when combined with file metadata stored in a database, such as is done by LibreChat.

The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.

Features

  • Document Management: Methods for adding, retrieving, and deleting documents.
  • Vector Store: Utilizes Langchain's vector store for efficient document retrieval.
  • Asynchronous Support: Offers async operations for enhanced performance.

Setup

Getting Started

  • Configure .env file based on section below
  • Setup pgvector database:
    • Run an existing PSQL/PGVector setup, or,
    • Docker: docker compose up (also starts RAG API)
      • or, use docker just for DB: docker compose -f ./db-compose.yaml up
  • Run API:
    • Docker: docker compose up (also starts PSQL/pgvector)
      • or, use docker just for RAG API: docker compose -f ./api-compose.yaml up
    • Local:
      • Make sure to setup DB_HOST to the correct database hostname
      • Run the following commands (preferably in a virtual environment)
pip install -r requirements.txt
uvicorn main:app

Environment Variables

The following environment variables are required to run the application:

  • RAG_OPENAI_API_KEY: The API key for OpenAI API Embeddings (if using default settings).

    • Note: OPENAI_API_KEY will work but RAG_OPENAI_API_KEY will override it in order to not conflict with LibreChat setting.
  • RAG_OPENAI_BASEURL: (Optional) The base URL for your OpenAI API Embeddings

  • RAG_OPENAI_PROXY: (Optional) Proxy for OpenAI API Embeddings

  • VECTOR_DB_TYPE: (Optional) select vector database type, default to pgvector.

  • POSTGRES_DB: (Optional) The name of the PostgreSQL database, used when VECTOR_DB_TYPE=pgvector.

  • POSTGRES_USER: (Optional) The username for connecting to the PostgreSQL database.

  • POSTGRES_PASSWORD: (Optional) The password for connecting to the PostgreSQL database.

  • DB_HOST: (Optional) The hostname or IP address of the PostgreSQL database server.

  • DB_PORT: (Optional) The port number of the PostgreSQL database server.

  • RAG_HOST: (Optional) The hostname or IP address where the API server will run. Defaults to "0.0.0.0"

  • RAG_PORT: (Optional) The port number where the API server will run. Defaults to port 8000.

  • JWT_SECRET: (Optional) The secret key used for verifying JWT tokens for requests.

    • The secret is only used for verification. This basic approach assumes a signed JWT from elsewhere.
    • Omit to run API without requiring authentication
  • COLLECTION_NAME: (Optional) The name of the collection in the vector store. Default value is "testcollection".

  • CHUNK_SIZE: (Optional) The size of the chunks for text processing. Default value is "1500".

  • CHUNK_OVERLAP: (Optional) The overlap between chunks during text processing. Default value is "100".

  • RAG_UPLOAD_DIR: (Optional) The directory where uploaded files are stored. Default value is "./uploads/".

  • PDF_EXTRACT_IMAGES: (Optional) A boolean value indicating whether to extract images from PDF files. Default value is "False".

  • DEBUG_RAG_API: (Optional) Set to "True" to show more verbose logging output in the server console, and to enable postgresql database routes

  • CONSOLE_JSON: (Optional) Set to "True" to log as json for Cloud Logging aggregations

  • EMBEDDINGS_PROVIDER: (Optional) either "openai", "azure", "huggingface", "huggingfacetei" or "ollama", where "huggingface" uses sentence_transformers; defaults to "openai"

  • EMBEDDINGS_MODEL: (Optional) Set a valid embeddings model to use from the configured provider.

    • Defaults
    • openai: "text-embedding-3-small"
    • azure: "text-embedding-3-small" (will be used as your Azure Deployment)
    • huggingface: "sentence-transformers/all-MiniLM-L6-v2"
    • huggingfacetei: "http://huggingfacetei:3000". Hugging Face TEI uses model defined on TEI service launch.
    • ollama: "nomic-embed-text"
  • RAG_AZURE_OPENAI_API_VERSION: (Optional) Default is 2023-05-15. The version of the Azure OpenAI API.

  • RAG_AZURE_OPENAI_API_KEY: (Optional) The API key for Azure OpenAI service.

    • Note: AZURE_OPENAI_API_KEY will work but RAG_AZURE_OPENAI_API_KEY will override it in order to not conflict with LibreChat setting.
  • RAG_AZURE_OPENAI_ENDPOINT: (Optional) The endpoint URL for Azure OpenAI service, including the resource.

    • Example: https://YOUR_RESOURCE_NAME.openai.azure.com.
    • Note: AZURE_OPENAI_ENDPOINT will work but RAG_AZURE_OPENAI_ENDPOINT will override it in order to not conflict with LibreChat setting.
  • HF_TOKEN: (Optional) if needed for huggingface option.

  • OLLAMA_BASE_URL: (Optional) defaults to http://ollama:11434.

Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.

Use Atlas MongoDB as Vector Database

Instead of using the default pgvector, we could use Atlas MongoDB as the vector database. To do so, set the following environment variables

VECTOR_DB_TYPE=atlas-mongo
ATLAS_MONGO_DB_URI=<mongodb+srv://...>
MONGO_VECTOR_COLLECTION=<collection name>

The ATLAS_MONGO_DB_URI could be the same or different from what is used by LibreChat. Even if it is the same, the $MONGO_VECTOR_COLLECTION collection needs to be a completely new one, separate from all collections used by LibreChat. In additional, create a vector search index for $MONGO_VECTOR_COLLECTION with the following json:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "file_id",
      "type": "filter"
    }
  ]
}

Follw one of the four documented methods to create the vector index.

Cloud Installation Settings:

AWS:

Make sure your RDS Postgres instance adheres to this requirement:

The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.

In order to setup RDS Postgres with RAG API, you can follow these steps:

  • Create a RDS Instance/Cluster using the provided AWS Documentation.

  • Login to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.

  • The login is via the Master User.

  • Create a dedicated database for rag_api: create database rag_api;.

  • Create a dedicated user\role for that database: create role rag;

  • Switch to the database you just created: \c rag_api

  • Enable the Vector extension: create extension vector;

  • Use the documentation provided above to set up the connection string to the RDS Postgres Instance\Cluster.

Notes:

  • Even though you're logging with a Master user, it doesn't have all the super user privileges, that's why we cannot use the command: create role x with superuser;
  • If you do not enable the extension, rag_api service will throw an error that it cannot create the extension due to the note above.

Dev notes:

Installing pre-commit formatter

Run the following commands to install pre-commit formatter, which uses black code formatter:

pip install pre-commit
pre-commit install

rag_api's People

Contributors

danny-avila avatar bnord01 avatar fakamoto avatar jinzishuai avatar denispalnitsky avatar idachev avatar bsu3338 avatar tip-dteller avatar

Forkers

wmjg-alt

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.