RAG with Streaming LLM

This hands-on workshop, aimed at developers and solution builders, shows how to use LLMs for RAG (Retrieval-Augmented Generation).

In this solution, we:

- use kNN search in Amazon OpenSearch Service (AOS) to find solutions to related problems;
- use an LLM to analyze all related content and render the response as it streams from the LLM.

Overview

This solution illustrates how to run a semantic search against Amazon OpenSearch Service (AOS) and use an LLM to generate an analysis, in the following steps:

1. The client sends a keyword query describing the problem to solve.
2. An embedding model generates an embedding for the keywords.
3. The keyword embedding is used to search the vector DB for useful knowledge.
4. The vector DB returns domain-specific knowledge.
5. The selected (or all) related information is posted to the LLM backend.
6. The LLM generates a problem analysis and a suggested solution.
7. The backend starts streaming the generated text word by word.
8. The client renders the output word by word.
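The steps above can be sketched end to end in a few lines of Python. This is a toy illustration only, not the workshop's implementation: the letter-frequency `embed` function stands in for a real embedding model, the in-memory `KNOWLEDGE` list stands in for the vector DB, and `stream_answer` stands in for the LLM backend. All names and sample data here are hypothetical.

```python
import math
from typing import Iterator, List, Tuple

# Toy knowledge base standing in for the vector DB (made-up QA pairs).
KNOWLEDGE: List[Tuple[str, str]] = [
    ("printer offline", "Restart the print spooler service."),
    ("disk full", "Delete old log files or extend the volume."),
    ("login fails", "Reset the password and clear cached credentials."),
]


def embed(text: str) -> List[float]:
    """Hypothetical embedding: a 26-dimensional letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_search(query: str, k: int = 2) -> List[Tuple[str, str]]:
    """Steps 2-4: embed the query and return the k most similar QA pairs."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE, key=lambda qa: cosine(q, embed(qa[0])), reverse=True)
    return ranked[:k]


def stream_answer(query: str, hits: List[Tuple[str, str]]) -> Iterator[str]:
    """Steps 5-7: stand-in for the LLM; yields the text one word at a time."""
    text = f"Problem: {query}. Suggested fixes: " + " ".join(a for _, a in hits)
    for word in text.split():
        yield word


def answer(query: str, k: int = 2) -> Iterator[str]:
    """Full pipeline: retrieve related knowledge, then stream the analysis."""
    return stream_answer(query, knn_search(query, k))
```

A client would perform step 8 by consuming the generator as it produces words, for example `for word in answer("disk full"): print(word, end=" ", flush=True)`.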

Cost

You are responsible for the cost of the AWS services used while running this solution.

Prerequisites

Operating System

You need to:

- prepare an EC2 instance with x86_64 architecture (t3.xlarge is recommended) for the deployment
- install the AWS CDK on this deployment machine and bootstrap your account; please refer to Install the AWS CDK
- install Docker on this deployment machine and start it:
```
$ sudo yum install docker
$ sudo systemctl start docker
```
- make sure Python 3 is installed on this instance

Third-party tools

Install Docker

In this step, you will install Docker. It is required in the next step to build an LLM Docker image and push it to Amazon ECR.

sudo yum install docker
sudo usermod -aG docker ${USER}
sudo service docker start

Verify that you can run Docker commands without sudo. You may need to log out and back in for the group change to take effect.

docker info

Install git

To install git, run the following command:

sudo yum install -y git

Git LFS (Large File Storage) is an open-source Git extension developed by GitHub for handling large files that Git itself cannot manage easily. We need Git LFS to download the LLM model; please refer to Install git-lfs on AWS EC2 for installation instructions. On most EC2 instances, you can simply run the following commands in the console:

sudo amazon-linux-extras install epel -y 
sudo yum-config-manager --enable epel
sudo yum install git-lfs -y
sudo git lfs install

Insomnia (Optional)

You can use Insomnia to send requests signed with AWS IAM Signature Version 4 (SigV4) auth to test the deployed API.
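Insomnia handles the signing for you, but it can help to see what SigV4 auth involves. The sketch below shows only the signing-key derivation step of SigV4, using the Python standard library. It does not build the canonical request or the full Authorization header. The secret key, date, and region values are placeholders, and the service name `execute-api` is an assumption that the deployed API sits behind Amazon API Gateway, which the source does not state.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: date -> region -> service -> 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Sign the SigV4 'string to sign' and return the hex signature."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs for illustration only; never hard-code real credentials.
key = derive_signing_key("EXAMPLE-SECRET", "20240101", "us-east-1", "execute-api")
```

Because the key depends on the date, region, and service, a request signed for one region or day is rejected in another, which is a common cause of signature errors when testing.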

AWS account requirements

This deployment requires the following resources to be available in your AWS account.

Required resources:

- Amazon S3 bucket
- Amazon OpenSearch Service (AOS)
- AWS Secrets Manager
- Amazon VPC
- AWS IAM role with specific permissions
- Amazon SageMaker

Make sure your account can use the above resources.

Deploy the solution

Download an LLM model

Please refer to Downloading models Using Git and set up your SSH user settings.

cd infrastructure/docker/
git clone git@hf.co:THUDM/chatglm2-6b

Prepare python env

Before you deploy this solution, make sure you have the right AWS credentials configured. Then install the deployment dependencies.

  $ cd infrastructure
  $ python3 -m venv .venv
  $ . .venv/bin/activate
  $ pip install -r requirements.txt

Install Node.js

# Follow https://github.com/nvm-sh/nvm#installing-and-updating
# LTS versions: https://nodejs.org/en/about/previous-releases
nvm install <Stable LTS Version> ## install an LTS version
nvm use <Stable LTS Version>  ## activate this version

Prepare front-end and infra build

  $ cd ~/smartsearch-ai-knowledge-workshop/front-end
  $ npm install
  $ cd ~/smartsearch-ai-knowledge-workshop/infrastructure
  $ npm install

EC2 user data

You need to run the following commands manually and check that the service is listening on port 5000.

REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
account=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .accountId)

## log in to ECR (the password is piped through stdin to keep it off the command line)
aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${account}.dkr.ecr.${REGION}.amazonaws.com

## pull image
docker pull ${account}.dkr.ecr.${REGION}.amazonaws.com/llm_smart_search:latest

## run the image
docker run --gpus '"device=0"' -p 5000:5000 -it -d --restart=on-failure ${account}.dkr.ecr.${REGION}.amazonaws.com/llm_smart_search:latest
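One way to check that the container is accepting connections on port 5000 is a small TCP probe. This is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical, and it assumes you run it on the same instance as the container. It only confirms that something is listening on the port, not that the LLM has finished loading.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or an unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `print(port_is_open())` prints `True` once the published port is reachable and `False` otherwise.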

Useful Docker commands

docker images
docker ps
docker exec -it container_id /bin/bash

Then you can deploy with the CDK using the following commands:

  $ cdk deploy RAGSearchWithLLMInfraStack --require-approval never
  $ cdk deploy RAGSearchWithLLMSemanticSearchLambdaStack --require-approval never
  $ cdk deploy RAGSearchWithLLMFrontendStack --require-approval never

After each command finishes, the command prompt reappears. You can then go to the AWS CloudFormation console and see all three stacks: RAGSearchWithLLMInfraStack, RAGSearchWithLLMSemanticSearchLambdaStack, and RAGSearchWithLLMFrontendStack.

Ingest sample data

You need to ingest some data to try out this solution. We provide a simple list of question-answer pairs. To ingest it, use a SageMaker notebook instance, upload the whole data folder to it, and follow the instructions in data/data_ingestion.ipynb to load the data into AOS.
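The notebook in the repository is the authoritative ingestion path. As a rough illustration of what ingesting QA pairs into an OpenSearch k-NN index involves, the sketch below builds the request bodies only and makes no network calls. The index name, field names, and three-dimensional sample vector are hypothetical; the real dimension depends on the embedding model used.

```python
import json
from typing import Dict, List

INDEX_NAME = "qa-knowledge"  # hypothetical index name
VECTOR_FIELD = "embedding"   # hypothetical vector field name


def index_body(dimension: int) -> Dict:
    """Index settings and mappings that enable k-NN on the vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "question": {"type": "text"},
                "answer": {"type": "text"},
                VECTOR_FIELD: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def bulk_payload(docs: List[Dict]) -> str:
    """Build an NDJSON body for the _bulk API: one action line, then one document line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": INDEX_NAME}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def knn_query(vector: List[float], k: int = 3) -> Dict:
    """Search body that returns the k nearest neighbors of the query vector."""
    return {"size": k, "query": {"knn": {VECTOR_FIELD: {"vector": vector, "k": k}}}}
```

Each ingested document would carry the question, the answer, and the embedding of the question, for example `{"question": "...", "answer": "...", "embedding": [0.1, 0.2, 0.3]}`.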

Test

After deployment and data ingestion, you can get the URL from the outputs of the RAGSearchWithLLMFrontendStack stack in the cdk output.

Outputs:
RAGSearchWithLLMFrontendStack.RAGSearchWithLLMFrontendSmartSearchUrl*** = https://***.cloudfront.net

Service limits (if applicable)

The solution handles question-answer (QA) pairs for summarization. You can extend it to meet other requirements.

Cleanup

Run cdk destroy --all in the infrastructure directory to clean up the whole environment.

FAQ, known issues, additional considerations, and limitations

N/A

Revisions

N/A

Notices

During the launch of this reference architecture, you will install software (and dependencies) on the Amazon EC2 instances launched in your account via stack creation. The software packages and/or sources you will install will be from the Amazon Linux distribution and AWS Services, as well as from third party sites. Here is the list of third party software, the source link, and the license link for each software. Please review and decide your comfort with installing these before continuing.

BSD License: https://opensource.org/licenses/bsd-license.php

Historical Permission Notice and Disclaimer (HPND): https://opensource.org/licenses/HPND

MIT License: https://github.com/tsenart/vegeta/blob/master/LICENSE

Apache Software License 2.0: https://www.apache.org/licenses/LICENSE-2.0

Mozilla Public License 2.0 (MPL 2.0): https://www.mozilla.org/en-US/MPL/2.0/

ISC License: https://opensource.org/licenses/ISC
