qiaow02 / smartsearch-ai-knowledge-workshop

This project forked from aws-samples/smartsearch-ai-knowledge-workshop

License: Apache License 2.0

Shell 1.42% JavaScript 27.43% Python 50.95% CSS 1.19% HTML 1.43% Jupyter Notebook 14.48% Dockerfile 1.90% SCSS 1.20%

smartsearch-ai-knowledge-workshop's Introduction

RAG with Streaming LLM

This hands-on workshop, aimed at developers and solution builders, shows how to use LLMs for RAG (Retrieval-Augmented Generation).

In this solution, we:

  • use kNN search in Amazon OpenSearch Service to find solutions to related problems;
  • use an LLM to analyze all the related content, and render its response as it streams back from the model.
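
To make the kNN idea concrete, here is a minimal sketch of nearest-neighbour search over embeddings. It is an exact, brute-force version in plain Python, not the approximate kNN index that OpenSearch builds, and the document IDs and 3-dimensional vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(query: Sequence[float],
               docs: List[Tuple[str, Sequence[float]]],
               k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents whose vectors are most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embedding models emit far more dimensions.
DOCS = [
    ("reset-password", [0.9, 0.1, 0.0]),
    ("billing-error", [0.0, 0.8, 0.6]),
    ("login-failure", [0.8, 0.3, 0.1]),
]

if __name__ == "__main__":
    for doc_id, score in knn_search([1.0, 0.0, 0.0], DOCS, k=2):
        print(f"{doc_id}: {score:.3f}")
```

A vector database avoids scoring every document by using an index, but the ranking criterion is the same kind of vector similarity.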

Overview

Architecture

This solution illustrates how to run semantic search across Amazon OpenSearch Service (AOS) and use an LLM to generate an analysis, in the following steps:

  1. The client queries by keywords for solutions related to the question
  2. The embedding model generates an embedding for the keywords
  3. The keyword embedding is used to search the vector DB for useful knowledge
  4. The vector DB returns domain-specific knowledge
  5. Selected or all related information is posted to the LLM backend
  6. The LLM generates a problem analysis and a suggested solution
  7. The generated text is streamed out word by word
  8. The output is rendered word by word
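
The flow above can be sketched end to end with stand-ins for the real services. This is only an illustration under stated assumptions: `toy_retrieve` replaces the embedding model and vector DB with a word-overlap match, `fake_llm_stream` replaces the LLM backend with a canned answer, and all function names and the prompt layout are hypothetical, not taken from this repository.

```python
from typing import Callable, Iterator, List

# Hypothetical knowledge base standing in for the vector DB contents.
KNOWLEDGE = [
    "The service fails when the disk is full.",
    "Invoices are issued monthly.",
]


def toy_retrieve(question: str) -> List[str]:
    """Stand-in for steps 2-4: return passages sharing a word with the question."""
    words = set(question.lower().split())
    return [text for text in KNOWLEDGE if words & set(text.lower().split())]


def build_prompt(question: str, passages: List[str]) -> str:
    """Step 5: combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def fake_llm_stream(prompt: str) -> Iterator[str]:
    """Stand-in for steps 6-7: ignores the prompt and yields a canned answer word by word."""
    answer = "Restart the service and check the logs."
    for word in answer.split():
        yield word + " "


def answer_stream(question: str,
                  retrieve: Callable[[str], List[str]],
                  generate: Callable[[str], Iterator[str]]) -> Iterator[str]:
    """Retrieve context, build the prompt, then pass chunks through as they arrive."""
    prompt = build_prompt(question, retrieve(question))
    yield from generate(prompt)


if __name__ == "__main__":
    # Step 8: render each chunk as soon as it is produced.
    for chunk in answer_stream("Why does the service crash?", toy_retrieve, fake_llm_stream):
        print(chunk, end="", flush=True)
    print()
```

Using a generator keeps the caller free to render each chunk immediately instead of waiting for the full answer.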

Cost

You are responsible for the cost of the AWS services used while running this solution.

Prerequisites

Operating System

You need to

  • prepare an EC2 instance with x86_64 architecture (t3.xlarge is recommended) for the deployment
    • install the AWS CDK on this deployment machine and bootstrap your account; please refer to Install the AWS CDK
    • install Docker on this deployment machine and start it:
      $ sudo yum install docker
      $ sudo systemctl start docker
    • make sure Python 3 is installed on this instance

Third-party tools

Install Docker

In this step, you will install Docker, which the next step requires to build an LLM Docker image and push it to Amazon ECR.

sudo yum install docker
sudo usermod -aG docker ${USER}
sudo service docker start

Verify that you can run Docker commands without sudo. You may need to log out and back in first so that the new group membership takes effect.

docker info

Install git

To install git, run the command below:

sudo yum install -y git

Git LFS (Large File Storage) is an open-source Git extension developed by GitHub to handle large files that Git itself cannot manage easily. We need Git LFS to download the LLM model; please refer to Install glf on AWS Ec2 to install git-lfs. On most EC2 instances, you can simply enter the following commands in the notebook console:

sudo amazon-linux-extras install epel -y 
sudo yum-config-manager --enable epel
sudo yum install git-lfs -y
sudo git lfs install

Insomnia (Optional)

You can use Insomnia to send requests signed with AWS Signature Version 4 (IAM auth) to test the deployed API.
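
For readers curious what the tool does when it signs a request, here is a minimal sketch of the signing-key derivation at the core of AWS Signature Version 4, using only the standard library. It does not build the canonical request or the full Authorization header; the credential, region, and the `execute-api` service name in the usage part are placeholder assumptions, since the source does not say which service fronts the API.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 of a UTF-8 message under the given key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the SigV4 signing key; date_stamp is YYYYMMDD in UTC."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(string_to_sign: str, signing_key: bytes) -> str:
    """Hex-encoded signature over the already-built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Placeholder values for illustration only; never hard-code real credentials.
    key = derive_signing_key("EXAMPLE-SECRET", "20240101", "us-east-1", "execute-api")
    print(sign("example string to sign", key))
```

In practice, let the client tool or an AWS SDK do the signing; the sketch only shows why the signature depends on the date, region, and service as well as the secret key.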

AWS account requirements

This deployment requires the following to be available in your AWS account.

Required resources:

  • Amazon S3 bucket
  • Amazon OpenSearch Service (AOS)
  • AWS Secrets Manager
  • Amazon VPC
  • AWS IAM role with specific permissions
  • Amazon SageMaker

Make sure your account can use the above resources.

Deploy the solution

Download an LLM model

Please refer to Downloading models Using Git and configure your SSH user settings.

cd infrastructure/docker/
git clone git@hf.co:THUDM/chatglm2-6b

Prepare python env

Before you deploy this solution, make sure you have the right AWS credentials configured. Then install the deployment dependencies.

  $ cd infrastructure
  $ python3 -m venv .venv
  $ . .venv/bin/activate
  $ pip install -r requirements.txt

Install Node.js

# Follow https://github.com/nvm-sh/nvm#installing-and-updating
# LTS version https://nodejs.org/en/about/previous-releases
nvm install <Stable LTS Version> ## install an LTS version
nvm use <Stable LTS Version>  ## activate this version

Prepare front-end and infra build

  $ cd ~/smartsearch-ai-knowledge-workshop/front-end
  $ npm install
  $ cd ~/smartsearch-ai-knowledge-workshop/infrastructure
  $ npm install

user_data in EC2

You need to run the following commands manually and then check that the service is listening on port 5000.

REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
account=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .accountId)

## log in to Amazon ECR (the password is piped via stdin so it does not appear on the command line)
aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${account}.dkr.ecr.${REGION}.amazonaws.com

## pull image
docker pull ${account}.dkr.ecr.${REGION}.amazonaws.com/llm_smart_search:latest

## run the image
docker run --gpus '"device=0"' -p 5000:5000 -it -d --restart=on-failure ${account}.dkr.ecr.${REGION}.amazonaws.com/llm_smart_search:latest
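
One way to check that the container is accepting connections on port 5000 is a small TCP probe. This is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and it only confirms that something is listening, not that the LLM service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("port 5000 open:", is_port_open("127.0.0.1", 5000))
```

Run it on the EC2 instance itself after `docker run`; a large model can take a while to load, so a first negative result may just mean the service is still starting.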

Useful Docker commands

docker images
docker ps
docker exec -it <container_id> /bin/bash

Then you can deploy with the CDK using the following commands:

  $ cdk deploy RAGSearchWithLLMInfraStack --require-approval never
  $ cdk deploy RAGSearchWithLLMSemanticSearchLambdaStack --require-approval never
  $ cdk deploy RAGSearchWithLLMFrontendStack --require-approval never

After each command completes, the command prompt reappears. You can then go to the AWS CloudFormation console and confirm that all three stacks have been created: RAGSearchWithLLMInfraStack, RAGSearchWithLLMSemanticSearchLambdaStack and RAGSearchWithLLMFrontendStack.

Ingest sample data

You need to ingest some data to try out this solution. We provide a simple list of question-answer pairs. To ingest it, upload the whole data folder to a SageMaker notebook instance and follow the instructions in data/data_ingestion.ipynb to feed the data into AWS AOS.
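
The notebook is the authoritative ingestion path. As a rough illustration of what feeding question-answer pairs into OpenSearch can look like, the sketch below builds a newline-delimited JSON body in the shape the OpenSearch `_bulk` endpoint expects. The index name `qa` and the field names `question`, `answer`, and `embedding` are hypothetical and may differ from what the notebook uses.

```python
import json
from typing import Dict, Iterable, List


def build_bulk_body(index: str, records: Iterable[Dict]) -> str:
    """Serialize records as action/document line pairs for the _bulk endpoint."""
    lines: List[str] = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(record))                        # document line
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline


if __name__ == "__main__":
    # Hypothetical records; embeddings would come from the embedding model.
    sample = [
        {"question": "How do I reset my password?",
         "answer": "Use the reset link on the login page.",
         "embedding": [0.1, 0.2, 0.3]},
    ]
    print(build_bulk_body("qa", sample), end="")
```

The resulting string would be sent as the request body of a bulk request; the index itself needs a kNN vector mapping, which is not shown here.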

Test

After deployment and data ingestion, you can get the URL from the outputs of the RAGSearchWithLLMFrontendStack stack, which cdk prints when the deployment finishes.

Outputs:
RAGSearchWithLLMFrontendStack.RAGSearchWithLLMFrontendSmartSearchUrl*** = https://***.cloudfront.net
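
If you would rather read the URL programmatically, one option is to deploy with `cdk deploy --outputs-file <path>` and parse the resulting JSON, which maps each stack name to its outputs. The sketch below matches the output key by prefix because the suffix is elided above; the helper name `find_frontend_url` is hypothetical.

```python
import json
from typing import Optional


def find_frontend_url(outputs: dict,
                      stack: str = "RAGSearchWithLLMFrontendStack",
                      key_prefix: str = "RAGSearchWithLLMFrontendSmartSearchUrl",
                      ) -> Optional[str]:
    """Return the first output of the stack whose key starts with key_prefix."""
    for key, value in outputs.get(stack, {}).items():
        if key.startswith(key_prefix):
            return value
    return None


def load_frontend_url(path: str) -> Optional[str]:
    """Read a cdk outputs file and extract the front-end URL, if present."""
    with open(path, encoding="utf-8") as handle:
        return find_frontend_url(json.load(handle))
```

For example, after deploying the front-end stack with the outputs written to a file of your choosing, `load_frontend_url` on that file returns the CloudFront URL, or `None` if the stack or key is missing.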

Service limits (if applicable)

The solution can handle QA pairs for summarization. You can extend it if you have other requirements.

Cleanup

To clean up the whole environment, run cdk destroy --all in the infrastructure directory.

FAQ, known issues, additional considerations, and limitations

N/A

Revisions

N/A

Notices

During the launch of this reference architecture, you will install software (and dependencies) on the Amazon EC2 instances launched in your account via stack creation. The software packages and/or sources you will install will be from the Amazon Linux distribution and AWS Services, as well as from third-party sites. Below is the list of licenses that cover the third-party software, with a link for each. Please review them and decide whether you are comfortable installing the software before continuing.

BSD License: https://opensource.org/licenses/bsd-license.php

Historical Permission Notice and Disclaimer (HPND): https://opensource.org/licenses/HPND

MIT License: https://github.com/tsenart/vegeta/blob/master/LICENSE

Apache Software License 2.0: https://www.apache.org/licenses/LICENSE-2.0

Mozilla Public License 2.0 (MPL 2.0): https://www.mozilla.org/en-US/MPL/2.0/

ISC License: https://opensource.org/licenses/ISC

smartsearch-ai-knowledge-workshop's People

Contributors

xavieru718, qiaow02
