karinmatsuyama / redis-embed-search

This project forked from redis-developer/redis-arxiv-search

Vector search demo with the arXiv paper dataset, HuggingFace, OpenAI, FastAPI, React, and Redis as the vector database.

Home Page: https://docsearch.redisventures.com

License: BSD 3-Clause "New" or "Revised" License

Shell 0.07% JavaScript 0.45% Python 18.10% TypeScript 17.64% HTML 1.20% Jupyter Notebook 61.95% Dockerfile 0.59%

🔎 Redis arXiv Search

This repository is the official codebase for the arXiv paper search app hosted at https://docsearch.redisventures.com

Through the RediSearch module, vector data types and search indexes can be added to Redis. This turns Redis into a highly performant, in-memory, vector database, which can be used for many types of applications.
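
As a rough illustration of what a vector data type looks like in practice: RediSearch vector fields hold embeddings as raw bytes (commonly packed 32-bit floats), and a K-nearest-neighbors search is written in its query syntax. The sketch below uses only the Python standard library. The field name `vector`, the parameter name `vec_param`, and the alias `score` are hypothetical and not taken from this repository.

```python
import struct


def pack_vector(values):
    """Pack a sequence of floats into little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def knn_query(k, field="vector", param="vec_param"):
    """Build a RediSearch KNN query string (query dialect 2 syntax).

    The field and parameter names are placeholders for illustration.
    """
    return f"*=>[KNN {k} @{field} ${param} AS score]"


blob = pack_vector([0.1, 0.2, 0.3])  # 3 floats * 4 bytes each
query = knn_query(5)
```

With a client such as redis-py, the packed bytes would be passed as the query parameter named in the query string; that part is omitted here because it requires a running Redis instance.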


Here we showcase Redis vector similarity search (VSS) applied to a document search/retrieval use case. Read more about AI-powered search in our blog post hosted at Data Science Dojo.
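
To make the idea of vector similarity search concrete, here is a minimal brute-force sketch of ranking documents by cosine similarity to a query vector. It is not how this app or Redis implements search (Redis ranks vectors server-side against an index); the paper ids and three-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy "embeddings"; real ones have hundreds of dimensions.
papers = {
    "paper-a": [1.0, 0.0, 0.0],
    "paper-b": [0.9, 0.1, 0.0],
    "paper-c": [0.0, 0.0, 1.0],
}
result = top_k([0.8, 0.0, 0.2], papers, k=2)
```

In a document search setting, the query vector comes from embedding the user's search text with the same model used to embed the papers.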

Application

This app was built as a Single Page Application (SPA). As the project description notes, it combines a FastAPI backend, a React front end, and Redis as the vector database.

Some inspiration was taken from this Cookiecutter project, which was turned into a SPA instead of a separate front-end server approach.

Embedding Providers

To showcase the capabilities of different embedding providers, this application supports HuggingFace, OpenAI, and Cohere embeddings out of the box. Interested in a different embedding provider? Feel free to open a PR with a suggested addition.

Provider       Embedding Model                            Required?
HuggingFace    sentence-transformers/all-mpnet-base-v2    Yes
OpenAI         text-embedding-ada-002                     Yes
Cohere         small                                      Yes
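
One way to think about supporting several providers is a small lookup from provider name to its default model. The sketch below is only an illustration of that idea: the model names are copied from the table above, while the dictionary, its keys, and the function name are hypothetical and may not match how this app wires up providers.

```python
# Model names come from the table above; everything else is hypothetical.
EMBEDDING_MODELS = {
    "huggingface": "sentence-transformers/all-mpnet-base-v2",
    "openai": "text-embedding-ada-002",
    "cohere": "small",
}


def model_for(provider):
    """Look up the default embedding model for a provider (case-insensitive)."""
    key = provider.strip().lower()
    if key not in EMBEDDING_MODELS:
        raise ValueError(f"Unsupported embedding provider: {provider!r}")
    return EMBEDDING_MODELS[key]


selected = model_for("OpenAI")
```

Adding a provider in a design like this would mean adding one entry to the mapping plus the code that actually calls that provider's API.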

Dataset

The arXiv dataset was sourced from the following Kaggle link.

If you wish to modify the data or work with your own, download and extract the zip, then place the resulting JSON file (arxiv-metadata-oai-snapshot.json) in the data/ directory.
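
If you want to inspect the snapshot before generating embeddings, a small reader like the one below may help. It assumes the file holds one JSON object per line, which is how the Kaggle arXiv metadata snapshot is distributed; the function name and the `limit` parameter are hypothetical.

```python
import json


def iter_papers(path, limit=None):
    """Yield one paper record (a dict) per non-empty line of the file.

    Reading line by line avoids loading the whole snapshot into memory.
    """
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if limit is not None and count >= limit:
                break
            line = line.strip()
            if not line:
                continue
            yield json.loads(line)
            count += 1


# Example usage (requires the file to be present):
# for paper in iter_papers("data/arxiv-metadata-oai-snapshot.json", limit=3):
#     print(paper.get("title"))
```

The `limit` argument is handy for sampling a few records without scanning the full file.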

🚀 Running the App

Before running the app, install Docker Desktop.

  1. To get started, make a copy of the .env.template file:
    $ cp .env.template .env
  2. Add your OPENAI_API_KEY to the .env file. Need one? Get an API key at https://platform.openai.com.
  3. Add your COHERE_API_KEY to the .env file. Need one? Get an API key at https://cohere.ai.
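
Before starting the containers, it can be useful to confirm that the .env file actually contains the keys from the steps above. The following is a minimal sketch using only the standard library; the parser handles simple KEY=VALUE lines only, and the function names are hypothetical.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "COHERE_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


# Placeholder values, not real credentials.
sample = "# API keys\nOPENAI_API_KEY=sk-example\nCOHERE_API_KEY=\n"
missing = missing_keys(sample)
```

To check a real file, read it with `open(".env").read()` and pass the text to `missing_keys`.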

Both Redis Stack and the application backend run with Docker Compose using pre-built containers. Choose one of the methods below based on your Redis setup.

Redis Cloud

  1. Get a Redis Cloud Database (with the RediSearch module included).

  2. Update the REDIS_HOST, REDIS_PASSWORD, and REDIS_PORT environment variables in the .env file created above.

  3. Run the App:

    $ docker compose -f docker-cloud-redis.yml up
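
To sanity-check the Redis Cloud settings, it can help to see how the three variables above combine into a single connection URL. This is a sketch, not this app's actual connection code: the function name is hypothetical, and the fallbacks (`localhost`, port 6379, no password) are assumptions based on Redis defaults.

```python
from urllib.parse import quote


def redis_url(env):
    """Build a redis:// URL from REDIS_HOST, REDIS_PORT, and REDIS_PASSWORD."""
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    password = env.get("REDIS_PASSWORD", "")
    # Percent-encode the password so characters like "@" do not break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}"


# Placeholder values, not a real database.
url = redis_url(
    {"REDIS_HOST": "example.com", "REDIS_PORT": "12345", "REDIS_PASSWORD": "p@ss"}
)
```

Passing `os.environ` instead of a literal dictionary reads the values from the current environment.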

Redis Stack Docker

Use the provided Dockerfiles and open source containers to run the application locally:

$ docker compose -f docker-local-redis.yml up

Customizing (optional)

You can use the Jupyter Notebooks in the data/ directory to create paper embeddings and metadata. The pickled dataframes are stored in the data/ directory and are used when you build your own container.

You can use the build.sh script to create your own Docker image. If you do, update the .yml file with the right image name if necessary.

Running with Kubernetes

If you want to use K8s instead of Docker Compose, we have some resources to help you get started.

Using a React development env

It's typically easier to write front-end code in an interactive environment, testing changes in real time.

  1. Deploy the app using steps above.
  2. Install packages (you may need to use npm to install yarn)
    $ cd frontend/
    $ yarn install --no-optional
  3. Use yarn to serve the application from your machine
    $ yarn start
  4. Navigate to http://localhost:3000 in a browser.

All changes to your local code will be reflected in the browser in near real time.

Using a React dev env

It's typically easier to manipulate front end code in an interactive environment (outside of Docker) where one can test out code changes in real time. In order to use this approach:

  1. Follow steps from previous section with Docker Compose to deploy the backend API.
  2. Change into the gui/ directory and use yarn to install packages: yarn install --no-optional (you may need to use npm to install yarn).
  3. Use yarn to serve the application from your machine: yarn start.
  4. Navigate to http://localhost:3000 in a browser.
  5. Make front-end changes in real time.

Troubleshooting

Sometimes you need to clear out cached Docker artifacts. Run docker system prune, restart Docker Desktop, and try again.

If the problem persists, open an issue on GitHub and we will try to respond. Please also consider contributing.

redis-embed-search's People

Contributors

tylerhutcherson
