
This project forked from redis-developer/redis-arxiv-search


Vector-based document retrieval demo with the arXiv paper dataset using Redis as the vector database.

Home Page: https://docsearch.redisventures.com

License: BSD 3-Clause "New" or "Revised" License



Redis arXiv Search

This repository is the official codebase for the arXiv paper search app hosted at: https://docsearch.redisventures.com

The RediSearch module adds vector data types and search indexes to Redis. This turns Redis into a highly performant, in-memory vector database that can be used for many types of applications.


Here we showcase Redis vector similarity search (VSS) applied to a document search/retrieval use case. Read more about AI-powered search in our blog post (shout out to our friends at Data Science Dojo).
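To make the idea of vector similarity search concrete, here is a minimal pure-Python sketch of what a nearest-neighbor query computes: rank stored vectors by cosine similarity to a query vector and return the top k. This is only an illustration, not how this app or RediSearch implements search; RediSearch builds dedicated vector indexes rather than scanning every vector in Python. The paper keys and three-dimensional vectors below are hypothetical toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Brute-force search: score every stored vector, keep the k best."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Hypothetical toy embeddings; real paper embeddings have hundreds of dimensions.
papers = {
    "paper:a": [1.0, 0.0, 0.0],
    "paper:b": [0.9, 0.1, 0.0],
    "paper:c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], papers, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

A vector database performs the same ranking, but with an index structure so that a query does not have to touch every stored vector.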


Getting Started

The steps below outline how to get this app up and running on your machine.

Docker

Install Docker Desktop.

Download arXiv Dataset

Pull the arXiv dataset from the following Kaggle link.

Download and extract the zip file, then place the resulting JSON file (arxiv-metadata-oai-snapshot.json) in the data/ directory.
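Once the file is in place, a quick sanity check can help before moving on to embedding creation. The sketch below assumes the snapshot stores one JSON object per line and that records carry fields such as id and title; both are assumptions about the Kaggle file, not something this README states. The helper name iter_papers is hypothetical and is not part of this repository.

```python
import json
from pathlib import Path


def iter_papers(path, limit=None):
    """Yield paper records one at a time without loading the whole file.

    Assumes one JSON object per line (an assumption about the dataset layout).
    """
    yielded = 0
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            if limit is not None and yielded >= limit:
                break
            line = line.strip()
            if not line:
                continue  # skip blank lines
            yield json.loads(line)
            yielded += 1


if __name__ == "__main__":
    data_file = Path("data") / "arxiv-metadata-oai-snapshot.json"
    if data_file.exists():
        # Print a few records to confirm the file is readable.
        for paper in iter_papers(data_file, limit=3):
            print(paper.get("id"), "|", paper.get("title"))
    else:
        print(f"{data_file} not found; download the dataset first.")
```

Streaming line by line keeps memory use low, which matters because the full snapshot is large.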

Embedding Creation

1. Set up a Python environment:

  • If you use conda, take advantage of the Makefile included here: make env
  • Otherwise, set up your virtual environment however you wish and install the Python dependencies listed in requirements.txt

2. Use the notebook:

Application

This app was built as a Single Page Application (SPA) with the following components:

Some inspiration was taken from this Cookiecutter project, which was adapted into an SPA instead of using a separate front-end server.

Launch

To launch the app:

  • Run docker compose up from the same directory as docker-compose.yml
  • Navigate to http://localhost:8888 in a browser

Building the containers manually:

The first time you run docker compose up, it automatically builds your Docker images based on the Dockerfile. When you need to rebuild later, run docker compose up --build to force a new build.

Using a React dev env

It's typically easier to work on front-end code in an interactive environment (outside of Docker) where you can test code changes in real time. To use this approach:

  1. Follow the steps in the Launch section above to deploy the backend API with Docker Compose.
  2. cd into the gui/ directory and use yarn to install packages: yarn install --no-optional (you may need to use npm to install yarn).
  3. Use yarn to serve the application from your machine: yarn start.
  4. Navigate to http://localhost:3000 in a browser.
  5. Make front-end changes in real time.

Troubleshooting

  • Issues with Docker? Run docker system prune, restart Docker Desktop, and try again.
  • Still stuck? Open an issue here on GitHub and we will be as responsive as we can!

Interested in contributing?

This is a new project. Comment on an open issue or create a new one. We can triage it from there.

