An AI-powered natural language & reverse image search engine powered by CLIP & Qdrant.

Home Page: https://image-insights.edgeneko.com/

Python 98.47% Dockerfile 1.53%
computer-vision image-search image-search-engine search-engine transformers clip

nekoimagegallery's Introduction

NekoImageGallery

An online AI image search engine based on the CLIP model and the Qdrant vector database. It supports keyword search and similar-image search.

Chinese documentation (中文文档)

✨ Features

  • Uses the CLIP model to generate a 768-dimensional vector for each image as the basis for search. No manual annotation or classification is needed, and the number of categories is unlimited.
  • Supports OCR text search: PaddleOCR extracts text from images, and BERT generates text vectors for search.
  • Uses the Qdrant vector database for efficient vector search.
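
As a rough illustration of what the vector search behind these features amounts to, here is a minimal sketch in plain Python. It assumes cosine similarity as the metric and uses made-up 3-dimensional vectors and file names in place of real 768-dimensional CLIP embeddings. `cosine_similarity` and `search` are hypothetical helpers, not functions from this project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, index, top_k=3):
    """Brute-force search: score every stored vector and return the top_k best matches."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy index: hypothetical file names mapped to tiny stand-in embeddings.
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# A query vector that points mostly in the same direction as "cat.jpg".
results = search([1.0, 0.2, 0.0], index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Qdrant answers the same kind of query with an approximate nearest-neighbour index (HNSW) instead of this brute-force loop, which is what keeps search fast on large collections.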

📷 Screenshots

Screenshot1 Screenshot2 Screenshot3 Screenshot4 Screenshot5 Screenshot6

The screenshots above may contain copyrighted images from various artists. Please do not use them for other purposes.

✈️ Deployment

Local Deployment

Deploy Qdrant Database

Please deploy the Qdrant database according to the Qdrant documentation. It is recommended to use Docker for deployment.

If you don't want to deploy Qdrant yourself, you can use the online service provided by Qdrant.

Deploy NekoImageGallery

  1. Clone the project directory to your own PC or server.

  2. It is highly recommended to install this project's dependencies in a Python venv virtual environment. Run the following commands:

    python -m venv .venv
    . .venv/bin/activate
  3. Install PyTorch. Follow the PyTorch documentation to install the torch version suitable for your system using pip.

    If you want to use CUDA acceleration for inference, be sure to install a CUDA-supported PyTorch version in this step. After installation, you can use torch.cuda.is_available() to confirm whether CUDA is available.

  4. Install other dependencies required for this project:

    pip install -r requirements.txt
  5. Modify the project configuration files in config/. You can edit default.env directly, but it is recommended to create a new file named local.env that overrides the settings in default.env.

  6. Initialize the Qdrant database by running the following command:

    python main.py --init-database

    This operation will create a collection in the Qdrant database with the same name as config.QDRANT_COLL to store image vectors.

  7. (Optional) For development and small-scale deployments, you can use the application's built-in static file indexing and serving. Use the following command to index your local image directory:

    python main.py --local-index <path-to-your-image-directory>

    This operation will copy all image files in the <path-to-your-image-directory> directory to the config.STATIC_FILE_PATH directory (default is ./static) and write the image information to the Qdrant database.

    Then run the following command to generate thumbnails for all images in the static directory:

      python main.py --local-create-thumbnail

    For large-scale deployments, you can store image files in an object storage service (OSS) such as MinIO and then write the image information to the Qdrant database.

  8. Run this application:

    python main.py

    You can use --host to specify the IP address you want to bind to (default is 0.0.0.0) and --port to specify the port you want to bind to (default is 8000).

  9. (Optional) Deploy the front-end application: NekoImageGallery.App is a simple web front-end application for this project. If you want to deploy it, please refer to its deployment documentation.
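
The override behaviour described in step 5 can be pictured as a simple merge in which values from local.env replace those from default.env. The sketch below illustrates that general idea and is not the project's actual config loader. The key names mirror config.QDRANT_COLL and config.STATIC_FILE_PATH from the steps above, but the real keys in default.env may differ; `parse_env`, `layered_config`, and the example values are hypothetical.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines, comments, and lines without '='."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def layered_config(default_text, local_text=""):
    """Start from the defaults, then let every key present in the local file win."""
    merged = parse_env(default_text)
    merged.update(parse_env(local_text))
    return merged


# Hypothetical file contents; the real default.env may use different keys and values.
default_env = """
# shipped defaults
QDRANT_COLL=example_collection
STATIC_FILE_PATH=./static
"""
local_env = """
QDRANT_COLL=my_images
"""

config = layered_config(default_env, local_env)
print(config)
```

Keeping machine-specific values in local.env means default.env can be updated from the repository without overwriting your settings.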

Docker Compose Containerized Deployment

Warning

Docker Compose support is in an alpha state and may not work for everyone (especially with CUDA acceleration). Please make sure you are familiar with the Docker documentation before using this deployment method. If you encounter any problems during deployment, please submit an issue.

Prepare nvidia-container-runtime

If you want to use CUDA acceleration, you need to install nvidia-container-runtime on your system. Please refer to the official documentation for installation.

Related documentation:

  1. https://docs.docker.com/config/containers/resource_constraints/#gpu
  2. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
  3. https://nvidia.github.io/nvidia-container-runtime/
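
Before starting the CUDA compose file, it may help to check whether the NVIDIA tooling is visible on the host at all. The sketch below only looks for two executables on PATH and assumes the driver and runtime were installed in the usual way. `find_nvidia_tools` is a hypothetical helper, and finding the binaries does not prove that Docker is configured to use them.

```python
import shutil


def find_nvidia_tools(names=("nvidia-smi", "nvidia-container-runtime")):
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


tools = find_nvidia_tools()
for name, path in tools.items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If either tool is missing, the CPU-only compose file is the safer starting point.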

Run the server

  1. Download the docker-compose.yml file from the repository.
    # For CUDA deployment (default)
    wget https://raw.githubusercontent.com/hv0905/NekoImageGallery/master/docker-compose.yml
    # For CPU-only deployment
    wget https://raw.githubusercontent.com/hv0905/NekoImageGallery/master/docker-compose-cpu.yml && mv docker-compose-cpu.yml docker-compose.yml
  2. Modify the docker-compose.yml file as needed
  3. Run the following command to start the server:
    # start in foreground
    docker compose up
    # start in the background (detached mode)
    docker compose up -d

📚 API Documentation

The API documentation is provided by FastAPI's built-in Swagger UI. You can access the API documentation by visiting the /docs or /redoc path of the server.
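
For scripted access, the same information is usually available as a machine-readable schema. The sketch below assumes the server runs at the default port 8000 on the local machine and that the project keeps FastAPI's default /openapi.json schema path; `doc_urls` and `list_api_paths` are hypothetical helpers.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def doc_urls(base_url):
    """Build the documentation URLs for a server rooted at base_url."""
    base = base_url.rstrip("/") + "/"
    return {name: urljoin(base, name) for name in ("docs", "redoc", "openapi.json")}


def list_api_paths(base_url, timeout=5):
    """Fetch the OpenAPI schema and return the sorted endpoint paths (needs a running server)."""
    with urlopen(doc_urls(base_url)["openapi.json"], timeout=timeout) as response:
        schema = json.load(response)
    return sorted(schema.get("paths", {}))


urls = doc_urls("http://127.0.0.1:8000")
for name, url in urls.items():
    print(f"{name}: {url}")
```

Calling `list_api_paths("http://127.0.0.1:8000")` against a running instance should list the available endpoints without opening a browser.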

⚡ Related Projects

These projects work with NekoImageGallery :D

  • NekoImageGallery.App
  • LiteLoaderQQNT-NekoImageGallerySearch
  • nonebot-plugin-nekoimage

♥ Contributing

There are many ways to contribute to the project: logging bugs, submitting pull requests, reporting issues, and creating suggestions.

Even if you have push access to the repository, you should create personal feature branches when you need them. This keeps the main repository clean and your workflow cruft out of sight.

We're also interested in your feedback on the future of this project. You can submit a suggestion or feature request through the issue tracker. To make this process more effective, we're asking that these include more information to help define them more clearly.

Copyright

Copyright 2023 EdgeNeko

Licensed under the GPLv3 license.

nekoimagegallery's Issues

[Discussion] What's the best way for matching OCR text?

Currently we use a BERT model (more precisely, bert-base-chinese) to vectorize OCR text, then use cosine distance for indexing and searching.

However, this method seems to perform poorly on partial keywords or semantically similar sentences.

For instance,
image

FYI, the OCR text of the image:
1. please
2. 你最
3. 叔
4. 什么情况兄弟
5. 爱
6. 爱
7. 害怕
8. 乳
9. 嘿

Only when I provide more detailed text does the server return more accurate results:
image

Is there any way to improve OCR text matching?

Related code

https://github.com/hv0905/NekoImageGallery/blob/master/app/Services/transformers_service.py#L59

Related documentation

https://huggingface.co/tasks/sentence-similarity
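
One possible direction for the partial-keyword case raised above, offered only as a sketch and not as something the project is known to use, is to combine the vector score with a plain character n-gram overlap score. The example below uses the OCR lines quoted in this issue; `char_ngrams`, `keyword_score`, `blended_score`, and the `alpha` weight are all hypothetical.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of text, ignoring whitespace and case."""
    cleaned = "".join(text.split()).lower()
    if len(cleaned) < n:
        # Strings shorter than n are kept whole so single characters can still match.
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def keyword_score(query, ocr_lines, n=2):
    """Best fraction of the query's n-grams found in any single OCR line."""
    query_grams = char_ngrams(query, n)
    if not query_grams:
        return 0.0
    return max(
        (len(query_grams & char_ngrams(line, n)) / len(query_grams) for line in ocr_lines),
        default=0.0,
    )


def blended_score(vector_score, kw_score, alpha=0.5):
    """Weighted mix of the embedding similarity and the keyword overlap."""
    return alpha * vector_score + (1 - alpha) * kw_score


ocr_lines = ["please", "你最", "叔", "什么情况兄弟", "爱", "爱", "害怕", "乳", "嘿"]

partial = keyword_score("什么情况", ocr_lines)   # a prefix of one OCR line
unrelated = keyword_score("猫咪", ocr_lines)     # shares no n-grams with any line
print(partial, unrelated)
```

A partial keyword then scores highly on the overlap term even when its embedding is far from that of the full OCR text, while semantically similar queries can still be picked up by the vector term.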
