
zimingyuan / cognitive-storage-based-data-semantic-management-and-generation-system





Cognitive Storage-based Data Semantic Management and Generation System

This system uses the Ryzen AI processor to accelerate three models: the image-to-text model BLIP, the embedding extraction model BGE, and the large language model Qwen, which is used for slide generation. Together they provide semantic file management and generation on AI PCs with AMD Ryzen AI. This project participated in the PC AI track of the AMD Pervasive AI Developer Contest 2023.

Setup

Step 1: Install the NPU driver as described in the Ryzen AI Software documentation. Restart the terminal after installation so that the updated PATH environment variable takes effect.

Step 2: Use conda to set up the environment:

conda env create --file=env.yaml
conda activate cognitive-storage-system

Step 3: Install QLinear:

setup.bat
pip install ops\cpp --force-reinstall

Usage

python frontend.py

Each time you activate the conda environment, run setup.bat first. The cognitive storage system supports three operations:

  • Embedding generation. The user selects a directory, then enters the maximum number of characters per segment (each document is split into segments) and the total number of segments to process. The system then segments every file of a supported type under that folder and computes its embeddings. The supported file types are: pdf, docx, doc, pptx, ppt, xlsx, csv, html, xml, md, txt, png, jpg, bmp, webp. For image files, the image-to-text model first generates a description, and the embedding is computed from that text. The generated index file embeddings.index and the file list file_lists.index are stored in the selected folder.
  • File searching. The user selects a directory for which an index has already been generated, then enters a prompt and the maximum number of results. The system searches for matching files based on the semantics of the prompt.
  • Slide generation. For the files returned by a search, the user enters the maximum number of characters per segment and the total number of segments to process. The system automatically reads and segments these files, then summarizes their contents into Markdown documents and slides in pptx format.
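
The segmentation and search steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names segment_text and search are hypothetical, the segment limit is assumed to apply per document, and plain Python lists with a brute-force cosine similarity stand in for the BGE embeddings and the embeddings.index file.

```python
import math


def segment_text(text, max_chars, max_segments):
    """Split text into at most max_segments chunks of at most max_chars characters.

    Assumption: fixed-width character chunks; the real system may split differently.
    """
    if max_chars <= 0 or max_segments <= 0:
        raise ValueError("max_chars and max_segments must be positive")
    segments = []
    for start in range(0, len(text), max_chars):
        if len(segments) == max_segments:
            break  # stop once the segment budget is used up
        segments.append(text[start:start + max_chars])
    return segments


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, indexed, max_results):
    """Rank files by their best-matching segment.

    indexed is a list of (file_name, segment_vector) pairs; a file may appear
    once per segment. Returns up to max_results distinct file names.
    """
    best = {}
    for name, vec in indexed:
        score = cosine_similarity(query_vec, vec)
        if name not in best or score > best[name]:
            best[name] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:max_results]]


if __name__ == "__main__":
    print(segment_text("abcdefghij", max_chars=4, max_segments=2))
    toy_index = [("a.txt", [0.0, 1.0]), ("b.png", [1.0, 0.1]), ("a.txt", [0.5, 0.5])]
    print(search([1.0, 0.0], toy_index, max_results=2))
```

Scoring each file by its best segment is one possible design choice; it keeps a long document from being penalized when only one part of it matches the prompt.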
