pingcap / tidb.ai

This is a GraphRAG (knowledge-graph-based), out-of-the-box conversational search tool that uses the vector storage capabilities of TiDB Serverless. It lets you embed a question-answering (QA) bot directly in your website by copying and pasting a JavaScript snippet. Demo: https://tidb.ai

Home Page: https://tidb.ai

License: Apache License 2.0

TypeScript 98.13% JavaScript 0.33% MDX 1.02% CSS 0.34% HTML 0.05% Dockerfile 0.13%
mysql rag serverless vector-database chatbot graphrag knowledge-graph

tidb.ai's Introduction

TiDB.AI

Introduction

A [WIP] conversational search RAG (Retrieval-Augmented Generation) app built on TiDB Serverless Vector Storage. It provides an out-of-the-box, embeddable QA bot that answers from the knowledge on your official and documentation sites.

Live Demo: TiDB.AI

Features

With this tool, you get:

  1. Perplexity-style conversational search page: a built-in website crawler scrapes the sitemap URLs of your official and documentation sites, so the search covers their content comprehensively.

  2. Embeddable JavaScript snippet: add the conversational search window to your website by copying and pasting a short JavaScript snippet. The widget, typically placed at the bottom-right corner of the page, answers product-related questions instantly.
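The sitemap scraping mentioned in the first feature could start from something like the sketch below. It is an illustration only, not the project's crawler: `SitemapEntry`, `parseSitemap`, and `decodeXml` are hypothetical names, and the regex-based parsing assumes a plain `<urlset>` sitemap without CDATA sections or nested sitemap indexes.

```typescript
// Hypothetical types and helpers for illustration; not part of tidb.ai.
interface SitemapEntry {
  url: string;
  lastModified?: string; // raw <lastmod> value, if the sitemap provides one
}

// Decode the five predefined XML entities; "&amp;" is handled last so that
// "&amp;lt;" becomes "&lt;" rather than "<".
function decodeXml(value: string): string {
  return value
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Extract every <url> block and read its <loc> and optional <lastmod>.
function parseSitemap(xml: string): SitemapEntry[] {
  const entries: SitemapEntry[] = [];
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  let match: RegExpExecArray | null;
  while ((match = urlBlock.exec(xml)) !== null) {
    const body = match[1];
    const loc = /<loc>\s*([\s\S]*?)\s*<\/loc>/.exec(body);
    if (!loc) continue; // skip malformed entries without a location
    const entry: SitemapEntry = { url: decodeXml(loc[1]) };
    const lastmod = /<lastmod>\s*([\s\S]*?)\s*<\/lastmod>/.exec(body);
    if (lastmod) entry.lastModified = lastmod[1];
    entries.push(entry);
  }
  return entries;
}
```

A production crawler would more likely use a real XML parser; the regex approach is chosen here only to keep the sketch dependency-free.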

Quick Start [Work in Progress]

To deploy the application in a self-hosted environment, run the following command:

curl https://tidb.cloud/install.sh | sh

then:

ticloud create-app --template rag

Deployment [TODO]

For deploying the application to production, there are many options available:

Tech Stack

License

TiDB.AI is open-source under the Apache License, Version 2.0. You can find it here.

tidb.ai's Issues

milestone 1: eating our own dog food

Background

As we develop Vector Search in TiDB Serverless, we'll build an app to test how easy it is to use. This app will provide answers to TiDB usage questions on our official websites using our in-progress vector storage in TiDB Serverless.

Todo list

  • core rag logic
    • data source management
      • upload pdf/markdown/csv etc.
      • crawl sitemap.xml of a domain
    • ui/ux
      • conversational search
        • chat history mgmt after login
      • embeddable js
  • system settings
    • overview: statistics of chats & docs
    • basic info settings
      • logo
      • site name
      • search title & subtitle
      • example questions (up to 4)
      • footer links
      • GitHub / Discord etc. social media links top-right
    • oauth configurations
      • GitHub
      • Gmail
    • (pre|post) prompt settings
    • rag loader & splitter configurations: chunk size, overlap etc.
  • Use llamaindex-ts as RAG engine
  • add widget on the bottom-right of (www|ask).pingcap.com to answer questions about tidb usage / use cases
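The splitter settings in the list above (chunk size and overlap) can be illustrated with a minimal sketch. This is not the project's splitter, and it does not use the llamaindex-ts API; `splitWithOverlap` is a hypothetical helper that chunks by character count, whereas real splitters usually work on tokens or sentences.

```typescript
// Hypothetical helper for illustration; not the llamaindex-ts splitter.
// Splits text into chunks of at most `chunkSize` characters, where each chunk
// repeats the last `overlap` characters of the previous one.
function splitWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in the range [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text,
    // so no trailing chunk is made up purely of overlap.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Usage: splitWithOverlap("abcdefghij", 4, 2) yields "abcd", "cdef", "efgh", "ghij".
```

A larger overlap keeps more context across chunk boundaries at the cost of more chunks to embed and store.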

Support cost statistics

Calculate the cost of RAG in calling various third-party services, like:

  • Serverless Function (Vercel)
  • Storage
  • Embedding
  • Rerank
  • LLM Completion
  • Database Query (TiDB Serverless)
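One way to aggregate these costs is sketched below. The `Service` names mirror the list above, but `UsageRecord`, `summarizeCost`, and the pricing fields are assumptions made for illustration; no real provider prices are implied. Prices are kept as integer micro-USD so that sums stay exact.

```typescript
// Hypothetical types for illustration; field names and units are assumptions.
type Service =
  | "serverless_function"
  | "storage"
  | "embedding"
  | "rerank"
  | "llm_completion"
  | "database_query";

interface UsageRecord {
  service: Service;
  units: number; // e.g. tokens, requests, or GB-hours, depending on the service
  microUsdPerUnit: number; // price per unit in millionths of a US dollar
}

interface CostSummary {
  byService: Partial<Record<Service, number>>; // micro-USD per service
  totalMicroUsd: number;
}

// Sum the cost of all usage records, overall and grouped by service.
function summarizeCost(records: UsageRecord[]): CostSummary {
  const byService: Partial<Record<Service, number>> = {};
  let totalMicroUsd = 0;
  for (const record of records) {
    const cost = record.units * record.microUsdPerUnit;
    byService[record.service] = (byService[record.service] ?? 0) + cost;
    totalMicroUsd += cost;
  }
  return { byService, totalMicroUsd };
}
```

Recording one `UsageRecord` per third-party call would let the same function report cost per chat, per day, or per document import, depending on which records are passed in.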

Support Bitdeer as provider

Description

Bitdeer is a cloud computing platform that provides computing power for cryptocurrency mining.

Bitdeer now also provides API services for multiple managed AI models (for example, llama2 and mistral).

Tasks

Wrong citation

[^2]: [TiDB Official Documentation](https://tidb.net/blog/ce4e4dd6)

[draft] milestone 2: private beta

Background

For our next milestone, we'll focus on making our app easier to deploy and use:

Product Refinement:

  • Dedicate efforts to polishing the product, ensuring it meets user expectations in usability.

Deployment:

  • Add support for deploying on Vercel (and maybe Cloudflare).
  • Make it possible to deploy locally using tools like npx for quick testing.

Content Updates:

  • Create a welcoming landing page for new visitors.
  • Write usage docs/api docs to help users get started and integrate.

Todo list

  • #80
  • product
    • api & api token mgmt
    • more LLM support (may require re-splitting, re-indexing, and re-embedding the docs content)
    • support more data sources: Word etc.
    • role-based access control & mgmt (user, admin)
  • deployment
    • manual deployment (with npx etc.)
    • docker
    • Vercel
    • Cloudflare(optional)
    • fly.io(optional)
    • LLM/token usage statistics
  • security
    • cross-domain white list
  • content
    • landing page
      • /
      • /showcases
      • /docs/get-started
      • /docs/api
    • README.md
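The cross-domain white list under the security item could be checked along the lines of this sketch. It is an assumption about how such a check might look, not the project's implementation: `isOriginAllowed` is a hypothetical helper, and the `*.` wildcard syntax is invented for the example.

```typescript
// Hypothetical helper for illustration; the "*." wildcard syntax is an assumption.
// `origin` is the value of the request's Origin header, e.g. "https://ask.pingcap.com".
function isOriginAllowed(origin: string, allowlist: string[]): boolean {
  // An Origin header is scheme://host[:port]; anything else is rejected.
  // Bracketed IPv6 hosts are not handled by this sketch and are rejected too.
  const match = /^https?:\/\/([^\/:]+)(?::\d+)?$/i.exec(origin);
  if (!match) return false;
  const host = match[1].toLowerCase();

  return allowlist.some((entry) => {
    const pattern = entry.toLowerCase();
    if (pattern.startsWith("*.")) {
      // "*.example.com" matches any subdomain but not "example.com" itself,
      // and not look-alikes such as "evilexample.com".
      return host.endsWith(pattern.slice(1));
    }
    return host === pattern;
  });
}
```

Matching on the parsed host rather than on a substring of the header avoids accepting origins like `https://pingcap.com.attacker.example`.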

Fix /home style on small screen

For the title and subtitle, we can make the width shorter and wrap the text.

For docs in footer, we can make it one line per document, like this:

Line1: Docs
Line2: Another Link
Line3: Linkeeee4

[Draft] Implement an incremental crawler

src/core/interface.ts

export namespace rag {
  export interface Content<ContentMetadata> {
    content: string[];
    digest: string;
+   lastModifiedAt: Date;
    metadata: ContentMetadata;
  }

  export type ImportSourceTaskResult = {
    enqueue?: Array<{ type: string, url: string }>
    content?: {
      buffer: Buffer
      mime: string
    }
+   incrementalState?: unknown
  }

  export abstract class ImportSourceTaskProcessor<Options> extends Base<Options> {
    abstract support (taskType: string, url: string): boolean;
    abstract process (task: { url: string }): Promise<ImportSourceTaskResult>

+   abstract supportIncremental (taskType: string, url: string): boolean;
+   abstract processIncremental (previousState: unknown): Promise<ImportSourceTaskResult>
  }
}

src/core/db/importSource.ts

export interface ImportSource {
  created_at: Date;
  filter: string | null;
  filter_runtime: string | null;
  id: string;
  type: string;
  url: string;
+ incremental_state: JSON | null;
+ last_scheduled_at: Date | null;
}
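To make the proposal above more concrete, here is a sketch of how a stored `incremental_state` might be used to decide which pages need re-importing. The draft leaves the state typed as `unknown`, so the `CrawlState` shape, `FetchedPage`, and `planIncremental` below are all hypothetical; the only idea taken from the draft is comparing each page's `digest` against what was seen on the previous run.

```typescript
// Hypothetical state shape; the draft above deliberately leaves it opaque.
// A plain object is used so it can be stored in a JSON column.
interface CrawlState {
  digests: Record<string, string>; // url -> content digest from the last run
}

interface FetchedPage {
  url: string;
  digest: string;
}

interface IncrementalPlan {
  changed: string[]; // new or modified pages that need re-indexing
  unchanged: string[]; // pages whose digest matches the previous run
  removed: string[]; // pages seen last time but missing now
  nextState: CrawlState; // state to persist for the next run
}

// Compare the current crawl against the previous state.
function planIncremental(previous: CrawlState | null, pages: FetchedPage[]): IncrementalPlan {
  const before: Record<string, string> = previous ? previous.digests : {};
  const nextDigests: Record<string, string> = {};
  const changed: string[] = [];
  const unchanged: string[] = [];

  for (const page of pages) {
    nextDigests[page.url] = page.digest;
    if (before[page.url] === page.digest) {
      unchanged.push(page.url);
    } else {
      changed.push(page.url);
    }
  }

  const removed = Object.keys(before).filter((url) => !(url in nextDigests));
  return { changed, unchanged, removed, nextState: { digests: nextDigests } };
}
```

On the first run `previous` is `null`, so every page is reported as changed and the returned `nextState` becomes the baseline for later runs.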

[Never Close] Image Storage

We paste images into this GitHub Issue, then retrieve the image URL for use in displaying images in the README.
