AIDB

Analyze unstructured data blazingly fast with machine learning. Connect your own ML models to your own data sources and query away!

Quick Start

To start using AIDB, install the requirements, specify a configuration, and query! Setting up the environment is as simple as:

git clone https://github.com/ddkang/aidb.git
cd aidb
pip install -r requirements.txt

# Optional if you'd like to run the examples below
gdown https://drive.google.com/uc?id=1SyHRaJNvVa7V08mw-4_Vqj7tCynRRA3x
unzip data.zip -d tests/

Text Example (in CSV)

We've set up an example of analyzing product reviews with HuggingFace. Set your HuggingFace API key. After this, all you need to do is run

python launch.py --config=config.sentiment --setup-blob-table --setup-output-table

As an example query, you can run

SELECT AVG(score)
FROM sentiment
WHERE label = '5 stars'
ERROR_TARGET 10%
CONFIDENCE 95%;

You can see the mappings here. We use the HuggingFace API to generate sentiments from the reviews.

Image Example (local directory)

We've also set up an example that detects whether user-generated content is adult content, so that it can be filtered. To run this example, all you need to do is run

python launch.py --config=config.nsfw_detect --setup-blob-table --setup-output-table

As an example query, you can run

SELECT *
FROM nsfw
WHERE racy LIKE 'POSSIBLE';

You can see the mappings here. We use the Google Vision API to generate the safety labels.

Key Features

AIDB focuses on keeping cost down and interoperability high.

We reduce costs with our optimizations:

  • First-class support for approximate queries, reducing the cost of aggregations by up to 350x.
  • Caching, which speeds up multiple queries over the same data.
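To illustrate why caching helps when several queries touch the same data, here is a minimal sketch of memoizing ML model outputs per input. It is not AIDB's implementation: `CachedModel`, `fake_sentiment`, and the blob IDs are hypothetical names used only for illustration.

```python
from typing import Callable, Dict


class CachedModel:
    """Wraps an expensive inference function and memoizes results per input key."""

    def __init__(self, infer: Callable[[str], dict]):
        self._infer = infer
        self._cache: Dict[str, dict] = {}
        self.calls = 0  # number of real (uncached) model invocations

    def __call__(self, blob_id: str) -> dict:
        if blob_id not in self._cache:
            self.calls += 1
            self._cache[blob_id] = self._infer(blob_id)
        return self._cache[blob_id]


def fake_sentiment(blob_id: str) -> dict:
    # Stand-in for a paid ML API call; hypothetical, not an AIDB function.
    return {"blob_id": blob_id, "label": "5 stars", "score": 0.9}


model = CachedModel(fake_sentiment)
first = [model(b) for b in ["r1", "r2", "r3"]]   # three real model calls
second = [model(b) for b in ["r1", "r2", "r3"]]  # served entirely from the cache
print(model.calls)
```

The second pass over the same inputs triggers no additional model calls, which is where the savings come from when the model is slow or billed per request.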

We keep interoperability high by allowing you to bring your own data source, ML models, and vector databases!

Approximate Querying

One key feature of AIDB is first-class support for approximate queries. Currently, we support approximate AVG, COUNT, and SUM. We don't currently support GROUP BY or JOIN for approximate aggregations, but they are on our roadmap. Please reach out if you'd like us to support your queries!

In order to execute an approximate aggregation query, simply append ERROR_TARGET <error percent>% CONFIDENCE <confidence>% to your normal aggregation. As a full example, you can compute an approximate count by doing:

SELECT COUNT(xmin)
FROM objects
ERROR_TARGET 5%
CONFIDENCE 95%;
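If you build queries programmatically, appending the two clauses is plain string manipulation. The helper below is a hypothetical convenience function, not part of AIDB; the range checks (both values strictly between 0 and 100) are an assumption about what values are sensible.

```python
def approximate(query: str, error_target: float, confidence: float) -> str:
    """Append ERROR_TARGET / CONFIDENCE clauses to an aggregation query.

    Both arguments are percentages, e.g. 5 means 5%.
    """
    if not 0 < error_target < 100:
        raise ValueError("error_target must be a percentage between 0 and 100")
    if not 0 < confidence < 100:
        raise ValueError("confidence must be a percentage between 0 and 100")
    # Drop any trailing semicolon so the clauses end up inside the statement.
    base = query.strip().rstrip(";").rstrip()
    return f"{base}\nERROR_TARGET {error_target:g}%\nCONFIDENCE {confidence:g}%;"


sql = approximate("SELECT COUNT(xmin)\nFROM objects;", error_target=5, confidence=95)
print(sql)
```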

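ERROR_TARGET specifies the maximum percent error compared to running the query exactly, and CONFIDENCE specifies how often that bound holds. For example, with ERROR_TARGET 5% and CONFIDENCE 95%, if the true answer is 100, you will get an answer between 95 and 105 about 95% of the time.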
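To build intuition for how an error target and a confidence level interact, here is a rough sketch of a sampling-based approximate average. It uses a textbook normal-approximation confidence interval and is not AIDB's actual algorithm; `approximate_avg`, the batch size, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
import statistics


def approximate_avg(values, error_target, confidence, batch=200, seed=0):
    """Grow a random sample until the confidence-interval half-width is
    within error_target (a fraction, e.g. 0.05) of the running estimate.

    Returns (estimate, number_of_values_inspected).
    """
    if not values:
        raise ValueError("values must be non-empty")
    # Two-sided z-score for the requested confidence (about 1.96 for 0.95).
    z = statistics.NormalDist().inv_cdf((1 + confidence) / 2)
    order = list(values)
    random.Random(seed).shuffle(order)  # random order = sampling without replacement
    n = batch
    while True:
        n = min(n, len(order))
        sample = order[:n]
        mean = statistics.fmean(sample)
        if n == len(order):
            return mean, n  # inspected everything, so the answer is exact
        half_width = z * statistics.stdev(sample) / math.sqrt(n)
        if half_width <= error_target * abs(mean):
            return mean, n
        n += batch


# Synthetic data standing in for per-row model outputs.
rng = random.Random(1)
data = [rng.gauss(100, 20) for _ in range(100_000)]

estimate, inspected = approximate_avg(data, error_target=0.05, confidence=0.95)
exact = statistics.fmean(data)
print(f"estimate={estimate:.2f} exact={exact:.2f} inspected={inspected} of {len(data)}")
```

The point of the sketch is the trade-off: a looser error target or lower confidence lets the loop stop after inspecting fewer rows, which means fewer model invocations.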

Contribute

We have many improvements we'd like to implement. Please help us! For the time being, please email us if you'd like to contribute.

Contact Us

Need help setting up AIDB for your specific dataset, or want a new feature? Please fill out this form.
