Cassandra AI Accelerator

This project is a Python application that uses a Cassandra schema to generate recommended use cases for Generative AI.

The output is a Markdown-formatted report, which you can find in report_output. It describes the use case(s) your schema supports and suggests GenAI use cases you could add. It then gives you the exact table changes needed to implement the idea you want!

Warning: The best results are with GPT4. I've tested with other models, such as Claude 2 and Llama2. None produce results as good as GPT4. I'll continue to test models for accuracy.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

  • Python 3.6 or higher
  • pip

Depending on your setup, you need one of the following as the source of the schema to analyze.

  • A DataStax Astra instance
  • DataStax Enterprise instance
  • Apache Cassandra® instance
  • Exported schema file
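For the exported-schema option, the file is plain CQL, such as the saved output of cqlsh's DESCRIBE KEYSPACE command. As a minimal sketch (not taken from this project's code), the following lists the tables defined in such a file, which is a quick way to check that the export contains what you expect. The function name `list_tables` and the sample schema text are hypothetical.

```python
import re
from typing import List

# Matches "CREATE TABLE [IF NOT EXISTS] keyspace.table" or "CREATE TABLE table".
CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?([\w.\"]+)",
    re.IGNORECASE,
)


def list_tables(schema_text: str) -> List[str]:
    """Return table names in the order they appear in a CQL schema export."""
    return [match.group(1) for match in CREATE_TABLE.finditer(schema_text)]


if __name__ == "__main__":
    # Hypothetical two-table export, for illustration only.
    sample = """
    CREATE TABLE killrvideo.videos (videoid uuid PRIMARY KEY, name text);
    CREATE TABLE IF NOT EXISTS killrvideo.comments_by_video (
        videoid uuid, commentid timeuuid, comment text,
        PRIMARY KEY (videoid, commentid)
    );
    """
    print(list_tables(sample))
```

The regular expression only looks at CREATE TABLE statements; indexes, types, and materialized views in the export are ignored.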

A variety of LLM APIs are supported.

  • OpenAI
  • GCP Vertex AI
  • AWS Bedrock

Pending

  • Azure GPT
  • Ollama (local)

Installation

  1. Clone the repository:
    git clone https://github.com/pmcfadin/cassandra_ai_accelerator.git
  2. Navigate to the project directory:
    cd cassandra_ai_accelerator
  3. Install the required Python packages:
    pip install -r requirements.txt

Configuration

Before running the application, set up your configuration in the settings.toml file. It ships with defaults, but you can change anything you need.

For sensitive data, rename the example.settings.toml file to .settings.toml and change the settings for your use case.
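The two files can be read as defaults plus a private overlay. The sketch below illustrates that idea under one assumption: values in .settings.toml take precedence over those in settings.toml. This README does not say how app.py actually loads the files, so treat `load_settings` and `deep_merge` as hypothetical helpers, not the project's API.

```python
from pathlib import Path

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Python versions


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which override's values replace base's, table by table."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_settings(directory: str = ".") -> dict:
    """Load settings.toml, then overlay .settings.toml if it exists (assumed precedence)."""
    root = Path(directory)
    with open(root / "settings.toml", "rb") as f:
        settings = tomllib.load(f)
    private = root / ".settings.toml"
    if private.exists():
        with open(private, "rb") as f:
            settings = deep_merge(settings, tomllib.load(f))
    return settings
```

Calling `load_settings()` from the project directory returns one merged dictionary, so a secret such as an API key can live only in .settings.toml while the shared defaults stay in settings.toml.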

Usage

To run the application, execute the app.py script:

python app.py

Todo List

  • Allow for local LLM usage (privacy concerns)
  • Fine-tune an LLM for this specific task (Better answers)
  • Make the use case exploration interactive for more specific use cases
  • Add more types of reports: SAI conversion, schema optimization
  • Analyze data and suggest how data could be vectorized
  • (Stretch) Create sample code for LangChain or LlamaIndex or ...

Sample report output (GPT4)

Evaluation of the CQL schema and Generative AI Opportunities

CQL Schema Overview

The provided schema supports a video sharing and social interaction platform named "KillrVideo." This platform allows users to upload videos, comment on videos, rate videos, and receive video recommendations. The schema accommodates user management, video metadata storage, tagging, playback statistics, and user interaction (comments, ratings) with videos.

Use Cases

  1. Video Content Management: Users can upload videos with descriptions, tags, and preview images.
  2. Social Engagement: Users can comment on videos, rate them, and receive recommendations.
  3. Analytics: The platform tracks video views and ratings for analytical purposes.
  4. Personalization: Users receive recommendations based on their interactions.
  5. Community Features: Videos and comments can be tagged for easier discovery.

Application Synopsis

The application facilitated by this schema seems to be a comprehensive video-sharing platform with a strong emphasis on community engagement and content discoverability.

Generative AI Features and Data Model Enhancements

1. Video Content Summarization

Generative AI can automatically generate summaries for videos based on the video description, comments, and tags.

Data Model Changes:

  • Alter the videos table to include a summary text column.
    ALTER TABLE killrvideo.videos ADD summary text;
  • For storing and searching through summaries by similarity, create a new table with vector support for the summary text.
    CREATE TABLE killrvideo.video_summaries_vs (
      videoid uuid,
      summary text,
      summary_vector VECTOR<FLOAT, 128>,  -- Assuming a 128-dimensional embedding
      PRIMARY KEY (videoid)
    );

2. Advanced Video Recommendations

Improve recommendation systems by incorporating vector-based similarity search on user preferences, video descriptions, and user interactions.

Data Model Changes:

  • Incorporate vector columns in relevant tables such as video_recommendations or create a new dedicated table for vector-based recommendations.
    ALTER TABLE killrvideo.video_recommendations ADD recommendation_vector VECTOR<FLOAT, 128>;
    or, for a dedicated approach,
    CREATE TABLE killrvideo.video_recommendations_vs (
      userid uuid,
      videoid uuid,
      recommendation_vector VECTOR<FLOAT, 128>,
      PRIMARY KEY (userid, videoid)
    )
    WITH CLUSTERING ORDER BY (videoid ASC);

3. Comment Sentiment Analysis

By analyzing the sentiment of the comments, the platform can better understand user engagement and filter or highlight comments based on positivity.

Data Model Changes:

  • Add a sentiment_score column to the comments_by_video and comments_by_user tables.
    ALTER TABLE killrvideo.comments_by_video ADD sentiment_score float;
    ALTER TABLE killrvideo.comments_by_user ADD sentiment_score float;
  • Create a vector table for sentiment analysis on comments.
    CREATE TABLE killrvideo.comments_sentiment_vs (
      commentid timeuuid,
      comment text,
      sentiment_vector VECTOR<FLOAT, 5>,  -- Example dimension for sentiment
      PRIMARY KEY (commentid)
    );

4. Tag Generation for Videos

Automatically generate and suggest tags for new videos based on video descriptions and names using Natural Language Processing (NLP).

Data Model Changes:

  • Suggesting tags with AI doesn't require changes to the existing schema. It requires integrating an AI model that processes video uploads and updates the tags set in the videos table accordingly.
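The flow described here (derive tags from the name and description, then add them to the existing set) can be sketched as follows. This is an illustration only: a simple word-frequency heuristic stands in for the AI model, and the function names, the stopword list, and the assumption that videoid is the primary key of the videos table are all hypothetical. The statement uses the `%s` placeholder style of the Python cassandra-driver, but nothing here connects to a database.

```python
import re
from collections import Counter
from typing import List, Set, Tuple
from uuid import UUID

# Small illustrative stopword list; a real model would not need one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "on", "is", "how"}


def suggest_tags(name: str, description: str, limit: int = 3) -> List[str]:
    """Pick the most frequent non-stopword terms; a placeholder for a real NLP model."""
    words = re.findall(r"[a-z0-9]+", f"{name} {description}".lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Sort by descending count, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


def build_tag_update(videoid: UUID, tags: List[str]) -> Tuple[str, Tuple[Set[str], UUID]]:
    """Return a parameterized CQL statement that adds tags to the existing set column."""
    cql = "UPDATE killrvideo.videos SET tags = tags + %s WHERE videoid = %s"
    return cql, (set(tags), videoid)
```

Because `tags = tags + ...` appends to a CQL set, suggested tags are merged with tags the uploader already supplied instead of replacing them.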

Vector Search Integration

To support AI-driven features like summarization, recommendation, sentiment analysis, and tag generation, vector search capabilities have been introduced into the data model. These capabilities enable similarity-based operations, leveraging the semantic understanding of content.

Example Usage with Generative AI:

  1. Recommendation Enhancements:

    • After generating embeddings for video content and user preferences, use vector search to find the closest matches for personalized recommendations.
  2. Sentiment Analysis on Comments:

    • Use vector search to find comments with similar sentiments, enabling features like filtering comments by positivity or negativity.
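To make "closest matches" concrete, here is a minimal brute-force sketch of what a similarity search computes. It assumes cosine similarity as the metric and uses made-up 3-dimensional vectors; in practice the embeddings would come from a model and the database would answer the query with an index, not a Python loop. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k candidates most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical embeddings keyed by video title.
    videos = {"cats": [1.0, 0.0, 0.0], "dogs": [0.9, 0.1, 0.0], "cooking": [0.0, 0.0, 1.0]}
    for title, score in top_matches([1.0, 0.0, 0.0], videos):
        print(title, round(score, 3))
```

The same ranking applies to both examples above: the query vector is a user-preference embedding for recommendations, or a comment embedding when looking for comments with similar sentiment.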

Conclusion

Enhancing the KillrVideo schema with Generative AI and vector search capabilities can significantly improve user experience through personalized content, better engagement through sentiment analysis, and efficient management of content through summarization and tagging. The vector search functionality in Cassandra adds a powerful tool for leveraging semantic similarities within the data, paving the way for advanced AI-driven features in applications.

Contributors

pmcfadin, epinzur
