habeebmoosa / ai-text-detector

This is an AI-generated text detection model.

Home Page: https://aitextdetector.vercel.app

License: MIT License

Languages: Python 7.70%, HTML 12.01%, Jupyter Notebook 80.29%

Topics: ai, ai-generated-text, artificial-intelligence, chatgpt, generative-ai, machine-learning, machine-learning-algorithms, nlp, python, text-classification, ai-generated-text-detection

ai-text-detector's Introduction

AI Generated Text Detection

This project classifies text as either human-generated or AI-generated. It combines a range of natural language processing (NLP) features with machine learning algorithms to do so.

Features

The following features are extracted from the provided dataset:

  • Basic NLP Features:
    • Character count, word count, word density, punctuation count, title-case word count, upper-case word count, noun count, adverb count, verb count, adjective count, pronoun count.
  • Term Frequencies and N-gram:
    • Count vectorizer with 35742 features.
    • Bigram words (5000 features).
    • Trigram words (5000 features).
    • Character bigrams and trigrams (5000 features).
  • Topic Modeling:
    • NeuralLDA with 20 topics.
  • Others:
    • Readability score, Named Entity Recognition (NER) count, text error length, and Lexical Diversity.
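
As a rough illustration of the surface-level features listed above, the following sketch computes a few of them with the Python standard library only. The function name `basic_features`, the dictionary keys, and the exact definitions (word density as characters per word, lexical diversity as a type-token ratio) are assumptions for this example, not taken from the project's notebooks. Part-of-speech counts, NER counts, and readability scores need an NLP library and are left out.

```python
import string


def basic_features(text: str) -> dict:
    """Compute a few surface-level features for one document (illustrative names)."""
    words = text.split()
    char_count = len(text)
    word_count = len(words)
    # Lower-cased tokens with surrounding punctuation stripped, for diversity.
    tokens = [w.strip(string.punctuation).lower() for w in words]
    tokens = [t for t in tokens if t]
    return {
        "char_count": char_count,
        "word_count": word_count,
        # Assumed definition: characters per word; +1 avoids division by zero.
        "word_density": char_count / (word_count + 1),
        "punctuation_count": sum(ch in string.punctuation for ch in text),
        "title_word_count": sum(w.istitle() for w in words),
        "upper_case_count": sum(w.isupper() for w in words),
        # Type-token ratio, one common measure of lexical diversity.
        "lexical_diversity": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


features = basic_features("Hello World! This is a TEST, a small test.")
for name, value in features.items():
    print(f"{name}: {value}")
```

Each document yields one such dictionary, so a dataset becomes a table with one row per text and one column per feature.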

Feature Selection

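After feature extraction, Principal Component Analysis (PCA) with n_components set to 256 reduces the feature matrix to 256 components. Strictly speaking, this is dimensionality reduction rather than feature selection: PCA builds new components from linear combinations of the original features instead of keeping a subset of them.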
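
The sketch below shows how n-gram counts can be reduced with PCA using scikit-learn. It assumes scikit-learn is the library in use, which the README does not state; the toy corpus and variable names are made up for illustration. PCA cannot produce more components than min(n_samples, n_features), so the example uses 3 components instead of the project's 256.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus standing in for the project's dataset (which is not published here).
docs = [
    "the quick brown fox jumps over the lazy dog",
    "a language model wrote this sentence for you",
    "the lazy dog sleeps under the old oak tree",
    "this sentence was written by a careful human",
    "machine learning models classify text by features",
    "the brown fox and the lazy dog are friends",
]

# Word bigrams, capped at a maximum vocabulary size (the project lists 5000).
vectorizer = CountVectorizer(ngram_range=(2, 2), max_features=5000)
counts = vectorizer.fit_transform(docs).toarray()  # dense array for PCA

# The project sets n_components=256, which needs at least 256 documents and
# 256 features. 3 is used here only because the toy corpus has 6 documents.
pca = PCA(n_components=3, random_state=0)
reduced = pca.fit_transform(counts)

print(counts.shape, "->", reduced.shape)
```

On a real dataset the n-gram matrix is large and sparse, so converting it to a dense array may be costly in memory.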

Algorithms

The project trains and tests five algorithms:

  1. Random Forest
  2. Support Vector Machine (SVM)
  3. XGBoost
  4. Gradient Boosting
  5. Logistic Regression
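
A minimal sketch of comparing these classifiers on a held-out split, assuming scikit-learn estimators with near-default hyperparameters (the README does not state the settings used). The data is synthetic and only stands in for the PCA-reduced features. XGBoost ships in the separate `xgboost` package, so it is included only if that package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the reduced feature matrix; not the project's data.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Optional: add XGBoost when the xgboost package is available.
try:
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier(random_state=0)
except ImportError:
    pass

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The ranking on synthetic data says nothing about the project's results; the loop only shows the shape of the comparison.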

Performance

Of the five algorithms tested, Gradient Boosting performed best, giving accurate classifications during the prediction phase.

Flask Application

A simple Flask application demonstrates the AI Generated Text Detection model. Users enter text, and the application classifies it as either human-generated or AI-generated.
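
The following is a sketch of what such an endpoint could look like, not the project's actual application. The `/predict` route, the JSON field `text`, and the `classify` helper are hypothetical; `classify` is a placeholder rule, where a real application would extract features and call the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify(text: str) -> str:
    """Placeholder for the trained model (hypothetical rule, not a detector)."""
    # A real app would build the feature vector and call model.predict here.
    return "AI-generated" if len(text.split()) > 50 else "human-generated"


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or invalid body.
    payload = request.get_json(silent=True)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return jsonify(error="Provide a non-empty 'text' field."), 400
    return jsonify(label=classify(text))
```

Saved as a module, this can be served with Flask's `flask run` command and queried by sending a POST request with a JSON body such as `{"text": "..."}`.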

Usage

To use the project:

  1. Clone the repository from GitHub.
  2. Install the required dependencies.
  3. Run the Flask application.
  4. Enter text to classify it as human-generated or AI-generated.

License

This project is licensed under the MIT License.

ai-text-detector's Issues

data

Hi, can you provide your dataset? I didn't find the data you used.
