vidhi1290 / essay-quality-prediction-keystroke-analysis-with-randomforest

Explore the Essay Quality Prediction project, a machine learning model that predicts essay quality based on typing behaviors. Leveraging a Random Forest Regressor, this tool provides insights into writing processes. Connect with me on LinkedIn and find more projects on GitHub. Happy coding! 📝✨

Jupyter Notebook 100.00%
kaggle-competition machine-learning mean-square-error nlp-machine-learning numpy pandas random-forest random-forest-regressor scikitlearn-machine-learning seaborn

Essay Quality Prediction with Keystroke Analysis 📝💻

Welcome to the Essay Quality Prediction project, where we delve into the fascinating realm of typing behavior to predict the quality of essays! 🚀

Introduction

This repository houses a machine learning model that harnesses the power of keystroke analysis to predict the quality of essays. By extracting insights from users' typing behaviors during the writing process, our model aims to provide a unique perspective on the art of essay composition.

Dataset 📊

The dataset comprises approximately 5000 logs of user inputs, including keystrokes and mouse clicks, recorded during the creation of essays. Each essay is scored on a scale of 0 to 6. The challenge is to predict the score an essay received based on its log of user inputs.

Features 🧐

Our model leverages a variety of typing behavior features, including:

  • Total number of activities
  • Total and average action time
  • Maximum word count
  • Number of unique text changes
  • Average cursor position
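As a rough illustration, the features listed above could be computed with a single pandas `groupby` over the raw event log. This is only a sketch: the column names (`id`, `event_id`, `action_time`, `word_count`, `text_change`, `cursor_position`) and the helper `build_features` are assumptions made for this example, not names confirmed by the repository, and the log is synthetic.

```python
import pandas as pd


def build_features(logs: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-event keystroke logs into one feature row per essay.

    All column names used here are assumed for illustration.
    """
    features = logs.groupby("id").agg(
        total_activities=("event_id", "count"),        # number of logged events
        total_action_time=("action_time", "sum"),      # summed duration of actions
        avg_action_time=("action_time", "mean"),       # mean duration per action
        max_word_count=("word_count", "max"),          # largest word count reached
        unique_text_changes=("text_change", "nunique"),
        avg_cursor_position=("cursor_position", "mean"),
    )
    return features.reset_index()


# Tiny synthetic log with two essays, for illustration only.
logs = pd.DataFrame(
    {
        "id": ["a", "a", "a", "b", "b"],
        "event_id": [1, 2, 3, 1, 2],
        "action_time": [100, 200, 300, 50, 150],
        "word_count": [1, 2, 2, 1, 1],
        "text_change": ["q", "q", "NoChange", "x", "y"],
        "cursor_position": [1, 2, 3, 1, 2],
    }
)
features = build_features(logs)
print(features)
```

Named aggregation keeps each output column tied to the feature it represents, so the resulting table can be merged with the scores on `id` without renaming.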

Model Architecture 🤖

The model is constructed using a Random Forest Regressor with 100 estimators. We chose this ensemble learning technique for its robustness and ability to handle complex relationships within the data.
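A minimal sketch of such a model with scikit-learn is shown below. Only `n_estimators=100` comes from the description above; the `random_state` value and the synthetic data are assumptions added so the example is reproducible and self-contained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for six typing-behavior features (illustration only).
rng = np.random.default_rng(0)
X = rng.random((200, 6))
y = 6 * X[:, 0] + rng.normal(0, 0.1, 200)

# 100 trees, as stated above; random_state is an assumption for reproducibility.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

preds = model.predict(X[:5])
print(preds)
```

Each tree is fit on a bootstrap sample of the rows, and the forest's prediction is the average of the individual tree predictions.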

Acknowledgments 🙌

We extend our gratitude to Vanderbilt University, the competition host, and The Learning Agency Lab, the independent nonprofit based in Arizona. Their support has been instrumental in fostering cross-disciplinary research with global impact.

Usage 🚀

  1. Data Preprocessing and Feature Engineering: Load the training data, merge logs and scores, and engineer typing behavior features.
  2. Model Training: Utilize a Random Forest Regressor to train the model on the prepared dataset.
  3. Evaluation: Assess the model's performance using Mean Squared Error on the validation set.
  4. Prediction: Deploy the trained model to make predictions on the test set and create a submission file.
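The four steps above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the notebook's actual code: the data is synthetic, and the column names, the 80/20 split, the `random_state` values, and the `id`/`score` submission layout are all assumed for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 120

# Stand-in for engineered per-essay features; values are synthetic.
features = pd.DataFrame(
    {
        "id": [f"essay_{i}" for i in range(n)],
        "total_activities": rng.integers(500, 5000, n),
        "max_word_count": rng.integers(100, 800, n),
    }
)
# Synthetic scores on the 0 to 6 scale.
scores = pd.DataFrame(
    {
        "id": features["id"],
        "score": (features["max_word_count"] / 130 + rng.normal(0, 0.3, n)).clip(0, 6),
    }
)

# Step 1: merge features with scores on the essay id.
train = features.merge(scores, on="id", how="inner")
X = train.drop(columns=["id", "score"])
y = train["score"]

# Step 2: hold out a validation set and train the forest.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Step 3: evaluate with Mean Squared Error on the validation set.
val_mse = mean_squared_error(y_val, model.predict(X_val))
print(f"Validation MSE: {val_mse:.3f}")

# Step 4: predict for unseen essays and assemble a submission table.
test_features = pd.DataFrame(
    {"total_activities": [1200, 3400, 800], "max_word_count": [250, 600, 120]}
)
submission = pd.DataFrame(
    {"id": ["test_0", "test_1", "test_2"], "score": model.predict(test_features)}
)
# submission.to_csv("submission.csv", index=False)  # file name is an assumption
print(submission)
```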

Visualizations 📈

Explore the relationships between typing behavior features and essay scores through captivating visualizations:

  • Scatter plots depicting the influence of activities, action time, word count, text changes, and cursor position on essay scores.
  • A residual plot offering insights into prediction errors.
  • A feature importance plot showcasing the significance of different features.
  • A histogram illustrating the distribution of predicted scores.
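The quantities behind the residual, feature importance, and histogram plots can be computed as in the sketch below. It uses plain matplotlib rather than seaborn, runs on synthetic data, and the helper `plot_diagnostics` and the feature names are hypothetical; the notebook's own plotting code may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data; the third feature drives the target (illustration only).
feature_names = ["total_activities", "total_action_time", "max_word_count"]
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = 6 * X[:, 2] + rng.normal(0, 0.2, 200)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X[:150], y[:150])

preds = model.predict(X[150:])
residuals = y[150:] - preds                  # actual minus predicted
importances = model.feature_importances_     # impurity-based, sums to 1


def plot_diagnostics(preds, residuals, importances, names):
    """Draw residual, feature importance, and prediction histogram panels."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, (ax_res, ax_imp, ax_hist) = plt.subplots(1, 3, figsize=(15, 4))
    ax_res.scatter(preds, residuals, alpha=0.6)
    ax_res.axhline(0, color="gray", linestyle="--")
    ax_res.set(xlabel="Predicted score", ylabel="Residual", title="Residual plot")
    ax_imp.barh(names, importances)
    ax_imp.set(xlabel="Importance", title="Feature importance")
    ax_hist.hist(preds, bins=20)
    ax_hist.set(xlabel="Predicted score", ylabel="Count", title="Predicted scores")
    fig.tight_layout()
    return fig
```

Calling `plot_diagnostics(preds, residuals, importances, feature_names)` returns a figure with the three panels; residuals scattered evenly around zero suggest the model is not systematically over- or under-predicting.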

Dependencies 🛠️

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scikit-learn
  • xgboost

Results 📊

  • Mean Squared Error on Validation Set: 0.42
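To put this number on the same scale as the essay scores, it can help to take its square root. In the formula below, the symbols are notation introduced here for illustration: y_i is the actual score of validation essay i, ŷ_i is the predicted score, and n is the number of validation essays (not stated in this README).

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{0.42} \approx 0.65
```

An RMSE of roughly 0.65 means a typical prediction is off by about two thirds of a point on the 0 to 6 scale.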

Connect with Me 🌐

Let's connect and collaborate! Feel free to reach out to me on LinkedIn or GitHub.

I'm always open to discussions, collaborations, and learning new things together. Don't hesitate to drop me a message or explore my other projects on GitHub. Happy coding! 🚀

Feel free to dive into the code, experiment with the features, and explore the nuances of writing quality prediction through keystroke analysis! 🕵️‍♂️💬
