sentiment-analysis's Introduction

Plan of Action

[ ] Lab work

Complete module labs up to week 4 (or 5)

  • AuthorProfiling
  • AnalogySolver (opt)
  • WordNetPractice (opt)

[x] Feature Generation

Generate at least three feature sets, each defined by one path through the following choices:

  • Tokenization (must)
  • Lemmatization OR Stemming
  • Lowercase (or not)
  • Remove stopwords (or not)
  • Remove punctuation (or not)
  • N-gram generation (pick 1, 2, 3, ..., n)
  • Normalisation methods
    • Frequency Normalisation
    • Tf-Idf
    • PPMI
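One path through these choices can be sketched in plain Python as below. This is a minimal illustration, not the project's actual pipeline: the function names, the tiny stopword list and the two example reviews are all assumptions made for the example, and lemmatization/stemming is left out because it would need an external library. The tf-idf variant shown (relative frequency times log(N / df)) is only one of several common definitions, and the PPMI function assumes the document plays the role of the context.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; use a fuller list in practice).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "and", "of"}


def tokenize(text, lowercase=True, remove_stopwords=False, remove_punct=True):
    """Split text into word and punctuation tokens, then apply the optional steps."""
    if lowercase:
        text = text.lower()
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if remove_punct:
        tokens = [t for t in tokens if re.fullmatch(r"\w+", t)]
    if remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return tokens


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(features):
    """Frequency normalisation: counts divided by the document's feature total."""
    counts = Counter(features)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}


def tfidf(docs):
    """Tf-idf with tf = relative frequency and idf = log(N / df)."""
    n_docs = len(docs)
    df = Counter(f for doc in docs for f in set(doc))
    return [
        {f: tf * math.log(n_docs / df[f]) for f, tf in relative_frequencies(doc).items()}
        for doc in docs
    ]


def ppmi(docs):
    """PPMI over document-feature counts: max(0, log2(P(d, f) / (P(d) P(f))))."""
    counts = [Counter(doc) for doc in docs]
    total = sum(sum(c.values()) for c in counts)
    feature_totals = Counter()
    for c in counts:
        feature_totals.update(c)
    result = []
    for c in counts:
        doc_total = sum(c.values())
        result.append({
            f: max(0.0, math.log2((n * total) / (doc_total * feature_totals[f])))
            for f, n in c.items()
        })
    return result


# Two made-up reviews, used only to exercise the functions.
reviews = ["The film was great, great fun!", "The film was dull."]
unigrams = [tokenize(r, remove_stopwords=True) for r in reviews]
bigrams = [ngrams(tokenize(r), 2) for r in reviews]
tfidf_weights = tfidf(unigrams)
ppmi_weights = ppmi(unigrams)
```

Each distinct combination of arguments (for example, lowercased unigrams without stopwords versus cased bigrams with stopwords) gives a different feature set, which is how three or more sets can be produced from the same code.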

[ ] Naive Bayes

  • Implement Naive Bayes from scratch
  • Evaluate it on each of the three feature sets you've extracted (using the development splits)
  • Pick the best of the three feature sets you've experimented with and present the test split results
  • Evaluate the scikit-learn implementation of Naive Bayes on the same three feature sets and compare the results to your implementation
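A from-scratch classifier for this task could look roughly like the sketch below. It assumes the multinomial variant of Naive Bayes with add-alpha (Laplace) smoothing, which is a common choice for token-count features but is not specified above. The class name, the `accuracy` helper and the toy documents are hypothetical and stand in for the real train and development splits.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {f for doc in docs for f in doc}
        class_counts = Counter(labels)
        # Log prior: fraction of training documents in each class.
        self.log_prior = {c: math.log(class_counts[c] / len(labels)) for c in self.classes}
        feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            feature_counts[label].update(doc)
        # Smoothed log likelihood of each vocabulary feature given the class.
        self.log_likelihood = {}
        for c in self.classes:
            denom = sum(feature_counts[c].values()) + self.alpha * len(self.vocab)
            self.log_likelihood[c] = {
                f: math.log((feature_counts[c][f] + self.alpha) / denom)
                for f in self.vocab
            }
        return self

    def predict_one(self, doc):
        # Features never seen in training are skipped.
        scores = {
            c: self.log_prior[c]
            + sum(self.log_likelihood[c][f] for f in doc if f in self.vocab)
            for c in self.classes
        }
        return max(scores, key=scores.get)

    def predict(self, docs):
        return [self.predict_one(doc) for doc in docs]


def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy token lists standing in for one extracted feature set.
train_docs = [["great", "fun"], ["great", "film"], ["dull", "film"], ["dull", "boring"]]
train_labels = ["pos", "pos", "neg", "neg"]
dev_docs = [["great", "boring", "great"], ["dull"]]
dev_labels = ["pos", "neg"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
dev_predictions = model.predict(dev_docs)
dev_accuracy = accuracy(dev_labels, dev_predictions)
```

For the comparison step, scikit-learn's `MultinomialNB` also defaults to `alpha=1.0`, so on identical count features the two implementations would be expected to agree closely.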

[ ] SGD-based classification and SVMs

For this task, you can use scikit-learn's implementation of the following models:

Logistic Regression

  • Train a Logistic Regression classifier
  • Evaluate each model on each of the three feature sets (using the respective development splits)
  • Pick the best of the three feature sets you've experimented with and present the test split results

SVM

  • Train an SVM classifier
  • Evaluate each model on each of the three feature sets (using the respective development splits)
  • Pick the best of the three feature sets you've experimented with and present the test split results
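The evaluation loop for both models might be organised along the lines of the sketch below. The three vectorizer configurations, the toy texts and the variable names are assumptions made for illustration; the real feature sets would come from the Feature Generation step. `LinearSVC` is used as one possible SVM; scikit-learn also offers `SVC` and `SGDClassifier`.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.svm import LinearSVC

# Toy data standing in for the real train and development splits.
train_texts = [
    "great fun film", "really great acting", "loved this great film",
    "dull boring film", "really dull plot", "hated this boring film",
]
train_labels = [1, 1, 1, 0, 0, 0]
dev_texts = ["great acting", "boring plot"]
dev_labels = [1, 0]

# Three illustrative feature sets (assumed, not the project's actual choices).
feature_sets = {
    "unigram counts": CountVectorizer(),
    "unigram+bigram counts": CountVectorizer(ngram_range=(1, 2)),
    "unigram tf-idf": TfidfVectorizer(),
}
# Factories so that every run starts from a fresh, unfitted model.
models = {
    "logistic regression": lambda: LogisticRegression(max_iter=1000),
    "linear SVM": lambda: LinearSVC(),
}

results = {}
for fs_name, vectorizer in feature_sets.items():
    X_train = vectorizer.fit_transform(train_texts)  # fit vocabulary on train only
    X_dev = vectorizer.transform(dev_texts)
    for model_name, make_model in models.items():
        clf = make_model().fit(X_train, train_labels)
        results[(fs_name, model_name)] = accuracy_score(dev_labels, clf.predict(X_dev))

# The best (feature set, model) pair on dev is the one to report on the test split.
best = max(results, key=results.get)
```

Fitting each vectorizer on the training split only keeps development and test vocabulary out of the features, so the dev scores stay a fair basis for choosing a feature set.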

Subsequently, you should perform hyperparameter optimisation on your best performing model/feature set.

  • Try at least 5 different combinations of hyperparameters using the development split
    • Consider using more comprehensive hyperparameter optimisation techniques, such as grid search or k-fold cross-validation.
  • Evaluate the best hyperparameters using the test split
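A small grid search over the development split could be sketched as follows. The grid values, the toy texts and the choice of logistic regression as the "best" model are all assumptions for the example; `ParameterGrid` only enumerates the combinations, and the scoring loop is written by hand.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import ParameterGrid

# Toy splits standing in for the real data.
train_texts = [
    "great fun film", "really great acting", "loved this great film",
    "dull boring film", "really dull plot", "hated this boring film",
]
train_labels = [1, 1, 1, 0, 0, 0]
dev_texts = ["great acting", "boring plot"]
dev_labels = [1, 0]
test_texts = ["fun film with great acting", "dull and boring"]
test_labels = [1, 0]


def fit_and_score(params, eval_texts, eval_labels):
    """Train on the train split with the given settings and score on eval."""
    vectorizer = TfidfVectorizer(ngram_range=params["ngram_range"])
    clf = LogisticRegression(C=params["C"], max_iter=1000)
    clf.fit(vectorizer.fit_transform(train_texts), train_labels)
    predictions = clf.predict(vectorizer.transform(eval_texts))
    return accuracy_score(eval_labels, predictions)


# 3 x 2 = 6 combinations, which satisfies the "at least 5" requirement.
grid = ParameterGrid({"C": [0.1, 1.0, 10.0], "ngram_range": [(1, 1), (1, 2)]})
trials = [(fit_and_score(params, dev_texts, dev_labels), params) for params in grid]

# Select on dev, then touch the test split exactly once.
best_dev_accuracy, best_params = max(trials, key=lambda trial: trial[0])
test_accuracy = fit_and_score(best_params, test_texts, test_labels)
```

If k-fold cross-validation is preferred over a fixed development split, scikit-learn's `GridSearchCV` combines the grid enumeration and the fold handling in one estimator.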

Note that the input to these models should be one-hot encodings built from your features.
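One reading of this note is that each document becomes a binary vector with a 1 at the index of every feature it contains. The sketch below follows that interpretation, which is an assumption; the helper names and example documents are hypothetical.

```python
def build_vocabulary(docs):
    """Map each distinct training feature to a fixed column index."""
    return {f: i for i, f in enumerate(sorted({f for doc in docs for f in doc}))}


def one_hot_document(doc, vocab):
    """Binary presence vector: 1 where a vocabulary feature occurs in the document."""
    vector = [0] * len(vocab)
    for feature in doc:
        if feature in vocab:  # features unseen in training are ignored
            vector[vocab[feature]] = 1
    return vector


train_docs = [["great", "film"], ["dull", "film"]]
vocab = build_vocabulary(train_docs)
encoded = one_hot_document(["great", "great", "unseen"], vocab)
```

With scikit-learn, `CountVectorizer(binary=True)` produces the same kind of presence/absence matrix in sparse form.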

[x] BERT

  • Follow the given tutorials and train BERT to perform sentiment analysis over the same train/dev/test splits of the feature sets.
  • Experiment with both the cased and uncased version of BERT

[ ] ChatGPT

TODO: Come up with a plan.

Contributors

wisaacj
