sentiment-analysis's Introduction

Plan of Action

[ ] Lab work

Complete module labs up to week 4 (or 5)

AuthorProfiling
AnalogySolver (optional)
WordNetPractice (optional)

[x] Feature Generation

Generate at least three feature sets using paths through the following set of choices:

Tokenization (must)
Lemmatization OR Stemming
Lowercase (or not)
Remove stopwords (or not)
Remove punctuation (or not)
N-gram generation (pick 1, 2, 3, ..., n)
Normalisation methods
- Frequency Normalisation
- Tf-Idf
- PPMI
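One path through the choices above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the module's reference pipeline: the `STOPWORDS` set, the function names, and the two example sentences are all hypothetical, lemmatization and stemming are left out, and the tf-idf and PPMI formulas shown are common variants rather than the only possible ones.

```python
import math
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of"}
PUNCTUATION = set(string.punctuation)


def tokenize(text, lowercase=True, remove_stopwords=False, remove_punctuation=True):
    """Split text into word and punctuation tokens, then apply the optional filters."""
    if lowercase:
        text = text.lower()
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if remove_punctuation:
        tokens = [t for t in tokens if t not in PUNCTUATION]
    if remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return tokens


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined with single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_normalise(counts):
    """Relative frequency: each count divided by the document's total count."""
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()} if total else {}


def tf_idf(doc_counts):
    """Tf-idf with idf = log(N / df), one common variant among several."""
    n_docs = len(doc_counts)
    df = Counter(term for counts in doc_counts for term in counts)
    return [
        {term: c * math.log(n_docs / df[term]) for term, c in counts.items()}
        for counts in doc_counts
    ]


def ppmi(doc_counts):
    """Positive PMI between documents and terms: max(0, log(p(d,t) / (p(d) p(t))))."""
    total = sum(sum(counts.values()) for counts in doc_counts)
    term_totals = Counter()
    for counts in doc_counts:
        term_totals.update(counts)
    weighted = []
    for counts in doc_counts:
        doc_total = sum(counts.values())
        weighted.append({
            term: max(0.0, math.log(c * total / (doc_total * term_totals[term])))
            for term, c in counts.items()
        })
    return weighted


# Toy documents, invented purely for illustration.
docs = ["This film was great, great fun!", "This film was dull."]
doc_counts = [Counter(ngrams(tokenize(d, remove_stopwords=True), 1)) for d in docs]
```

Changing the keyword arguments to `tokenize`, the `n` passed to `ngrams`, and the weighting function gives a different feature set each time, which is one way to produce the three sets the task asks for.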

[ ] Naive Bayes

Implement Naive Bayes from scratch
Evaluate it on each of the three feature sets you've extracted (using the development splits)
Pick the best of the three feature sets you've experimented with and present the test split results
Evaluate the scikit-learn implementation of Naive Bayes on the same three feature sets and compare the results to your implementation
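As a starting point for the from-scratch implementation, the sketch below shows a multinomial Naive Bayes classifier with add-alpha (Laplace) smoothing, working in log space to avoid underflow. The class name, the choice of the multinomial variant, and the toy training data are assumptions made for illustration; the task itself does not prescribe them.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        # docs: list of token lists; labels: parallel list of class labels.
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        self.n_docs = len(labels)
        return self

    def log_posteriors(self, tokens):
        """Unnormalised log P(class) + sum of log P(token | class) per class."""
        scores = {}
        vocab_size = len(self.vocab)
        for label, n_label in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_label / self.n_docs)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[t] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_posteriors(tokens)
        return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
train_docs = [["great", "fun"], ["great", "film"], ["dull", "film"], ["dull", "boring"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = MultinomialNaiveBayes().fit(train_docs, train_labels)
```

Calling `model.predict(["great"])` on this toy data returns `"pos"`. Keeping the same token-list interface for all three feature sets makes the later comparison against scikit-learn easier to set up.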

[ ] SGD-based classification and SVMs

For this task, you can use scikit-learn's implementations of the following models:

Logistic Regression

Train a Logistic Regression classifier
Evaluate each model on each of the three feature sets (using the respective development splits)
Pick the best of the three feature sets you've experimented with and present the test split results

SVM

Train an SVM classifier
Evaluate each model on each of the three feature sets (using the respective development splits)
Pick the best of the three feature sets you've experimented with and present the test split results
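The evaluate-then-select protocol used for both models can be sketched independently of any classifier. The helper names, the use of macro-averaged F1 as the selection metric, the feature-set names, and the toy predictions below are all hypothetical; they only illustrate choosing a feature set on the development split before touching the test split.

```python
def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


def select_on_dev(dev_scores):
    """Return the feature-set name with the highest development score."""
    return max(dev_scores, key=dev_scores.get)


# Toy gold labels and predictions, invented purely for illustration.
dev_gold = ["pos", "pos", "neg", "neg"]
dev_predictions = {
    "unigram_tfidf": ["pos", "pos", "neg", "pos"],
    "bigram_freq": ["pos", "neg", "neg", "pos"],
    "unigram_ppmi": ["pos", "pos", "neg", "neg"],
}
dev_scores = {name: macro_f1(dev_gold, preds) for name, preds in dev_predictions.items()}
best_feature_set = select_on_dev(dev_scores)
```

Only the feature set returned by `select_on_dev` would then be scored once on the test split, which keeps the reported test result free of selection bias.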

Subsequently, you should perform hyperparameter optimisation on your best performing model/feature set.

Try at least 5 different combinations of hyperparameters using the development split
- Consider more systematic hyperparameter optimisation techniques, such as grid search or k-fold cross-validation.
Evaluate the best hyperparameters using the test split
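The search described above can be sketched without committing to a particular library. In this illustration the helper names are hypothetical, the grid keys `C` and `penalty` are just examples of hyperparameters one might tune, and `toy_dev_score` is a made-up stand-in for "train the model with these settings and score it on the development split".

```python
import itertools
import math


def parameter_grid(grid):
    """Yield every combination from a dict of name -> list of candidate values."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def k_fold_indices(n_items, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, validation
        start += size


def grid_search(grid, score_fn):
    """Return (best_params, best_score) for a caller-supplied scoring function."""
    best_params, best_score = None, float("-inf")
    for params in parameter_grid(grid):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_dev_score(params):
    # Stand-in scorer (an assumption, not a real model): peaks at C=1.0 with "l2".
    bonus = 0.1 if params["penalty"] == "l2" else 0.0
    return -abs(math.log10(params["C"])) + bonus


grid = {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"]}  # 6 combinations
best_params, best_score = grid_search(grid, toy_dev_score)
folds = list(k_fold_indices(10, 3))
```

In a real run, `score_fn` would train on the training split (or on the training indices of each fold from `k_fold_indices`, averaging the fold scores) and return the development metric. The test split is used once, for the winning combination only.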

Note that you should use one-hot encoded vectors built from your features as model input.
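A minimal sketch of that encoding follows, assuming it means a binary vector with one column per feature in the training vocabulary. That reading, the function names, and the toy data are assumptions made for illustration.

```python
def build_vocabulary(feature_lists):
    """Map each distinct training feature to a column index, in sorted order."""
    vocab = sorted({f for features in feature_lists for f in features})
    return {feature: i for i, feature in enumerate(vocab)}


def one_hot(features, vocabulary):
    """Binary vector with a 1 for every vocabulary feature present in the document."""
    vector = [0] * len(vocabulary)
    for f in features:
        index = vocabulary.get(f)
        if index is not None:  # features unseen in training are dropped
            vector[index] = 1
    return vector


# Toy feature lists, invented purely for illustration.
train_features = [["great", "fun"], ["dull", "film"]]
vocabulary = build_vocabulary(train_features)
encoded = one_hot(["great", "film", "unseen"], vocabulary)
```

The vocabulary is built from the training split only, so development and test documents are encoded against the same fixed set of columns.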

[x] BERT

Follow the given tutorials and train BERT to perform sentiment analysis over the same train/dev/test splits of the feature sets.
Experiment with both the cased and uncased versions of BERT

[ ] ChatGPT

TODO: Come up with a plan.

wisaacj / sentiment-analysis
