nlp-intro's Introduction

Intro to NLP (pre-transformers)

https://drive.google.com/drive/folders/1lVRHX5XNEL90ieV8_7BP4fT1gPrxcQsQ?usp=sharing

Basics of sentiment analysis

Sentiment analysis aims to determine the affective state expressed in a particular text (an affective state is more persistent than an emotion and less persistent than a mood). There are two broad ways of approaching the task:

👉 Dictionaries: using lists of words grouped by categories. We count how many words from a category are in a text to determine its sentiment. Some well-validated dictionaries include Harvard General Inquirer 4, LIWC, and VADER.

Pros: simple, fast, transparent, lower risk of overfitting. Cons: sensitive to the choice of dictionary and scoring system, miss sarcasm, high risk of underfitting.
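A minimal sketch of the dictionary approach, assuming a tiny made-up word list rather than a validated dictionary such as LIWC or VADER. The names `TOY_DICTIONARY`, `category_counts`, and `sentiment_score` are hypothetical, and the scoring rule (positive minus negative hits, divided by text length) is just one of many possible scoring systems:

```python
import re

# Hypothetical toy dictionary for illustration only; a real study would
# load a validated dictionary instead of these hand-picked words.
TOY_DICTIONARY = {
    "positive": {"good", "great", "happy", "love", "wonderful"},
    "negative": {"bad", "awful", "sad", "hate", "terrible"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_counts(text, dictionary=TOY_DICTIONARY):
    """Count how many tokens of the text fall into each category."""
    tokens = tokenize(text)
    return {
        category: sum(token in words for token in tokens)
        for category, words in dictionary.items()
    }


def sentiment_score(text, dictionary=TOY_DICTIONARY):
    """Return (positive hits - negative hits) / number of tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = category_counts(text, dictionary)
    return (counts["positive"] - counts["negative"]) / len(tokens)


example = "I love this wonderful movie, but the ending was bad"
print(category_counts(example))  # {'positive': 2, 'negative': 1}
print(sentiment_score(example))
```

Swapping in a different word list or a different scoring rule changes the result, which is the sensitivity mentioned in the cons above.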

👉 Classifiers: using machine learning techniques to determine sentiment. Usually, you would either need labeled data (i.e., some of your text items would need to be labeled by a human as belonging to a category) or use a model pre-trained by someone else. Some suitable models include Naive Bayes, support vector machines, k-nearest neighbors, and neural nets.

Pros: pick up on nuances and often get more accurate results, don’t need to compile a dictionary. Cons: may need labeled data, less transparent, can be time-consuming to train, higher risk of overfitting.
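A minimal sketch of the classifier approach, assuming a handful of made-up labeled sentences and a from-scratch multinomial Naive Bayes with add-one smoothing. The class name `NaiveBayesSentiment` and the training data are hypothetical; in practice you would more likely use an existing library implementation and far more labeled data:

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Toy multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per label (for the class prior).
        self.label_counts = Counter(labels)
        # Word frequencies per label (for the word likelihoods).
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.label_counts.items():
            logp = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical human-labeled examples.
train_texts = [
    "I love this film",
    "what a wonderful day",
    "I hate this film",
    "what an awful day",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("a wonderful film"))  # pos
print(model.predict("an awful film"))     # neg
```

The model learns which words go with which label from the examples alone, so no dictionary is needed, but its predictions are only as good as the labeled data it was trained on.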

In psychology, we prefer dictionaries over classifiers because that way it’s easier to relate the sentiment to the psychological phenomena we’re studying rather than to the quirks of the model. Additionally, classifiers can be much more costly, since specialized categories require new human-labeled samples. It’s usually easier to create a new dictionary with the necessary category (like moral emotional words, words expressing temptation, etc.).
