sedflix / unsacmt

Unsupervised Sentiment Analysis for Code-mixed Data

Home Page: https://arxiv.org/abs/2001.11384

Dockerfile 0.02% Jupyter Notebook 96.19% Python 3.79%
sentiment-analysis code-mixed code-switching embeddings cross-lingual multi-lingual zero-shot-learning unsupervised

Unsupervised Sentiment Analysis for Code-mixed Data

We use embedding techniques such as MUSE, LASER, XLM, MultiBPEmb, and fastText to transfer knowledge from monolingual text to code-mixed text for sentiment analysis. More information about the methods tried can be found in the accompanying paper (https://arxiv.org/abs/2001.11384).
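
The transfer idea can be illustrated with a toy sketch. It assumes that words from both languages live in one shared embedding space, so a classifier fitted on labelled monolingual sentences can be applied to code-mixed sentences without code-mixed labels. The vectors, the vocabulary, and the nearest-centroid classifier below are hypothetical stand-ins chosen for brevity; they are not the repository's embeddings or models.

```python
import math
from collections import defaultdict

# Hypothetical aligned word vectors: English and romanized Hindi words
# share one 2-d space (real systems use MUSE, LASER, etc.).
TOY_VECTORS = {
    "good": (1.0, 0.1), "great": (0.9, 0.2),
    "bad": (-1.0, 0.1), "terrible": (-0.9, 0.0),
    "movie": (0.0, 1.0), "film": (0.0, 0.9),
    "accha": (0.95, 0.15),  # Hindi for "good"
    "bura": (-0.95, 0.1),   # Hindi for "bad"
}


def embed(sentence):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def fit_centroids(sentences, labels):
    """Compute one mean sentence vector per sentiment label."""
    grouped = defaultdict(list)
    for sentence, label in zip(sentences, labels):
        grouped[label].append(embed(sentence))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, sentence):
    """Return the label whose centroid is closest to the sentence vector."""
    vec = embed(sentence)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Fit on monolingual (English) data only.
centroids = fit_centroids(
    ["good movie", "great film", "bad movie", "terrible film"],
    ["positive", "positive", "negative", "negative"],
)

# Apply to code-mixed sentences that were never seen during fitting.
print(predict(centroids, "movie accha hai"))
print(predict(centroids, "film bura thi"))
```

The sketch works only because the toy vectors for "accha" and "good" were placed close together by hand; in practice that alignment is what the cross-lingual embedding methods are meant to provide.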

Environment

All the dependencies of the code are listed in requirements.txt.

pip

    pip install -r requirements.txt
    PYTHONIOENCODING=utf-8 python -m laserembeddings download-models

docker

    # build the image 
    docker build -t unsacmt .
    
    # run the container
    nvidia-docker run -v $PWD:/app -p 8989:8989 unsacmt
    
    # launch a jupyter notebook
    jupyter notebook --ip 0.0.0.0 --port 8989 --allow-root

Data

The sentiment analysis data is in data/cm/.
The custom fastText embedding is provided here. # TODO
The aligned MUSE embedding is provided here. # TODO

Files Description

  • notebooks/archive/*.ipynb: old notebooks with many more experiments than mentioned in the paper.
  • notebooks/Results.ipynb: a notebook with all the experiments
  • src/utills.py: code for reading raw data and computing the F1 score
  • src/trainer.py: code for running the training curriculum given a model and data
  • src/models.py: code for the simple neural network models used by us
  • src/data_prep.py: code for applying different kinds of embeddings on sentiment analysis dataset
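
For readers unfamiliar with the F1 metric mentioned above, here is a minimal pure-Python sketch. It assumes macro averaging over the sentiment labels; the averaging actually used in src/utills.py may differ, and the function names `f1_for_label` and `macro_f1` are hypothetical.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one label, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "neg", "neg"]
print(round(macro_f1(gold, pred), 4))  # 0.7333
```

Macro averaging weights every label equally, which matters when the sentiment classes are imbalanced.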

Citation

@misc{yadav2020unsupervised,
    title={Unsupervised Sentiment Analysis for Code-mixed Data},
    author={Siddharth Yadav and Tanmoy Chakraborty},
    year={2020},
    eprint={2001.11384},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
