subhalingamd / nlp-contextual-word-meaning-comparision
Evaluating context-sensitive word meaning understanding in sentence pairs for BERT and GloVe+BiLSTM using the WiC dataset | A3 for the COL772 course (Fall 2021)


Context-Sensitive Word Sense Disambiguation

  1. Motivation
  2. Problem Statement
  3. Dataset
  4. Evaluation
  5. Methodology
    1. BiLSTM + GloVe
    2. BERT
  6. Running the code
  7. Results

Motivation

Words often have multiple meanings, and the surrounding context determines which specific meaning is being used. The goal of this assignment is to develop deep neural models that can identify whether a particular word used in a sentence pair has the same meaning in both sentences or a different meaning in each.

For example,

S1: We often used to play on the [bank] of the river.
S2: We lived along the [bank] of the Ganges.
S3: He cashed a check at the [bank].

S1 and S2 use the same meaning of the word bank (the land alongside a river) while S1 and S3 use different meanings of the word bank (river bank vs. financial institution).

Problem Statement

We frame the task as a classification problem: given an input sentence pair (X) and a word (W), predict the label T if W has the same meaning in both sentences of X and F if W has a different meaning in each.

| X, W | Ground Truth Label |
|------|--------------------|
| (S1, S2), bank | T |
| (S1, S3), bank | F |
| (S2, S3), bank | F |

Dataset

Dataset used: WiC: The Word-in-Context Dataset (English) [Link]

The *.data.txt files contain the inputs and the *.gold.txt files contain the gold labels. Further details on the format can be found at the link above.
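The lines of these files can be read with plain Python. A minimal sketch, assuming the tab-separated field layout documented on the WiC page (target word, PoS, two hyphen-separated target indices, and the two contexts); `read_wic` is an illustrative helper, not code from the repo:

```python
def read_wic(data_line, gold_line):
    """Parse one aligned line from *.data.txt and *.gold.txt into a dict."""
    word, pos, idxs, s1, s2 = data_line.rstrip("\n").split("\t")
    i1, i2 = (int(i) for i in idxs.split("-"))  # target-word position in each sentence
    return {"word": word, "pos": pos, "idx": (i1, i2),
            "s1": s1, "s2": s2, "label": gold_line.strip() == "T"}

ex = read_wic(
    "bank\tN\t5-6\tWe played on the bank of the river .\tHe cashed a check at the bank .",
    "F",
)
```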

Evaluation

The performance on the task will be evaluated using binary accuracy.
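Binary accuracy is simply the fraction of pairs whose predicted label matches the gold label; a one-line sketch:

```python
def binary_accuracy(preds, golds):
    """Fraction of predictions that exactly match the gold labels."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)
```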

Methodology

This assignment is divided into two parts:

BiLSTM + GloVe

In the first part, we build a baseline without using contextual embeddings, pretrained models, external knowledge, additional datasets, etc. The goal is to understand how important contextual embeddings are in NLP.

For this, we use GloVe embeddings and build a BiLSTM model. The hyperparameters can be found directly in args in the notebook.
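A minimal sketch of such a baseline in PyTorch (the names and sizes here are illustrative, not the ones used in the notebook): each sentence is passed through an embedding layer that would be initialised from GloVe vectors, encoded by a shared BiLSTM, and the two sentence encodings are concatenated and classified.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128):
        super().__init__()
        # In practice the weights here would be initialised from GloVe.
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Each sentence encoding is 2*hidden (fwd+bwd); two sentences are concatenated.
        self.fc = nn.Linear(2 * 2 * hidden, 1)

    def encode(self, ids):
        _, (h, _) = self.lstm(self.emb(ids))
        # Concatenate the final forward and backward hidden states.
        return torch.cat([h[-2], h[-1]], dim=-1)

    def forward(self, s1_ids, s2_ids):
        pair = torch.cat([self.encode(s1_ids), self.encode(s2_ids)], dim=-1)
        return self.fc(pair)  # one logit per pair: same meaning vs. different
```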

BERT

In this part, we have no restrictions other than the constraints on the availability of computational resources.

We use BERT in this part. More specifically, we use BertForSequenceClassification from Hugging Face's Transformers library and start finetuning from bert-base-cased checkpoint.

Each data point is encoded as [CLS] sentence1 [SEP] sentence2 [SEP] (the target word, its indices, and the POS info are not used) and classified into 0/1. AdamW is used for optimization. The hyperparameters can be found directly in args in the notebook.
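The pair encoding and label mapping described above can be sketched in plain Python. In the notebook this is handled by the Hugging Face tokenizer, which inserts the special tokens itself; `encode_pair` here is an illustrative helper, not code from the repo:

```python
def encode_pair(s1, s2, gold):
    """Map a WiC example to the BERT-style pair input and a 0/1 label (T -> 1, F -> 0)."""
    text = f"[CLS] {s1} [SEP] {s2} [SEP]"
    label = 1 if gold == "T" else 0
    return text, label
```

With the actual tokenizer the equivalent call would be along the lines of `tokenizer(s1, s2, truncation=True)`, which produces the same segment layout as token IDs.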

Further, while finetuning, the dataset is augmented with a subset of the MCL-WiC dataset from SemEval-2021 Task 2.

Running the code

This assignment uses Python notebooks. The file organization is given below:

    |-- _data/             : original dataset
    |-- _add_data/         : additional data used
    |-- A__BiLSTM_GloVe/   : training and inference notebooks for part 1
    |-- B__BERT/           : training and inference notebooks for part 2
    |-- README.md              

If you are planning to run the code yourself, you might want to change the first and last (few) cell(s) and the file paths in args. After that, you can click Run All to run all the cells.

Results

| Model | Val. Acc. | Test Acc. |
|-------|-----------|-----------|
| A. BiLSTM + GloVe | 58.93 | 54.57 |
| B. BERT (w/ augmentation) | 72.10 | 68.43 |

This README uses texts from the assignment problem document provided in the course.

