stance-detection

Code for my MSc thesis: Statistical Modeling Of Stance Detection.

Abstract

In recent years fake news has become an increasingly serious problem, mainly because social networks, search engines and news aggregators help propagate it. Classifying news as fake is a hard problem. However, it is possible to distinguish between fake and real news by considering how many related tweets agree or disagree with the news. Therefore, in the simplest case the problem can be reduced to identifying whether a given tweet agrees with, disagrees with or is unrelated to the news in question. This problem is generally referred to as 'stance detection'; in machine learning terminology it is a classification problem. This thesis investigates more advanced natural language models, such as the matching Long Short Term Memory model and the soft attention mechanism, applied to the stance detection problem. The ideas are tested on a publicly available data set.

Long Short Term Memory (LSTM)

LSTM is a non-linear hidden-state time-series model based on the Recurrent Neural Network (RNN). The basic idea is similar to the Hidden Markov Model, i.e. the observable time series is modelled through a hidden state representation. An RNN can be defined inductively in the following way.

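Given a time series {x_1, x_2, ..., x_T}:

  1. Initialize the hidden state: h_0 = 0
  2. Update the hidden state: h_t = f(W_h * h_{t-1} + W_x * x_t), starting with h_1 = f(W_h * h_0 + W_x * x_1)
  3. Fit the model by predicting the next observation, e.g. predict x_2 from h_1: minimize the prediction error, e.g. ||W * h_1 - x_2||^2, with respect to W_h, W_x, W.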

W_h, W_x and W are the weight matrices to be estimated, and f is an activation function, usually sigmoid or tanh. Note that W_h, W_x and W are shared across all steps of the time series.
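The recursion above can be sketched in a few lines of NumPy. This is only an illustration and is not taken from the thesis code: the function names, the dimensions, the random weights and the use of mean squared error as the prediction loss are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(xs, W_h, W_x, W, f=np.tanh):
    """Run a plain RNN over xs; return hidden states and next-step predictions."""
    h = np.zeros(W_h.shape[0])                    # step 1: h_0 = 0
    hs, preds = [], []
    for x_t in xs:
        h = f(np.dot(W_h, h) + np.dot(W_x, x_t))  # step 2: h_t = f(W_h h_{t-1} + W_x x_t)
        hs.append(h)
        preds.append(np.dot(W, h))                # W h_t is the prediction of x_{t+1}
    return np.array(hs), np.array(preds)

def one_step_loss(xs, preds):
    """Mean squared error between W h_t and x_{t+1} (assumed loss for this sketch)."""
    return float(np.mean((preds[:-1] - xs[1:]) ** 2))

# Hypothetical sizes and random weights, just to exercise the functions.
rng = np.random.RandomState(0)
T, d_x, d_h = 5, 3, 4
xs = rng.normal(size=(T, d_x))
W_h = rng.normal(scale=0.1, size=(d_h, d_h))
W_x = rng.normal(scale=0.1, size=(d_h, d_x))
W = rng.normal(scale=0.1, size=(d_x, d_h))

hs, preds = rnn_forward(xs, W_h, W_x, W)
loss = one_step_loss(xs, preds)
print(hs.shape, preds.shape, loss)
```

The same three matrices are reused at every iteration of the loop, which is the weight sharing described above. Fitting would then mean minimizing `loss` over W_h, W_x and W by gradient descent.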

However, estimation of an RNN is complicated by the vanishing/exploding gradient problem: during gradient descent updates the gradient might vanish or diverge to infinity. The gradient contains a factor of the form g^T, i.e. some expression raised to the power T (the length of the time series). Hence, if |g| < 1 the factor g^T vanishes as T grows, and if |g| > 1 it explodes. LSTM introduces a gating mechanism that mitigates the problem. In short, an LSTM can 'forget' h_t or x_t: it dynamically changes the importance weights of h_t and x_t at each step. A very good introduction can be found here: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
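To make both points concrete, here is a minimal sketch. The first lines show how a factor g^T behaves for g slightly below and slightly above 1. The function `lstm_step` then implements one step of the commonly used LSTM formulation with forget, input and output gates. The function name, the single stacked weight matrix and the gate ordering are choices made for this example and may differ from the thesis implementation.

```python
import numpy as np

# A factor g^T shrinks or blows up quickly as the sequence length T grows.
T = 100
vanishing = 0.9 ** T   # far below 1
exploding = 1.1 ** T   # far above 1
print(vanishing, exploding)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps [h_prev, x_t] to the four stacked gate pre-activations."""
    d_h = h_prev.shape[0]
    z = np.dot(W, np.concatenate([h_prev, x_t])) + b
    f_gate = sigmoid(z[0:d_h])               # forget gate: how much of c_prev to keep
    i_gate = sigmoid(z[d_h:2 * d_h])         # input gate: how much new content to write
    o_gate = sigmoid(z[2 * d_h:3 * d_h])     # output gate: how much of the cell to expose
    c_tilde = np.tanh(z[3 * d_h:4 * d_h])    # candidate cell content
    c_t = f_gate * c_prev + i_gate * c_tilde
    h_t = o_gate * np.tanh(c_t)
    return h_t, c_t

# Hypothetical sizes and random weights, just to exercise the function.
rng = np.random.RandomState(0)
d_x, d_h = 3, 4
W = rng.normal(scale=0.1, size=(4 * d_h, d_h + d_x))
b = np.zeros(4 * d_h)
h_t, c_t = lstm_step(rng.normal(size=d_x), np.zeros(d_h), np.zeros(d_h), W, b)
print(h_t.shape, c_t.shape)
```

Because the gates take values between 0 and 1 and are recomputed at every step, the cell can keep `c_prev` almost unchanged (forget gate near 1) or discard it (forget gate near 0), which is the 'forgetting' described above.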

Soft attention

Soft attention arose from the alignment problem in machine translation. For example, given a sentence in English

{e_1, e_2, e_3}

and its corresponding translation in French

{f_1, f_2, f_3, f_4}

the problem is to align groups of corresponding words, i.e. (e_1 ~ f_1, f_2, f_3), (e_2, e_3 ~ f_4). This correspondence can be represented by a 3x4 matrix with 3 rows for the English words and 4 columns for the French words:

[[1, 1, 1, 0],
 [0, 0, 0, 1],
 [0, 0, 0, 1]]

The soft attention mechanism tries to learn this correspondence by modeling the weighting matrix in a soft way. Soft here means that the weights are not restricted to 0 or 1, but can take any value between 0 and 1. This makes the model end-to-end differentiable, so it can be estimated by gradient descent. The attention mechanism can be easily adapted to a more general class of time-series problems.
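A minimal sketch of such a soft weighting matrix is shown below. It scores every English/French pair with a dot product and normalizes each row with a softmax. The dot-product scoring, the function names and the random word vectors are assumptions made for the example; the thesis may use a different scoring function.

```python
import numpy as np

def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def soft_attention(queries, keys, values):
    """Dot-product soft attention: each query row gets weights over all key rows."""
    scores = np.dot(queries, keys.T)      # shape (n_queries, n_keys)
    weights = softmax(scores, axis=1)     # each row lies in (0, 1) and sums to 1
    context = np.dot(weights, values)     # weighted average of the value rows
    return weights, context

# Hypothetical 8-dimensional word vectors for 3 English and 4 French words.
rng = np.random.RandomState(0)
E = rng.normal(size=(3, 8))
F = rng.normal(size=(4, 8))

weights, context = soft_attention(E, F, F)
print(weights.shape, context.shape)
```

Here `weights` plays the role of the 3x4 alignment matrix, with one difference: the softmax forces every row to sum to 1, so a row spreads its weight over the French words instead of marking several of them with a hard 1. Every operation is differentiable, so the weights can be learned by gradient descent.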

For more details see the thesis (file inside the repo): Mavrin_Borislav_201709_MSc.pdf

How to run code:

1. Install and activate virtual environment:

cd stance-detection
pip2 install virtualenv
virtualenv -p python2 .env
source .env/bin/activate

2. Install dependencies:

pip2 install -r requirements.txt

3. Install stopword corpus into home folder:

python2 -c "import nltk; nltk.download('stopwords')"

Note: installing the modules from requirements.txt is crucial, since the TensorFlow API was changing a lot at the time the code was written.
