
stance-conditional's Issues

Cannot read the JSON file during word2vec training

When I try to read the files additionalTweetsStanceDetection.json and additionalTweetsStanceDetectionBig.json to train word2vec, I can't decode the JSON and get UnicodeError exceptions. Could you please help me train the word2vec model with 300 dimensions? Thank you very much!
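One common cause of a UnicodeError when loading JSON is that the file is opened with the platform default encoding. The sketch below is only an illustration under assumptions not confirmed by the repository: it assumes each file holds a single JSON document encoded as UTF-8, and the helper name `load_json_tolerant` is hypothetical. Undecodable bytes are replaced rather than raising, which can hide genuine data problems.

```python
import io
import json


def load_json_tolerant(path, encoding="utf-8"):
    """Load one JSON document from path, decoding with an explicit encoding.

    Assumption: the file is a single JSON document. Bytes that cannot be
    decoded are replaced with U+FFFD instead of raising UnicodeDecodeError.
    """
    with io.open(path, mode="r", encoding=encoding, errors="replace") as handle:
        return json.load(handle)
```

If the file turns out to contain one JSON object per line instead, the same `io.open` call applies, with `json.loads` called on each line.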

can not find automatic labeling method

The paper says "the automatic labeling method is publicly available", but I cannot find the automatic labeling code in this repository.
Can you provide the automatic labeling code, please?
Thanks.

problem with imports

Got a fresh copy from github and initialized modules and pythonpath:

echo $PYTHONPATH
/Users/andreasvlachos/Work/git/stance-conditional/twokenize_wrapper:/Users/andreasvlachos/Work/git/stance-conditional/

But when I run:
stancedetection andreasvlachos$ python3 word2vec_training.py

I get:
Traceback (most recent call last):
  File "word2vec_training.py", line 4, in <module>
    from preprocess import tokenise_tweets, build_dataset, transform_tweet, transform_labels
  File "/Users/andreasvlachos/Work/git/stance-conditional/stancedetection/preprocess.py", line 4, in <module>
    from twokenize_wrapper.twokenize import tokenize
ImportError: No module named 'twokenize_wrapper.twokenize'; 'twokenize_wrapper' is not a package

Could you check? I was getting past this in the previous commit.

Replicating the results.

We have experimented extensively with the model but are unable to replicate the results in the paper. The best F1 score we obtained is 0.5637, using the hyperparameters already present in the code. Any change to them, or even to the random seed, makes the result worse.
What can we do to replicate the results?

fullp variable

conditional.py ran successfully, but at the end I got the error below. I had a look in conditional.py, but fullp is not set anywhere.

Applying to test data, getting predictions for NONE/AGAINST/FAVOR
Num testing samples 70 Acc 0.34285714285714286 Correct 24 Total 70
Num testing samples 140 Acc 0.40714285714285714 Correct 57 Total 140
Num testing samples 210 Acc 0.4 Correct 84 Total 210
Num testing samples 280 Acc 0.40714285714285714 Correct 114 Total 280
Num testing samples 350 Acc 0.3914285714285714 Correct 137 Total 350
Num testing samples 420 Acc 0.3761904761904762 Correct 158 Total 420
Num testing samples 490 Acc 0.37551020408163266 Correct 184 Total 490
Num testing samples 560 Acc 0.38392857142857145 Correct 215 Total 560
Num testing samples 630 Acc 0.3904761904761905 Correct 246 Total 630
Num testing samples 700 Acc 0.3942857142857143 Correct 276 Total 700
Num testing samples 770 Acc 0.4090909090909091 Correct 315 Total 770
Traceback (most recent call last):
  File "conditional.py", line 727, in <module>
    readInputAndEval(tests, outfile, hid, max_epochs, "tanh", drop, "most", str(i), modelt, w2v, acc_thresh=1)
  File "conditional.py", line 668, in readInputAndEval
    writer.eval(testdata, outfile, evalscript=fullp + "eval.pl")
NameError: name 'fullp' is not defined

generating stance results with new dataset?

I am trying to figure out the best way to use this code beyond the SemEval-2016 datasets. I was able to run word2vec_training.py and generate the output file "skip_nostop_single_100features_5minwords_5context_big". The output looks like a binary file and is not human-readable (is it just a model file?). I am wondering how to use this code to take a list of tweets and produce a file resembling the gold_toy.txt file used by eval.pl. Ideally, I'd take a JSON file of tweets and get back a list of AGAINST, PRO, NONE values towards a target.

This is not really an issue.

Add parameter for changing loss for NONE

...because it doesn't contribute to Macro F1

Ignore the NONE training examples:

  • Multiply the total loss by 1-target[0] (so if the class is NONE, the loss is 0).
  • Remove NONE examples from the training+dev corpus.

Or down-weight them:
loss = (1-(target[0] * (1 - alpha))) * loss

So for target class NONE and alpha 0.1 we would get:
loss = (1-(1*0.9))*loss = 0.1*loss
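The down-weighting rule above can be sketched as a small function. This is an illustration only: it assumes `target` is a one-hot vector with NONE at index 0, as the formula implies, and the function name `downweight_none_loss` is hypothetical rather than something in conditional.py.

```python
def downweight_none_loss(loss, target, alpha=0.1):
    """Scale a per-example loss by alpha when the gold class is NONE.

    Assumption: target is one-hot with NONE at index 0, so target[0] is 1
    for NONE examples and 0 otherwise. alpha=0 ignores NONE examples
    entirely; alpha=1 leaves the loss unchanged.
    """
    return (1 - target[0] * (1 - alpha)) * loss


none_target = [1, 0, 0]
favor_target = [0, 0, 1]
print(downweight_none_loss(2.0, none_target, alpha=0.1))   # about 0.1 * 2.0
print(downweight_none_loss(2.0, favor_target, alpha=0.1))  # loss is unchanged
```

Setting alpha to 0 reproduces the first option in the list above (multiplying by 1-target[0]), so a single parameter covers both ignoring and down-weighting.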

Bi-directional conditional encoding

  • Encode target bi-directionally: you get vectors y_l and y_r (one representation from reading the target from left, the other one from reading the target from the right)
  • Conditionally encode tweet, but in a bi-directional way, that is (i) encode from left to right with initialization using y_l yielding x_l and (ii) encode from right to left with initialization using y_r yielding x_r
  • For classification concatenate x_l and x_r
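The three steps above can be sketched in plain Python. To stay short and dependency-free, this sketch swaps the LSTM for a simple tanh RNN with untrained random weights, so it only shows the data flow (which state initializes which encoder), not the repository's model. All names (`encode`, `bidirectional_conditional_encode`, the `params` keys) are hypothetical.

```python
import math
import random


def init_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_step(W, U, h, x):
    """One step of a plain tanh RNN: h' = tanh(W x + U h)."""
    return [
        math.tanh(
            sum(w * xj for w, xj in zip(W[i], x))
            + sum(u * hk for u, hk in zip(U[i], h))
        )
        for i in range(len(h))
    ]


def encode(weights, inputs, h0):
    """Run the RNN over inputs starting from state h0; return the final state."""
    W, U = weights
    h = h0
    for x in inputs:
        h = rnn_step(W, U, h, x)
    return h


def bidirectional_conditional_encode(params, target, tweet, hidden):
    zeros = [0.0] * hidden
    # Encode the target in both directions.
    y_l = encode(params["target_l"], target, zeros)
    y_r = encode(params["target_r"], target[::-1], zeros)
    # Encode the tweet in both directions, initialized from the target states.
    x_l = encode(params["tweet_l"], tweet, y_l)
    x_r = encode(params["tweet_r"], tweet[::-1], y_r)
    # Concatenate both tweet representations for the classifier.
    return x_l + x_r


rng = random.Random(0)
emb, hidden = 4, 3
params = {
    name: (init_matrix(rng, hidden, emb), init_matrix(rng, hidden, hidden))
    for name in ("target_l", "target_r", "tweet_l", "tweet_r")
}
target = [[rng.uniform(-1, 1) for _ in range(emb)] for _ in range(2)]
tweet = [[rng.uniform(-1, 1) for _ in range(emb)] for _ in range(5)]
features = bidirectional_conditional_encode(params, target, tweet, hidden)
print(len(features))  # 2 * hidden = 6 values
```

With a real LSTM, both the hidden state and the cell state could be handed over from the target encoder to the tweet encoder; which of the two to pass is a design choice this sketch does not settle.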

Attention

Since targets and tweets are quite short, attention might not have a big impact here, hence low priority.

Feeding target representation during tweet processing

Another variant of conditional encoding that could be helpful is to encode the target, resulting in a vector v_t, and then feed the word representations of the tweet together with the target representation at every step of LSTM encoding. That way the tweet LSTM does not have to maintain the representation of the target in its memory.
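One simple way to realize this variant is to concatenate v_t onto every tweet word vector before it enters the tweet LSTM. The sketch below shows only that input construction, under the assumption that vectors are plain Python lists; the helper name `feed_target_at_every_step` is hypothetical.

```python
def feed_target_at_every_step(tweet_vectors, v_t):
    """Concatenate the target vector v_t onto each tweet word vector."""
    return [list(x) + list(v_t) for x in tweet_vectors]


tweet_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 words, embedding size 2
v_t = [0.9, -0.9, 0.0]                                # target encoding, size 3
lstm_inputs = feed_target_at_every_step(tweet_vectors, v_t)
print(len(lstm_inputs), len(lstm_inputs[0]))  # 3 steps, each of size 2 + 3 = 5
```

The cost is a larger input dimension for the tweet LSTM (embedding size plus target size).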

downloaded_Donald_Trump.txt

When I run it, I get this:

python word2vec_training.py 
Traceback (most recent call last):
  File "word2vec_training.py", line 67, in <module>
    tweets_trump, targets_trump, labels_trump, ids_trump = reader.readTweetsOfficial("../data/downloaded_Donald_Trump.txt", "utf-8", 1)
  File "/Users/andreasvlachos/Work/git/stance-conditional/readwrite/reader.py", line 17, in readTweetsOfficial
    for line in io.open(tweetfile, encoding=encoding, mode='r'):
FileNotFoundError: [Errno 2] No such file or directory: '../data/downloaded_Donald_Trump.txt'

Is downloaded_Donald_Trump.txt the same file as downloaded_Donald_Trump_all.txt (which is in the dropbox folder)?

Version issue

Hello, I am running your program, but my Python version is 3.6 and I am hitting some issues. Can you give me some idea how to deal with them?

File not found

After running word2vec successfully, I get the file named:

skip_nostop_single_100features_5minwords_5context

I am running:

python3 conditional.py

But I get:

FileNotFoundError: [Errno 2] No such file or directory: '/Users/Isabelle/Documents/TextualEntailment/SemEvalStance/stance-conditional-acl2016-fresh/out/skip_nostop_single_100features_5minwords_5context_big'

I looked at the code and it seems like there is some hard coding and I am not sure which branch I should follow. Could you have a look?
