omidrohanian / irony_detection


Code and data used for participation in SemEval-2018 Task 3: "Irony detection in English tweets"

Jupyter Notebook 55.52% Python 44.48%

irony irony-detection semeval sentiment-analysis twitter sarcasm-detection sarcastic-tweets ironic


irony_detection's Issues

IndexError: invalid index to scalar variable

When running the feature_generator_TaskA notebook, specifically cell 9, I get the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-0b9719d949b3> in <module>
      5 for tweet in corpus_preprocessed:
      6     chunks = chunkIt(tweet, 2)
----> 7     polarity_vectors.append(np.concatenate(((polarity(chunks[0])[1], polarity(chunks[1])[1])), axis=0))
      8 
      9 assert len(ekphrasis_feats) == len(polarity_vectors)

~/Documents/scriptie/irony_detection/venv/lib/python3.6/site-packages/ekphrasis/utils/nlp.py in polarity(doc, neg_comma, neg_modals)
    213 
    214     _scores = numpy.mean(numpy.array(scores), axis=0)
--> 215     _polarity = _scores[0] - _scores[1]
    216 
    217     return _polarity, _scores

IndexError: invalid index to scalar variable.

All the preceding cells seem to run fine, so I don't know what could be causing this. Any ideas?
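One plausible cause, not confirmed anywhere in this thread: if `polarity` collects no per-word scores for a chunk (for example, a very short tweet whose second half is empty after `chunkIt(tweet, 2)`), then `numpy.mean(numpy.array([]), axis=0)` returns a scalar NaN and `_scores[0]` raises exactly this error. The sketch below reproduces that behaviour with plain NumPy and shows a hypothetical guard, `safe_scores`, that falls back to a zero vector. `mean_scores` only mimics the two lines shown in the traceback, and `width=2` is an assumption based on the `_scores[0] - _scores[1]` expression.

```python
import warnings

import numpy as np


def mean_scores(scores):
    """Mimic the two ekphrasis lines from the traceback: average, then index."""
    with warnings.catch_warnings():
        # NumPy warns about "Mean of empty slice"; silence it for the demo.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        _scores = np.mean(np.array(scores), axis=0)
    # With an empty list, _scores is a scalar NaN and the next line raises IndexError.
    return _scores[0] - _scores[1], _scores


def safe_scores(polarity_fn, chunk, width=2):
    """Hypothetical guard: zero vector when polarity_fn cannot score the chunk.

    polarity_fn is any callable returning (polarity, scores), like the
    function in the traceback; width is the assumed length of the scores.
    """
    if len(chunk) == 0:
        return np.zeros(width)
    try:
        return np.asarray(polarity_fn(chunk)[1], dtype=float)
    except IndexError:
        return np.zeros(width)
```

With such a guard, the failing line in cell 9 could become `np.concatenate((safe_scores(polarity, chunks[0]), safe_scores(polarity, chunks[1])), axis=0)`. Whether zeros are an appropriate fallback for these features is a judgment call; printing the offending `tweet` first would confirm whether empty chunks are really the trigger.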

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2779: character maps to <undefined>

I tried running feature_generator_TaskA.ipynb and got this error:

UnicodeDecodeError                        Traceback (most recent call last)
in ()
      1 if TRAIN:
      2     dataset='../datasets/train/SemEval2018-T3-train-taskA_emoji.txt'
----> 3     corpus, _ = parse_dataset(dataset)
      4     corpus_preprocessed = json.load(open('../extra_resources/train_preprocessed.txt','r'))
      5 else:

~\Documents\Special Problem\198.2\irony_detection\subtaskA\load.py in parse_dataset(dataset)
      5     dataset_name = dataset.lower()
      6     with open(dataset, 'r') as data_in:
----> 7         for line in data_in:
      8             if not line.lower().startswith("tweet index"): # discard first line if it contains metadata
      9                 line = line.rstrip() # remove trailing whitespace

~\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24 
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2779: character maps to <undefined>
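The traceback shows the file being decoded with cp1252, which `open` falls back to here because no encoding was given. Byte 0x9d has no mapping in cp1252, but it does occur inside UTF-8 sequences such as the right curly quote. Assuming the dataset is UTF-8 (likely, given the emoji in the file name, but not stated in the thread), passing `encoding='utf-8'` to `open` should avoid the error. The sketch below uses a hypothetical helper, `read_tweet_lines`, rather than the repo's `parse_dataset`, and a made-up two-line demo file.

```python
import os
import tempfile


def read_tweet_lines(path, encoding="utf-8"):
    """Hypothetical helper: read data lines with an explicit encoding."""
    lines = []
    with open(path, "r", encoding=encoding) as data_in:
        for line in data_in:
            # Skip the metadata header, as the loop in the traceback does.
            if not line.lower().startswith("tweet index"):
                lines.append(line.rstrip())
    return lines


# Made-up demo data: the closing curly quote is E2 80 9D in UTF-8,
# so it contains the same 0x9d byte that cp1252 cannot decode.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Tweet index\tLabel\tTweet text\n1\t1\tSo \u201cgreat\u201d\n")
    demo_lines = read_tweet_lines(demo_path)
```

The same argument would apply to the `open('../extra_resources/train_preprocessed.txt','r')` call in the notebook cell, which relies on the platform default in the same way.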

Recursive Feature Elimination leaves out contrast

Hi, for a university project I am trying to reproduce your paper and see whether I can get the same results.
I picked it because I liked the paper and it seemed quite accessible, since everything is readily available on GitHub, which is really nice!

I have run the project multiple times, mimicking your setup as closely as possible. However, I am unable to get the 13 features mentioned in your paper: RFEC selects only 12, leaving out the contrast feature. I have tried multiple environments (a couple of Windows machines and one Mac). It seems to be related to the "train_feats_taskA.npy" file: when I regenerate it, contrast is dropped; if I don't regenerate it, I get the same set of features as in your paper.

I suspect it may be related to using a newer version of Stanford CoreNLP and the other packages, but I thought maybe you could shed some light on this. Do you have a suggestion as to what could be causing it?

The code in the Jupyter notebook "feature_generator_TaskA.ipynb" is the same as in your repo.
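One way to narrow this down is to compare the shipped feature matrix with the regenerated one, column by column, and see which features actually changed. The sketch below is a minimal illustration on toy arrays. `differing_columns` is a hypothetical helper, and it assumes both matrices are 2-D with the same shape and column order.

```python
import numpy as np


def differing_columns(reference, regenerated, atol=1e-8):
    """Return indices of feature columns whose values differ between two matrices."""
    reference = np.asarray(reference, dtype=float)
    regenerated = np.asarray(regenerated, dtype=float)
    if reference.shape != regenerated.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {regenerated.shape}")
    # A column counts as unchanged only if every row matches within tolerance.
    same = np.isclose(reference, regenerated, atol=atol, equal_nan=True)
    return np.flatnonzero(~same.all(axis=0)).tolist()


# Toy stand-ins for the shipped and regenerated matrices: only column 1 differs.
shipped = np.array([[1.0, 0.0, 3.0], [2.0, 1.0, 4.0]])
rebuilt = np.array([[1.0, 0.0, 3.0], [2.0, 0.0, 4.0]])
changed = differing_columns(shipped, rebuilt)
```

With the real files this might look like `differing_columns(np.load("train_feats_taskA.npy"), np.load("train_feats_taskA_regenerated.npy"))`, where the second file name is made up for illustration. If the contrast column shows up in the result, the tools that feature depends on would be the first place to look for version differences.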
