omidrohanian / irony_detection

Code and data used for participation in SemEval-2018 Task 3: "Irony detection in English tweets"

Jupyter Notebook 55.52% Python 44.48%
irony irony-detection semeval sentiment-analysis twitter sarcasm-detection sarcastic-tweets ironic

irony_detection's Issues

IndexError: invalid index to scalar variable

When running the feature_generator_TaskA notebook, specifically cell 9, I get the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-0b9719d949b3> in <module>
      5 for tweet in corpus_preprocessed:
      6     chunks = chunkIt(tweet, 2)
----> 7     polarity_vectors.append(np.concatenate(((polarity(chunks[0])[1], polarity(chunks[1])[1])), axis=0))
      8 
      9 assert len(ekphrasis_feats) == len(polarity_vectors)

~/Documents/scriptie/irony_detection/venv/lib/python3.6/site-packages/ekphrasis/utils/nlp.py in polarity(doc, neg_comma, neg_modals)
    213 
    214     _scores = numpy.mean(numpy.array(scores), axis=0)
--> 215     _polarity = _scores[0] - _scores[1]
    216 
    217     return _polarity, _scores

IndexError: invalid index to scalar variable.

All the preceding cells seem to run fine, so I don't know what could be causing this. Any ideas?
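
The last frame of the traceback indexes `_scores[0]`, so `numpy.mean(..., axis=0)` must have returned a scalar rather than a vector. NumPy does this whenever the array it averages is not two-dimensional, for example when it is empty. Which input makes ekphrasis build such an array is not shown above; an empty chunk, or one with no scored tokens, is only a guess. The sketch below reproduces the scalar result with plain NumPy and shows one possible guard. The helper names, the zero-vector fallback, and the width of 2 are assumptions made for illustration, not part of ekphrasis or of this repository.

```python
import warnings

import numpy as np


def unguarded_mean(scores):
    """Mirror the averaging step from the traceback: mean over axis 0."""
    with warnings.catch_warnings():
        # NumPy warns about "Mean of empty slice"; silence it for the demo.
        warnings.simplefilter("ignore", RuntimeWarning)
        return np.mean(np.array(scores), axis=0)


def guarded_mean(scores, width=2):
    """Hypothetical guard: fall back to a zero vector when there are no score rows.

    `width=2` is an assumption based on the traceback indexing positions 0 and 1.
    """
    arr = np.asarray(scores, dtype=float)
    if arr.ndim != 2 or arr.shape[0] == 0:
        return np.zeros(width)
    return arr.mean(axis=0)


empty_result = unguarded_mean([])                    # 0-d scalar; indexing it raises IndexError
safe_result = guarded_mean([])                       # 1-d vector of zeros
normal_result = guarded_mean([[1.0, 0.0], [0.0, 1.0]])
```

Printing `len(chunks[0])` and `len(chunks[1])` for the tweet that fails in cell 9 would show whether an empty chunk is really the trigger before applying any guard like this.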

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2779: character maps to <undefined>

I tried running the feature_generator_TaskA.ipynb and it gave me this error.


UnicodeDecodeError                        Traceback (most recent call last)
in ()
      1 if TRAIN:
      2     dataset='../datasets/train/SemEval2018-T3-train-taskA_emoji.txt'
----> 3     corpus, _ = parse_dataset(dataset)
      4     corpus_preprocessed = json.load(open('../extra_resources/train_preprocessed.txt','r'))
      5 else:

~\Documents\Special Problem\198.2\irony_detection\subtaskA\load.py in parse_dataset(dataset)
      5     dataset_name = dataset.lower()
      6     with open(dataset, 'r') as data_in:
----> 7         for line in data_in:
      8             if not line.lower().startswith("tweet index"): # discard first line if it contains metadata
      9                 line = line.rstrip() # remove trailing whitespace

~\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2779: character maps to <undefined>
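
The traceback shows `open(dataset, 'r')` being decoded by `cp1252.py`, which is the default text codec on many Windows installations. Byte 0x9d has no mapping in cp1252, but it does occur inside UTF-8 sequences such as the right curly quote (`E2 80 9D`). Assuming the dataset files are UTF-8 encoded (likely for a tweet file containing emoji, but not confirmed here), passing `encoding='utf-8'` explicitly avoids the platform default. A minimal sketch, with a hypothetical helper name and sample line:

```python
import os
import tempfile


def read_lines_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default codec."""
    with open(path, "r", encoding="utf-8") as data_in:
        return [line.rstrip() for line in data_in]


# Hypothetical sample line containing curly quotes, written as UTF-8 bytes.
sample = "1\t1\tSo \u201cgreat\u201d to be stuck in traffic\n"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    tmp_path = tmp.name

lines = read_lines_utf8(tmp_path)
os.remove(tmp_path)

# The same bytes cannot be decoded as cp1252, which is what the traceback shows.
try:
    sample.encode("utf-8").decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True
```

Applied to this repository, the equivalent change would be adding `encoding='utf-8'` to the `open(dataset, 'r')` call in `subtaskA/load.py`; the `open(...)` call for `train_preprocessed.txt` in the notebook may need the same treatment.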

Recursive Feature Elimination leaves out contrast

Hi, for a university project I am trying to reproduce the paper and see whether I get the same results.
I picked your paper because I liked it and it seemed quite accessible, since everything is readily available on GitHub, which is really nice!

I have run the project multiple times, mimicking your setup as closely as possible. However, I am unable to get the 13 features mentioned in your paper. I only get 12 selected by RFEC (the contrast feature is left out). I have tried multiple environments (a couple of Windows machines and one Mac). It seems to be related to the "train_feats_taskA.npy" file: when I regenerate it, contrast is dropped; if I don't regenerate it, I get the same set of features as in your paper.

I suspect it may be related to using a newer version of Stanford CoreNLP and the other packages, but I thought you could maybe shed some light on this. Do you have a suggestion as to what could be causing it?

The code in the Jupyter notebook "feature_generator_TaskA.ipynb" is the same as in your repo.
