hasanhuz / SpanEmo
License: Other
Hi!
I've trained on the English SemEval2018 train and dev splits with the default configuration, but I haven't been able to get similar test results on the SemEval2018 test split (E-c-En-test-gold.txt specifically).
Averaging three models trained with the same default settings, I got
F1-Micro 0.6953, F1-Macro 0.53, JS 0.567
The results reported in your paper are higher, especially F1-Macro, with a difference of around 4%. I wanted to reach out to see whether there is anything beyond the default setting that I should note in order to reproduce the results.
Thank you so much.
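For anyone comparing numbers, here is a minimal sketch of how these three metrics can be computed with scikit-learn; it is only an illustration, not the repo's evaluation code, and reading JS as the sample-averaged Jaccard similarity is my assumption.

import numpy as np
from sklearn.metrics import f1_score, jaccard_score

# y_true / y_pred are binary indicator matrices of shape (n_samples, n_labels); toy values here.
y_true = np.array([[1, 0, 1], [0, 1, 1]])
y_pred = np.array([[1, 0, 0], [0, 1, 1]])

print("F1-Micro:", f1_score(y_true, y_pred, average="micro"))
print("F1-Macro:", f1_score(y_true, y_pred, average="macro"))
# JS taken here as the sample-averaged Jaccard similarity (an assumption about the metric).
print("JS:", jaccard_score(y_true, y_pred, average="samples"))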
Hey @hasanhuz!
I am a student and want to use your state-of-the-art model for a school project analyzing emotions in tweets and customer reviews.
Whenever I run the train.py file on Google Colab, I receive the following error:
Traceback (most recent call last):
  File "/content/SpanEmo_MLEC/scripts/test.py", line 442, in <module>
    args = docopt(__doc__)
  File "/usr/local/lib/python3.7/dist-packages/docopt.py", line 558, in docopt
    DocoptExit.usage = printable_usage(doc)
  File "/usr/local/lib/python3.7/dist-packages/docopt.py", line 466, in printable_usage
    usage_split = re.split(r'([Uu][Ss][Aa][Gg][Ee]:)', doc)
  File "/usr/lib/python3.7/re.py", line 215, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object
I am not very comfortable with Python and don't know what's wrong.
Preprocessing and model creation went well so far.
I appreciate your help a lot!
T
Hi! I am trying to reproduce the Spanish part, but I get a ValueError: 'anticip' is not in list. I have already downloaded bert-base-spanish-wwm-uncased.
Could you please give me some suggestions?
Thanks in advance.
I just tried to run the code after downloading the dataset, but I got an error: "TypeError: linear(): argument 'input' (position 1) must be Tensor, not str". I checked the code and I think these two lines may have a mistake.
I changed them to
output = self.bert(input_ids=input_ids)
return output.last_hidden_state
but I still get the same error, and I don't know why.
Here is my colab code
I would appreciate it if you could help me again!
Hello authors,
I am trying to reproduce these results on the Spanish dataset, using all default parameters. I understand that without the joint loss the accuracy should be lower than with it, but the difference is around 10 to 20 percent. I checked for overfitting and it does not seem to be an issue, and models are only saved for the best validation loss. Would you recommend changing any of the default parameters?
Here are results on the test set:
F1-Micro: 0.4789 (expected 0.654)
F1-Macro: 0.3418 (expected 0.534)
JS: 0.3665 (expected 0.481)
Thanks for your time!
@hasanhuz Just for confirmation: doesn't the line last_hidden_state, pooler_output = self.bert(input_ids=input_ids) return last_hidden_state as a string rather than a tensor?
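For context, a minimal sketch of the behaviour, assuming a recent transformers version where BertModel returns a ModelOutput object instead of a tuple (the checkpoint name below is just a placeholder):

import torch
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")
input_ids = torch.tensor([[101, 102]])  # dummy [CLS] [SEP] input

# Newer versions return a ModelOutput, so tuple-unpacking yields its keys as strings.
a, b = bert(input_ids=input_ids)
print(a, b)  # 'last_hidden_state' 'pooler_output'

# Either access the attribute explicitly ...
last_hidden_state = bert(input_ids=input_ids).last_hidden_state
# ... or ask for the old tuple behaviour.
last_hidden_state, pooler_output = bert(input_ids=input_ids, return_dict=False)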
The paper is very impressive; there is just one point I don't fully understand.
Is |C| here the set of emotion categories that has already been chosen? Is C fixed?
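(My own reading of the paper, which may need the authors' confirmation: C is the fixed label set of the dataset, e.g. the 11 emotion classes of SemEval-2018 E-c, and the class names are fed to the encoder together with the text. A rough sketch of that input construction; the function and variable names are illustrative, not the repo's code.)

# Assumed label set for SemEval-2018 E-c (11 classes); C is fixed per dataset/language.
label_names = ["anger", "anticipation", "disgust", "fear", "joy", "love",
               "optimism", "pessimism", "sadness", "surprise", "trust"]

def build_input(tweet: str) -> str:
    # The |C| class names form one segment and the tweet the other, so the encoder
    # can score each class word against the text.
    return " ".join(label_names) + " [SEP] " + tweet

print(build_input("so happy and grateful today"))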
Hi! I am trying to reproduce the results of the paper, and while training, the best results I can get on the dev set are
Val_loss F1-Macro F1-Micro JS
0.3648 0.4966 0.6917 0.5657
These values are reached at the second epoch. I'm using the joint loss as the paper suggests, with alpha = 0.2. Is there anything else I should do?
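(In case it helps with reproduction, a minimal sketch of a joint objective of this form, assuming it combines binary cross-entropy with a label-correlation-aware ranking term as (1 - alpha) * BCE + alpha * LCA; the function names and the use of sigmoid scores are my assumptions, not the repo's exact implementation.)

import torch
import torch.nn.functional as F

def lca_loss(logits, targets):
    # Label-correlation-aware term (sketch): for each sample, average exp(score_neg - score_pos)
    # over all (negative, positive) label pairs, pushing positive labels above negative ones.
    probs = torch.sigmoid(logits)
    losses = []
    for y_hat, y in zip(probs, targets):
        pos, neg = y_hat[y == 1], y_hat[y == 0]
        if pos.numel() == 0 or neg.numel() == 0:
            losses.append(y_hat.new_zeros(()))
            continue
        losses.append(torch.exp(neg.unsqueeze(1) - pos.unsqueeze(0)).mean())
    return torch.stack(losses).mean()

def joint_loss(logits, targets, alpha=0.2):
    bce = F.binary_cross_entropy_with_logits(logits, targets.float())
    return (1 - alpha) * bce + alpha * lca_loss(logits, targets)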
Could you please provide the pretrained weights?
Thanks in advance.
When running the test.py file, I got the error below:
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found object
any idea on this?
Thanks!
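(For anyone hitting the same thing: this error generally means some field returned by the Dataset's __getitem__ is not a tensor, number, or array. A toy reproduction, not the repo's code:)

import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    # Returning a raw Python object from __getitem__ triggers the same collate error.
    def __len__(self):
        return 4
    def __getitem__(self, idx):
        return torch.tensor([idx]), object()  # the second field is not collatable

batch = next(iter(DataLoader(ToyDataset(), batch_size=2)))  # raises the TypeError above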
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
Can you please let me know what GPU is required?
Salam Hassan
I was trying to run your code for the SpanEmo paper. Interesting work, by the way. However, I am not sure why, when I try to run it on an Arabic dataset, it gives me the following errors and warnings.
AttributeError: 'NoneType' object has no attribute 'update'
Details are below.
It looks like it is something related to fastprogress, but I have tried every possibility, including upgrading and downgrading some libraries.
Any help is appreciated.
Thanks
/usr/local/lib/python3.9/site-packages/google/colab/data_table.py:30: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
from IPython.utils import traitlets as _traitlets
Currently using GPU: cuda:0
/usr/local/lib/python3.9/site-packages/ekphrasis/classes/tokenizer.py:225: FutureWarning: Possible nested set at position 2190
self.tok = re.compile(r"({})".format("|".join(pipeline)))
Reading twitter_2018 - 1grams ...
Reading twitter_2018 - 2grams ...
/usr/local/lib/python3.9/site-packages/ekphrasis/classes/exmanager.py:14: FutureWarning: Possible nested set at position 42
regexes = {k.lower(): re.compile(self.expressions[k]) for k, v in
Reading twitter_2018 - 1grams ...
PreProcessing dataset ...: 0% 0/178 [00:00<?, ?it/s]
/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2323: FutureWarning: The pad_to_max_length argument is deprecated and will be removed in a future version, use padding=True or padding='longest' to pad to the longest sequence in the batch, or use padding='max_length' to pad to a max length. In this case, you can give a specific length with max_length (e.g. max_length=45) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
PreProcessing dataset ...: 100% 178/178 [00:01<00:00, 122.84it/s]
The number of training batches: 6
Reading twitter_2018 - 1grams ...
Reading twitter_2018 - 2grams ...
Reading twitter_2018 - 1grams ...
PreProcessing dataset ...: 0% 0/178 [00:00<?, ?it/s]
/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2323: FutureWarning: The pad_to_max_length argument is deprecated and will be removed in a future version, use padding=True or padding='longest' to pad to the longest sequence in the batch, or use padding='max_length' to pad to a max length. In this case, you can give a specific length with max_length (e.g. max_length=45) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
PreProcessing dataset ...: 100% 178/178 [00:01<00:00, 89.23it/s]
The number of validation batches: 6
Some weights of the model checkpoint at asafaya/bert-base-arabic were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight']
no_deprecation_warning=True to disable this warning

Hello,
Thank you for sharing your work on multi-label emotion recognition. I successfully ran the code and got some results, making sure to use the same parameters as described in the paper (loss-type set to joint loss). However, after four epochs the model seems to be overfitting a lot.
I was wondering whether you observed this phenomenon and what could be causing it; if you didn't, please let me know if I'm doing something wrong.
Apart from the above question, I also wanted to ask what the co-existing percentage in the paper refers to and how it was calculated.
I'd greatly appreciate some clarifications.
Thank you!
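(For what it's worth, a minimal sketch of one plausible reading of "co-existing percentage", namely the share of examples in which a given pair of emotions is labelled together; this is my assumption about the definition, not a quote from the paper.)

import numpy as np

def coexist_percentage(y: np.ndarray, i: int, j: int) -> float:
    # y is a binary label matrix of shape (n_samples, n_labels);
    # returns the percentage of samples where emotions i and j are both positive.
    both = np.logical_and(y[:, i] == 1, y[:, j] == 1).sum()
    return 100.0 * both / len(y)

# Toy example: 3 samples, 3 emotions.
y = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])
print(coexist_percentage(y, 0, 1))  # 33.33...% of samples carry both emotion 0 and 1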