
This is the official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper (arXiv:1808.08932).

Home Page: https://github.com/nicolay-r/RuSentRel

License: MIT License

Topics: sentiment-analysis, relation-extraction, language-models, neural-networks, bert-model, low-resource-nlp, benchmark, leaderboard, cnn, bilstm


RuSentRel Leaderboard

📓 Update 01 October 2023: this collection is now available in arekit-ss for quick sampling of contexts with all subject-object relation mentions into JSONL/CSV/SQLite with a single script, including optional language transfer 🔥 [Learn more ...]

Dataset description: the RuSentRel collection consists of analytical articles from the Internet portal inosmi.ru. These are texts in the domain of international politics, obtained from foreign authoritative sources and translated into Russian. The collected articles contain both the author's opinion on the subject matter of the article and a large number of relations between the participants of the described situations. In total, 73 large analytical texts were labeled with about 2000 relations.

This repository is the official results benchmark for the automatic sentiment attitude extraction task on the RuSentRel collection. See the Task section for greater details.

Contributing: please feel free to open pull requests here, and especially at awesome-sentiment-attitude-extraction!

For more details about RuSentRel, please refer to the related repository.

Contents

Task

Given a subset of documents from the RuSentRel collection, where each document is represented by a pair: (1) a text, and (2) a list of selected named entities. For each document, the task is to compose a list of entity pairs (es, eo) for which the text conveys a sentiment relation from es (subject) towards eo (object). The assigned label can be neg or pos.

Example
... При этом Москва неоднократно подчеркивала, что ее активность на Балтике является ответом именно на действия НАТО и эскалацию враждебного подхода к России вблизи ее восточных границ ... (... Meanwhile Moscow has repeatedly emphasized that its activity in the Baltic Sea is a response precisely to actions of NATO and the escalation of the hostile approach to Russia near its eastern borders ...)
(NATO->Russia, neg), (Russia->NATO, neg)
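For illustration only, below is a minimal Python sketch of how a document and the expected answer could be represented; the variable names and the tuple layout are illustrative and are not the repository's submission format.

```python
# Illustrative only: a document is a text plus a list of selected named
# entities; the expected answer is a list of (subject, object, label)
# attitudes, with labels restricted to "pos" / "neg".
document = {
    "text": ("... Moscow has repeatedly emphasized that its activity in the "
             "Baltic Sea is a response precisely to actions of NATO ..."),
    "entities": ["Moscow", "NATO", "Russia"],
}

expected_attitudes = [
    ("NATO", "Russia", "neg"),
    ("Russia", "NATO", "neg"),
]
```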

Task paper: https://arxiv.org/pdf/1808.08932.pdf

Approaches

The task is treated as a context classification problem, in which a context is a text region that contains a mention of the pair (the attitude participants). The classified context-level attitudes are then transferred to the document level by aggregating the context labels of the related pair (an averaging/voting scheme); a minimal sketch of this step is given below.
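The following sketch illustrates the context-to-document aggregation, assuming context-level predictions are already available as (subject, object, label) records; the function name and data layout are illustrative and not AREkit's API.

```python
from collections import Counter, defaultdict

def aggregate_to_document_level(context_predictions):
    """Majority voting over the context-level labels of each (subject, object)
    pair; the winning label becomes the document-level attitude."""
    votes = defaultdict(Counter)
    for subject, obj, label in context_predictions:
        votes[(subject, obj)][label] += 1
    return {pair: counts.most_common(1)[0][0] for pair, counts in votes.items()}

# Three contexts mention the pair (Russia, NATO); "neg" wins the vote.
print(aggregate_to_document_level([
    ("Russia", "NATO", "neg"),
    ("Russia", "NATO", "neg"),
    ("Russia", "NATO", "pos"),
]))  # {('Russia', 'NATO'): 'neg'}
```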

We develop the AREkit toolkit, which serves as a framework for the following applications:

  • BERT-based language models [code];
  • Neural Networks with (and w/o) Attention mechanism [code];
  • Conventional Machine Learning methods [code];

Back to Top

Submission Evaluation

The source code is exported from the AREkit-0.21.0 library and consists of:

  • the Evaluation directory, with details of the evaluator implementation and the related dependencies;
  • the Test directory, which includes test scripts for applying the evaluator to the archived results.

Use evaluate.py to evaluate your submissions. Below is an example of assessing the results of ChatGPT-3.5-0613:

python3 evaluate.py --input data/chatgpt-avg.zip --mode classification --split cv3

Back to Top

Leaderboard

Results are ordered from the most recent to the oldest. We measure F1 (scaled by 100) across the following foldings (see the evaluator section for greater details; a minimal scoring sketch follows this list):

  • F1cv -- the average F1 over a 3-fold cross-validation check; the folds are formed so that each preserves the same number of sentences;
  • F1t -- F1 over the predefined TEST set.
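Here is a minimal sketch of how these scores can be computed from gold and predicted labels, assuming per-fold predictions are available; it illustrates the metric (average F1 of the positive and negative classes, averaged over folds) and is not the evaluator's actual implementation.

```python
def f1_for_class(gold, pred, label):
    """F1 of a single sentiment class ('pos' or 'neg')."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def macro_f1(gold, pred):
    """Average F1 of the positive and negative classes."""
    return (f1_for_class(gold, pred, "pos") + f1_for_class(gold, pred, "neg")) / 2

def f1_cv(folds):
    """F1cv: macro F1 averaged over the folds, scaled by 100."""
    return 100 * sum(macro_f1(g, p) for g, p in folds) / len(folds)
```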

The result assessment is organized into two experiments:

  • 3l -- extraction of subject-object pairs;
  • 2l -- classification of already given subject-object pairs at the document level.
| Methods | F1cv (3l) | F1t (3l) | F1cv (2l) | F1t (2l) |
|---|---|---|---|---|
| Expert Agreement** [1] | 55.0 | 55.0 | - | - |
| ChatGPT zero-shot with prompting*** [7] |  |  |  |  |
| ChatGPT-3.5-0613, avg [200 words distance] | - | - | 37.7 | 39.6 |
| ChatGPT-3.5-0613, avg [50 words distance] | - | - | 66.19 | 74.47 |
| ChatGPT-3.5-0613, first [50 words distance] | - | - | 69.23 | 74.09 |
| Distant Supervision (RA-2.0-large) for Language Models (BERT-based) [6] [pt -- pretrained, ft -- fine-tuned] |  |  |  |  |
| SentenceRuBERT (NLIpt + NLIft) | 39.0 | 38.0 | 70.2 | 67.7 |
| SentenceRuBERT (NLIpt + QAft) | 38.4 | 41.9 | 69.6 | 64.2 |
| SentenceRuBERT (NLIpt + Cft) | 37.9 | 39.8 | 70.0 | 69.8 |
| RuBERT (NLIpt + NLIft) | 36.8 | 39.9 | 71.0 | 68.6 |
| RuBERT (NLIpt + QAft) | 34.8 | 37.0 | 69.6 | 68.2 |
| RuBERT (NLIpt + Cft) | 35.6 | 35.4 | 70.0 | 69.8 |
| mBase (NLIpt + NLIft) | 33.6 | 36.0 | 69.4 | 68.2 |
| mBase (NLIpt + QAft) | 30.1 | 35.5 | 69.6 | 65.2 |
| mBase (NLIpt + Cft) | 30.5 | 31.1 | 68.9 | 67.7 |
| Distant Supervision (RA-2.0-large) for (Attentive) Neural Networks + Frames annotation [Joined Training] ([6] reproduced, [4] original) |  |  |  |  |
| PCNNends | 32.2 | 39.9 | 70.2 | 67.8 |
| BiLSTM | 32.0 | 38.8 | 71.2 | 68.4 |
| PCNN | 31.6 | 39.7 | 69.5 | 70.5 |
| LSTM | 31.6 | 39.5 | 68.0 | 75.4 |
| Att-BiLSTM [P. Zhou et al.] | 31.0 | 37.3 | 66.2 | 71.2 |
| AttCNNends | 30.9 | 39.9 | 66.8 | 72.7 |
| IANends | 30.7 | 36.7 | 69.1 | 72.6 |
| Distant Supervision (RA-1.0) for Multi-Instance Neural Networks [Joined Training] [5] |  |  |  |  |
| MI-PCNN | - | - | - | 68.0 |
| MI-CNN | - | - | - | 62.0 |
| PCNN | - | - | - | 67.0 |
| CNN | - | - | - | 63.0 |
| Language Models (BERT-based) [6] |  |  |  |  |
| SentenceRuBERT (NLI) | 33.4 | 32.7 | 69.8 | 67.6 |
| SentenceRuBERT (QA) | 34.3 | 38.9 | 70.2 | 67.1 |
| SentenceRuBERT (C) | 34.0 | 35.2 | 69.3 | 65.5 |
| RuBERT (NLI) | 29.4 | 39.6 | 68.9 | 66.4 |
| RuBERT (QA) | 32.0 | 35.3 | 69.5 | 66.2 |
| RuBERT (C) | 36.8 | 37.6 | 67.8 | 66.2 |
| mBase (NLI) | 29.2 | 37.0 | 67.8 | 58.4 |
| mBase (QA) | 28.6 | 33.8 | 66.5 | 65.4 |
| mBase (C) | 26.9 | 30.0 | 67.0 | 68.9 |
| (Attentive) Neural Networks + Frames annotation ([6] reproduced, [3] original) |  |  |  |  |
| IANends | 30.8 | 32.2 | 60.8 | 63.5 |
| AttPCNNends | 29.9 | 32.6 | 64.3 | 63.3 |
| PCNN | 29.6 | 32.5 | 64.4 | 63.3 |
| CNN | 28.7 | 31.4 | 63.6 | 65.9 |
| BiLSTM | 28.6 | 32.4 | 62.3 | 71.2 |
| LSTM | 27.9 | 31.6 | 61.9 | 65.3 |
| AttCNNends | 27.6 | 29.7 | 65.0 | 66.2 |
| Att-BiLSTM [P. Zhou et al.] | 27.5 | 32.3 | 65.7 | 68.2 |
| Convolutional networks [2] |  |  |  |  |
| PCNN [code] | - | 31.0 | - | - |
| CNN | - | 30.0 | - | - |
| Conventional methods [1] [code] |  |  |  |  |
| Gradient Boosting (Grid search) | 20.3* | 28.0 | - | - |
| Random Forest (Grid search) | 19.1* | 27.0 | - | - |
| Random Forest | 15.7* | 27.0 | - | - |
| Naive Bayes (Bernoulli) | 15.2* | 16.0 | - | - |
| SVM | 15.1* | 15.0 | - | - |
| Gradient Boosting | 14.4* | 27.0 | - | - |
| SVM (Grid search) | 14.3* | 15.0 | - | - |
| Naive Bayes (Gauss) | 9.2* | 11.0 | - | - |
| KNN | 7.0* | 9.0 | - | - |
| Baseline (School) [link] | - | 12.0 | - | - |
| Baseline (Distr) | - | 8.0 | - | - |
| Baseline (Random) | 7.4* | 8.0 | - | - |
| Baseline (Pos) | 3.9* | 4.0 | - | - |
| Baseline (Neg) | 5.2* | 5.0 | - | - |

*: Results that were not mentioned in papers.

**: We asked another super-annotator to label the collection and compared her annotation with our gold standard using the average F-measure of the positive and negative classes, in the same way as for the automatic approaches. In such a way, we can reveal the upper bound for automatic algorithms. The resulting F-measure of human labeling is reported as Expert Agreement (55.0) in the table above. [1]

***: We obtain English samples via arekit-ss by translating the texts into English first and then wrapping them into prompts. We consider a k-word distance (50 by default, in English) between entity mentions as an upper bound for pair composition; because of this limit, and since translation increases the distance in words, results might be lower than on the original texts.
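For clarity, here is a small sketch of the distance-based pair filtering described above, assuming the token position of each entity mention is known; it only illustrates the idea and is not the arekit-ss implementation.

```python
def pairs_within_distance(mention_positions, k=50):
    """Keep only ordered (subject, object) candidates whose mentions occur
    at most k words apart; mention_positions maps an entity to its token index."""
    return [
        (subject, obj)
        for subject, s_pos in mention_positions.items()
        for obj, o_pos in mention_positions.items()
        if subject != obj and abs(s_pos - o_pos) <= k
    ]

# "NATO" is mentioned 120 tokens away from "Russia", so that pair is dropped.
print(pairs_within_distance({"Russia": 10, "Moscow": 40, "NATO": 130}, k=50))
```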

Back to Top

Neural Networks Optimization

The training process is described in Rusnachenko et al., 2020 (Section 7.1) and relies on the Multi-Instance learning approach originally proposed in the Zeng et al., 2015 paper (SGD application, bag terminology, instance selection within bags). All context samples of a batch are gathered into bags. The authors propose to select the best instance in every bag as follows: take the maximum value of p(y_j | m_i,j) across the i-th instances within a particular j-th bag. This allows them to define the loss function at the bag level.
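A minimal numpy sketch of this max-based selection, assuming bag_probs[j][i] holds p(y_j | m_i,j) for the i-th instance of the j-th bag; it illustrates the idea rather than the original implementation.

```python
import numpy as np

def bag_loss_max(bag_probs):
    """Bag-level negative log-likelihood in the style of Zeng et al., 2015:
    only the most confident instance of each bag contributes to the loss."""
    best_per_bag = np.array([probs.max() for probs in bag_probs])
    return float(-np.log(best_per_bag).mean())

# Two bags with three and two context instances respectively.
print(bag_loss_max([np.array([0.2, 0.7, 0.4]), np.array([0.6, 0.5])]))
```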

In our works, we adopt bags for gathering synonymous contexts. Therefore, for the gradient calculation within bags, we choose the avg function instead. The assumption here is to take the other synonymous attitudes into account during the gradient calculation procedure. We used BagSize > 1 in the earlier work Rusnachenko, 2018. In the latest experiments, we consider BagSize = 1 and therefore do not exploit averaging of bag values.
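For comparison, a sketch of the averaging alternative used for bags of synonymous contexts (again an illustration, not the training code); with BagSize = 1 it coincides with the ordinary per-sample loss.

```python
import numpy as np

def bag_loss_avg(bag_probs):
    """Average the label probabilities of all synonymous contexts in a bag,
    so every instance contributes to the gradients, then take the loss."""
    avg_per_bag = np.array([probs.mean() for probs in bag_probs])
    return float(-np.log(avg_per_bag).mean())

# With bags of size 1 this reduces to the usual per-sample loss.
print(bag_loss_avg([np.array([0.2, 0.7, 0.4]), np.array([0.6, 0.5])]))
```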

Back to Top

Related works

Awesome

Awesome Sentiment Attitude Extraction

Back to Top

References

[1] Natalia Loukachevitch, Nicolay Rusnachenko. Extracting Sentiment Attitudes from Analytical Texts. Proceedings of the International Conference on Computational Linguistics and Intellectual Technologies Dialogue-2018 (arXiv:1808.08932) [paper] [code]

[2] Nicolay Rusnachenko, Natalia Loukachevitch. Using Convolutional Neural Networks for Sentiment Attitude Extraction from Analytical Texts. EPiC Series in Language and Linguistics 4, 1-10, 2019 [paper] [code]

[3] Nicolay Rusnachenko, Natalia Loukachevitch. Studying Attention Models in Sentiment Attitude Extraction Task. In: Métais E., Meziane F., Horacek H., Cimiano P. (eds) Natural Language Processing and Information Systems. NLDB 2020. Lecture Notes in Computer Science, vol. 12089. Springer, Cham [paper] [code]

[4] Nicolay Rusnachenko, Natalia Loukachevitch. Attention-Based Neural Networks for Sentiment Attitude Extraction using Distant Supervision. The 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020), June 30 - July 3 (arXiv:2006.13730) [paper] [code]

[5] Nicolay Rusnachenko, Natalia Loukachevitch, Elena Tutubalina. Distant Supervision for Sentiment Attitude Extraction. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019) [paper] [code]

[6] Nicolay Rusnachenko. Language Models Application in Sentiment Attitude Extraction Task. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS), 2021;33(3):199-222 (in Russian) [paper] [code-networks] [code-bert]

[7] Bowen Zhang, Daijun Ding, Liwen Jing. How would Stance Detection Techniques Evolve after the Launch of ChatGPT? [paper]

Back to Top
