
Comments (5)

mandarjoshi90 commented on May 28, 2024

Sorry about the late response. Here's the pipeline. $gap_file_prefix is the path to the GAP file without the .tsv extension, and $vocab_file is the path to the cased BERT vocab file.

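#!/bin/bash
gap_file_prefix=$1   # path to the GAP file, without the .tsv extension
vocab_file=$2        # cased BERT vocab file
# Convert the GAP TSV file into the jsonlines input used by predict.py.
python gap_to_jsonlines.py "$gap_file_prefix.tsv" "$vocab_file"
# Run the bert_base model, with the GPU variable set to 0.
GPU=0 python predict.py bert_base "$gap_file_prefix.jsonlines" "$gap_file_prefix.output.jsonlines"
# Convert the predictions back into a GAP-style TSV file.
python to_gap_tsv.py "$gap_file_prefix.output.jsonlines"
# Score the system TSV against the gold TSV (the scorer is invoked with python2).
python2 ../gap-coreference/gap_scorer.py --gold_tsv "$gap_file_prefix.tsv" --system_tsv "$gap_file_prefix.output.tsv"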
  1. Table 2 is on the test set.
  2. The results seem to be off by 0.3 or so for BERT base. Not sure what changed. The genre has very little effect (up to 0.1 IIRC) on the number. I got to 82.4 with the default genre (bc).
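For readers who want to check the file naming before running anything, here is a minimal dry-run sketch of the same four steps in Python. It only builds and prints the commands from the script above; it does not execute them. The function names `build_gap_pipeline` and `format_step` and the example arguments `gap-test` and `vocab.txt` are hypothetical, chosen for illustration only.

```python
import shlex


def build_gap_pipeline(gap_file_prefix, vocab_file, model="bert_base", gpu="0"):
    """Return the pipeline steps as (extra_env, argv) pairs without running them."""
    # Every file name is derived from the prefix, as in the shell script.
    tsv = gap_file_prefix + ".tsv"
    jsonlines = gap_file_prefix + ".jsonlines"
    output_jsonlines = gap_file_prefix + ".output.jsonlines"
    output_tsv = gap_file_prefix + ".output.tsv"
    return [
        ({}, ["python", "gap_to_jsonlines.py", tsv, vocab_file]),
        ({"GPU": gpu}, ["python", "predict.py", model, jsonlines, output_jsonlines]),
        ({}, ["python", "to_gap_tsv.py", output_jsonlines]),
        ({}, ["python2", "../gap-coreference/gap_scorer.py",
              "--gold_tsv", tsv, "--system_tsv", output_tsv]),
    ]


def format_step(extra_env, argv):
    """Render one step as a shell-style command line for inspection."""
    env_part = " ".join(f"{key}={shlex.quote(value)}" for key, value in extra_env.items())
    cmd_part = " ".join(shlex.quote(arg) for arg in argv)
    return (env_part + " " + cmd_part).strip()


if __name__ == "__main__":
    # "gap-test" and "vocab.txt" are placeholder arguments, not real paths.
    for env, argv in build_gap_pipeline("gap-test", "vocab.txt"):
        print(format_step(env, argv))
```

Passing the prefix without `.tsv` matters because the gold file, the jsonlines files, and the system output file are all named from it.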


HaixiaChai commented on May 28, 2024
  1. All four e2e-coref numbers in the first row are exactly the same as the results in the last row of Table 4 of the paper Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns. However, that paper says those results are on the GAP development set. It is very unlikely that dev and test results would be identical, so could you please confirm that the results in Table 2 really are on the GAP test set?
  2. Thank you for the pipeline and the bert_base result. I also got an Overall score of 82.4, so that is fine. My question, however, is about the c2f_coref model. The pipeline could be the same, but the code would need small changes to adapt it to c2f_coref. Can you reproduce the four numbers for the c2f-coref model?

Thanks a lot.


mandarjoshi90 commented on May 28, 2024
  1. I did not run the e2e-coref model. Looks like we copied from the wrong table for that row. I will amend the paper. We definitely evaluated on the test set for BERT.
  2. I don't have that handy right now, and I'm traveling until mid-November. IIRC the only change needed is to make sure that each element of the sentences field is a natural language sentence (as opposed to a paragraph, as with BERT). This is because c2f-coref contextualizes each sentence independently with LSTMs.

If that doesn't work, I'll take a look after I'm back. Thanks for your patience.
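The sentence-level change described in point 2 could look roughly like the sketch below. It is an illustration under stated assumptions, not the code used for the paper: it assumes each example is a dict whose `sentences` field holds lists of plain word tokens (no BERT subtokens or special tokens), and it uses a naive rule that ends a sentence at `.`, `!`, or `?`. The helper names `split_into_sentences` and `to_sentence_level` are hypothetical.

```python
import json

# Naive sentence-final tokens; an assumption made for illustration only.
SENTENCE_END = {".", "!", "?"}


def split_into_sentences(tokens):
    """Group a flat token list into sentences, closing one after each end token."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:  # keep trailing tokens that lack a sentence-final token
        sentences.append(current)
    return sentences


def to_sentence_level(example):
    """Return a copy of example whose 'sentences' field has one list per sentence."""
    # Flatten first, so token order (and any flat token indices) is preserved.
    flat = [token for segment in example["sentences"] for token in segment]
    converted = dict(example)
    converted["sentences"] = split_into_sentences(flat)
    return converted


if __name__ == "__main__":
    paragraph_example = {"sentences": [["She", "left", ".", "He", "stayed", "."]]}
    print(json.dumps(to_sentence_level(paragraph_example)))
```

Because the tokens keep their order, spans indexed over the flattened document stay valid. Any other field that mirrors the nesting of `sentences` would presumably need the same regrouping, and a real sentence splitter would be more robust than the punctuation rule used here.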


HaixiaChai commented on May 28, 2024
  1. Because gap_to_jsonlines.py also works with the tokenizer set to None, I used it that way. The Overall F1 score I got is 68.5, not the 73.5 reported in your paper. If you could run it again and check which code you used, I would appreciate it very much.


Hafsa-Masroor commented on May 28, 2024

@HaixiaChai
Could you please share the detailed steps to test and evaluate this model on the GAP dataset? (I would like to know what changes were made to the environment setup, commands, data, etc.)
I am new to this research area and want to reproduce the results on both the GAP and OntoNotes datasets. Your help would be greatly appreciated.

Thanks!

