
Comments (13)

takase commented on May 27, 2024

In our experimental configuration, we use a character-based decoder and use @@@@ as a space (i.e., a word-segmentation marker).
(Sorry for the lack of explanation; we applied the BPE code to the source side only.)

Thus, you can pick up the decoder outputs and replace the segmentation marker with a space as follows:

cat decoder_output.txt | sed 's/ //g' | sed 's/\@\@\@\@/ /g'

(decoder_output.txt contains only the output hypotheses)
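To illustrate the two sed passes above on a made-up character-level hypothesis:

```shell
# Spaces separate characters; @@@@ marks word boundaries.
# First sed pass joins the characters, second restores word spaces.
echo 'a f p @@@@ w o r l d' | sed 's/ //g' | sed 's/\@\@\@\@/ /g'
# → afp world
```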

In addition, you can control the output length with the --desired-length option.
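For instance, a length-controlled generation run might look like the sketch below (the data directory and model path are placeholders; the exact invocations used in this thread appear further down):

```shell
# Sketch only: databin/test and the .pt path are placeholder names.
python generate.py databin/test --source-lang article --target-lang title \
  --path trained_model_lrpe_pe_averaged.pt \
  --desired-length 30 --batch-size 32 --beam 5 > output_nbest.txt
```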

from control-length.

takase commented on May 27, 2024

That table reports recall-based ROUGE scores, i.e., the ROUGE-N Average_R values in the output of the evaluation script.


takase commented on May 27, 2024

I think this error is caused by a mismatch in vocabulary sizes.
Did you apply my BPE code (which, as I remember, is attached to the gz file) to your text file?


OrangeInSouth commented on May 27, 2024

So I should first apply the BPE code to my text and then preprocess it, right?


takase commented on May 27, 2024

Yes! Please try that procedure.
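A minimal sketch of that order of operations, assuming the released BPE code file is named bpe.code and applied with subword-nmt (the thread does not name the exact tool or file, so treat both as placeholders):

```shell
# Step 1 (assumption): apply the released BPE merges to the source side only.
subword-nmt apply-bpe -c bpe.code < test.article > test_bpe.article

# Step 2: binarize with the released joint dictionary
# (flags match the preprocess.py invocation shown later in this thread).
python preprocess.py --source-lang article --target-lang title \
  --srcdict dict.joined.txt --tgtdict dict.joined.txt \
  --testpref test_bpe --destdir databin/test
```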


takase commented on May 27, 2024

I added the dictionary file that was used to construct the binary files in my experiments.
If you need it, please download the trained model file again from my Google Drive:
https://drive.google.com/file/d/15Sy8rv6Snw6Nso7T5MxYHSAZDdieXpE7/view?usp=sharing


OrangeInSouth commented on May 27, 2024

That's very kind of you!
I used the dictionary to construct the binary file for the test set, and I got best_15_nbest.txt after running generate.py.
But the generated headlines look terrible; here are two examples:

S-67 the agenda might be global , but the men@@ u will be malaysian when world leaders meet next week for the asia-pacific economic cooperation forum .
T-67 <> si@@ a-@@ <> ac@@ ific <> con@@ om@@ ic <> o@@ operation forum features sp@@ icy cu@@ is@@ ine
H-67 -0.09418036788702011 a f p @@@@ w o r l d @@@@ n e w s
P-67 -0.8212 -0.1580 -0.0270 -0.0621 -0.0602 -0.0257 -0.0338 -0.0303 -0.0275 -0.0282 -0.0313 -0.0246 -0.0250 -0.0321 -0.0256
S-362 the european union wednesday condemned the slaying of four foreign hostages in chechnya and said it would raise the issue with russia 's foreign minister .
T-362 <> <> condem@@ ns slaying of 4 hostages in <> he@@ chn@@ ya@@ ; were telephone engine@@ er@@ s.
H-362 -0.11744093894958496 e u @@@@ c o n d e m n s @@@@ c h e
P-362 -0.3688 -0.0332 -0.0409 -0.2489 -0.0318 -0.0254 -0.0284 -0.0278 -0.0296 -0.0259 -0.0304 -0.0305 -0.5614 -0.0417 -0.0373 -0.3170

Before using the dictionary file you released to construct the binary file, I had preprocessed the train and valid sets of Gigaword and obtained a dictionary of size 15380. However, the dictionary you released has size 16148.
It seems your dictionary cannot be used to preprocess the test set of my copy of Gigaword, which may be why I failed to generate with your checkpoint.


OrangeInSouth commented on May 27, 2024

You mean I should just apply BPE to train.article, valid.article, and test.article, and then preprocess them together with *.title to get a joint dictionary?

After doing that, I get a dictionary of size 74932... (which is much larger than the dictionary you released).

It seems that I misunderstood "We applied the BPE code to the source side only".


takase commented on May 27, 2024

May I confirm your goal?
I thought you planned to apply the pre-trained model to a text for generation.
Or do you want to train a model on a new corpus?


OrangeInSouth commented on May 27, 2024

My purpose is to apply the pre-trained model to a text for generation, rather than to train a model on a new corpus.

However, I get terrible generated headlines even when applying the checkpoint and dictionary you released.

Therefore, I am wondering why I failed.

Below are the two commands I used to preprocess the test file and to generate:

  1. preprocess test.article and test.title:
    python preprocess.py --source-lang article --target-lang title --tgtdict /users6/ychuang/program/python/control-length-master/databin_bpe/test/dict.joined.txt --srcdict /users6/ychuang/program/python/control-length-master/databin_bpe/test/dict.joined.txt --testpref /users6/ychuang/program/python/control-length-master/sumdata/DUC2004/test_bpe --destdir /users6/ychuang/program/python/control-length-master/databin_bpe/test

  2. generate:
    CUDA_VISIBLE_DEVICES=0,1,2,3 python generate.py /users6/ychuang/program/python/control-length-master/databin_bpe/test --source-lang article --target-lang title --path /users6/ychuang/program/python/control-length-master/pretrained_model/trained_model_lrpe_pe_averaged.pt --desired-length 15 --batch-size 32 --beam 5 > best_75_nbest.txt


takase commented on May 27, 2024

I see.
Could you extract the generated sentences from best_75_nbest.txt with the following command?
cat best_75_nbest.txt | grep '^H' | sed 's/^H\-//g' | sort -t ' ' -k1,1 -n | cut -f 3- > output.txt

Then, replace the segmentation markers with spaces:
cat output.txt | sed 's/ //g' | sed 's/\@\@\@\@/ /g'

I expect that you will obtain sentences whose lengths are 15.
(Or, since our pre-trained model is as fragile as other neural encoder-decoders, it might break on unfamiliar inputs.)
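The two extraction steps can be checked end-to-end on toy data (the two hypothesis lines below are fabricated, tab-separated in the H-line format shown earlier in this thread):

```shell
# Two fake hypotheses, deliberately out of order (H-1 before H-0).
printf 'H-1\t-0.11\te u @@@@ c o n d e m n s\nH-0\t-0.09\ta f p @@@@ w o r l d\n' > toy_nbest.txt

# Keep H-lines, strip the "H-" prefix, sort numerically by sentence id,
# and keep only the token field (field 3 of the tab-separated line).
grep '^H' toy_nbest.txt | sed 's/^H\-//g' | sort -t "$(printf '\t')" -k1,1 -n | cut -f 3- > output.txt

# Join the characters and restore word boundaries.
sed 's/ //g' output.txt | sed 's/\@\@\@\@/ /g'
# → afp world
# → eu condemns
```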


OrangeInSouth commented on May 27, 2024

Hi, I have succeeded in generating headlines with the checkpoint and dictionary you released, using the following commands:

  1. CUDA_VISIBLE_DEVICES=0,1,2,3 python generate.py /users6/ychuang/program/python/control-length-master/databin2/test --source-lang article --target-lang title --path /users6/ychuang/program/python/control-length-master/pretrained_model/trained_model_lrpe_pe_averaged.pt --desired-length 75 --batch-size 32 --beam 5 > best_75_nbest.txt
  2. cat best_75_nbest.txt | grep '^H' | sed 's/^H\-//g' | sort -t ' ' -k1,1 -n | cut -f 3- > output.txt
  3. cat output.txt | sed 's/ //g' | sed 's/\@\@\@\@/ /g' > res_75.txt

Then I used the script in the 'eval' directory to evaluate ROUGE on res_75.txt; the command and the result follow:

  1. ./eval.sh /users6/ychuang/program/python/control-length-master/eval/testdata /users6/ychuang/program/python/control-length-master/eval/testdata/input.txt 75 /users6/ychuang/program/python/control-length-master/eval /users6/ychuang/software/ROUGE/

  2. ROUGE result:
    a ROUGE-1 Average_R: 0.31056 (95%-conf.int. 0.29791 - 0.32237)
    a ROUGE-1 Average_P: 0.28526 (95%-conf.int. 0.27364 - 0.29666)
    a ROUGE-1 Average_F: 0.29650 (95%-conf.int. 0.28433 - 0.30811)
    a ROUGE-2 Average_R: 0.11021 (95%-conf.int. 0.10188 - 0.11849)
    a ROUGE-2 Average_P: 0.10084 (95%-conf.int. 0.09295 - 0.10858)
    a ROUGE-2 Average_F: 0.10499 (95%-conf.int. 0.09703 - 0.11296)
    a ROUGE-L Average_R: 0.27212 (95%-conf.int. 0.26072 - 0.28356)
    a ROUGE-L Average_P: 0.25024 (95%-conf.int. 0.23987 - 0.26069)
    a ROUGE-L Average_F: 0.25995 (95%-conf.int. 0.24929 - 0.27060)

The DUC2004 ROUGE scores reported in your paper are:
[table image from the paper]

My question is:
Are the ROUGE scores you present in your paper recall or F1?


OrangeInSouth commented on May 27, 2024

Thank you again!

