laihuiyuan / pre-trained-formality-transfer
Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer (ACL 2021)
Home Page: https://arxiv.org/abs/2105.06947
License: MIT License
Hey, can you explain cal_sc_loss? What is the calculation process of this loss? The y' generated by the decoder has been discretized, so how can we compute a gradient through it to fine-tune the language model?
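For readers puzzling over the same question, the usual resolution is the REINFORCE (policy-gradient) trick: the discrete samples y' never need a gradient themselves; the gradient flows through the log-probabilities the decoder assigned to the sampled tokens, weighted by a scalar reward such as a style classifier's score. A minimal sketch, assuming a PyTorch setup (the function and parameter names here are mine, not the repo's):

```python
import torch
import torch.nn.functional as F

def policy_gradient_loss(logits, sampled_ids, reward, pad_id=1):
    """REINFORCE-style loss: `sampled_ids` are discrete, but the gradient
    flows through the log-probabilities assigned to them, scaled by a
    non-differentiable scalar reward (one value per sequence)."""
    log_probs = F.log_softmax(logits, dim=-1)                   # (batch, seq, vocab)
    tok_lp = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    mask = (sampled_ids != pad_id).to(logits.dtype)             # ignore padding
    seq_lp = (tok_lp * mask).sum(dim=-1)                        # per-sequence log-prob
    # Minimizing -reward * log p raises the probability of high-reward samples.
    return -(reward * seq_lp).mean()
```

The key point: `reward` is a plain constant tensor, so only `log_probs` (and hence the decoder's parameters) receive gradient.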
Hello, since I haven't obtained the GYAFC dataset yet, I can't train the model at present. Could you please provide your trained model? I would like to test other datasets with it.
Thank you very much for your help!
Hi, I found that one of the metrics is style accuracy, which evaluates whether the generated sentence and the reference sentence have the same style label. The GYAFC dataset has four reference sentences, so I wonder how a single label is obtained from the four references?
Hi, cal_bl_loss calls NLTK's sentence_bleu, but sentence_bleu cannot propagate a gradient. How does cal_bl_loss perform gradient descent?
I found that the reward loss uses a negative log-likelihood term. When minimizing the reward loss, doesn't that seem to encourage lower BLEU scores?
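Both questions above have the same answer: the BLEU score is computed on the detached samples, so it enters the loss as a positive constant weight, never as a differentiable term. Minimizing -reward * log p therefore raises the probability of high-BLEU samples rather than pushing BLEU down. A self-contained sketch, where `unigram_precision` is a toy stand-in for NLTK's sentence_bleu (all names here are mine, not the repo's):

```python
import torch
import torch.nn.functional as F
from collections import Counter

def unigram_precision(hyp, ref):
    """Toy stand-in for sentence_bleu: like any such metric, it returns
    a plain Python float, so it carries no gradient."""
    h, r = Counter(hyp), Counter(ref)
    return sum(min(c, r[t]) for t, c in h.items()) / max(len(hyp), 1)

def bleu_reward_loss(logits, sampled_ids, references, pad_id=1):
    # Rewards are computed on .tolist() output: fully detached from the graph.
    rewards = torch.tensor([
        unigram_precision([t for t in hyp if t != pad_id], ref)
        for hyp, ref in zip(sampled_ids.tolist(), references)
    ], dtype=logits.dtype)
    log_probs = F.log_softmax(logits, dim=-1)
    tok_lp = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    mask = (sampled_ids != pad_id).to(logits.dtype)
    seq_lp = (tok_lp * mask).sum(-1)
    # Higher reward => stronger push on that sample's log-probability.
    return -(rewards * seq_lp).mean()
```

Because `rewards` is built from Python floats, no gradient ever needs to flow through the BLEU computation itself.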
Hi, thank you for sharing your code with us. It is very useful.
I have a question regarding the BLEU score.
I noticed that the prediction files should be tokenized in pg.sh before running multi-bleu.perl:
https://github.com/laihuiyuan/Pre-trained-formality-transfer/blob/2c9531cd16bf304773d55ea6b11ffe3d359c6abe/pg.sh#L19
However, since this is not done in pg.sh, I wonder whether the same tokenization should also be applied to the reference files. I found the results are very poor if I do not tokenize the reference files.
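This is expected: multi-bleu.perl matches surface tokens exactly, so hypotheses and references must pass through the same tokenizer or punctuation attached to words will never match. A tiny self-contained illustration (the tokenizer and precision function below are toy stand-ins for utils/tokenizer.py and multi-bleu.perl, not the repo's code):

```python
import re
from collections import Counter

def toy_tokenize(line):
    # Toy stand-in for utils/tokenizer.py: split punctuation off words.
    return re.findall(r"\w+|[^\w\s]", line)

def unigram_precision(hyp, ref):
    # Simplified proxy for what multi-bleu.perl measures at n=1.
    h, r = Counter(hyp), Counter(ref)
    return sum(min(c, r[t]) for t, c in h.items()) / max(len(hyp), 1)

hyp = ["I", "like", "it", "."]          # model output, already tokenized
raw_ref = "I like it.".split()          # untokenized reference: ["I", "like", "it."]
tok_ref = toy_tokenize("I like it.")    # tokenized reference:   ["I", "like", "it", "."]
```

Against `raw_ref` the precision is 0.5 (the tokens "it" and "." never match "it."), while against `tok_ref` it is 1.0, which is exactly the score drop reported above.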
Hi authors,
Thank you very much for the work! It's very helpful. I tried to evaluate the outputs you provided in the output folder.
For EM, I get the following results:
ACC: 0.9031
BLEU: 0.7650
BLEURT: 0.044
I think the result I obtained matches the reported BLEU. However, the reported ACC is 0.929, and the reported BLEURT is 0.274.
I used the following command to calculate the ACC: "python classifier/test.py -dataset em -order 0.0.0",
and I used the following command to calculate the BLEURT: "python utils/cal_bleurt.py data/em/outputs/bart_em.0 data/em/outputs/bart_em.1 data/em/original_ref/formal.ref data/em/original_ref/informal.ref"
"data/em/outputs/bart_em.0" is the tokenized file for the file outputs/bart_em.0.txt, "data/em/outputs/bart_em.1" is the tokenized file for the file outputs/bart_em.1.txt, "data/em/original_ref/formal.ref" is the tokenized file for all the four formal references (formal.ref0, formal.ref1, formal.ref2, formal.ref3, and formal.ref4), and similarly "data/em/original_ref/informal.ref" is the tokenized file for all the four informal references (informal.ref0, informal.ref1, informal.ref2, informal.ref3, and informal.ref4). For the tokenization code, I use the "utils/tokenizer.py" provided by you. The BLEURT is installed from here: https://github.com/google-research/bleurt#installation, and downloaded the "bleurt-base-128" model from https://storage.googleapis.com/bleurt-oss/bleurt-base-128.zip
I also tried using the detokenized files when calculating the BLEURT score, which gives 0.078, still below the reported score.
Please let me know if there is anything I did wrong.
Thank you and looking forward to your reply.
Best regards
Hi, thank you for sharing your code with us. It is very useful.
I have a question regarding the input file. How did you get the test.1 file? Is it the "test/formal" file from the GYAFC dataset?
I believe the test.0 file is the "test/informal" file from the GYAFC dataset. Thanks.
Hi,
Thank you very much for the great work. Could you also provide the scripts used to evaluate performance (scripts to calculate BLEURT, BLEU, ACC, and HM)?
Thank you very much.