Code for thesis: Factuality evaluation in machine translation
Use paws_para.txt to generate templates on the PAWS dataset. To generate templates on the WMT dataset, first unpack news-commentary-v15.de-en.zip, then run direct_translate.py. This produces a file called "desr.txt", in which each line contains a source text, the original reference, and a directly translated reference generated by the code. All templates on the WMT dataset must be created from this file.
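To inspect "desr.txt" before building templates, a small reader like the following may help. This is a minimal sketch, not code from this repository: the names Triple and parse_desr_lines are hypothetical, and the tab separator is an assumption, since the delimiter written by direct_translate.py is not documented here. Adjust sep to match the actual file.

```python
from typing import Iterable, Iterator, NamedTuple


class Triple(NamedTuple):
    """One line of desr.txt (hypothetical container, not part of the repo)."""
    source: str              # source text
    reference: str           # original reference
    direct_translation: str  # directly translated reference


def parse_desr_lines(lines: Iterable[str], sep: str = "\t") -> Iterator[Triple]:
    """Yield one Triple per non-empty line.

    ASSUMPTION: the three fields are separated by `sep` (tab by default).
    Check the output of direct_translate.py and change `sep` if needed.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(sep)
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        yield Triple(*fields)
```

The function accepts any iterable of lines, so it can be used directly on an open file, for example by passing open("desr.txt", encoding="utf-8") and iterating over the resulting triples. Raising an error on malformed lines is a design choice that surfaces a wrong separator early instead of silently producing misaligned fields.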
The texts used in the experiments are in the folder "adversarial text".
Run metric_eval.py to reproduce the experiment results on the Checklist templates. Change the first parameter of the function "get_data()" to get the results for a different phenomenon and dataset.
Run NLI_metric.py to reproduce the NLI output on the Checklist datasets. The results are in the "experiments/nli_result" folder. Run NLI_comp.py to compare different formulas of the NLI metrics. Run combined_metric.py to test the performance of the NLI model combined with other metrics.