
TreeNLG's Introduction

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue

Code and dataset supporting the paper:

Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White and Rajen Subba. Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue.

If you find this code or dataset useful in your research, please consider citing our paper.

Reference

@inproceedings{balakrishnan-etal-2019-constrained,
  title = "Constrained Decoding for Neural {NLG} from Compositional Representations in Task-Oriented Dialogue",
  author = "Balakrishnan, Anusha  and
    Rao, Jinfeng  and
    Upasani, Kartikeya  and
    White, Michael  and
    Subba, Rajen",
  booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
  month = jul,
  year = "2019",
  address = "Florence, Italy",
  publisher = "Association for Computational Linguistics",
  url = "https://www.aclweb.org/anthology/P19-1080",
  doi = "10.18653/v1/P19-1080",
  pages = "831--844"
}

Data

In addition to the weather and enriched E2E challenge datasets from our paper, we release an additional weather_challenge dataset, which contains harder weather scenarios in its train/val/test files. Each response was collected by providing annotators, who are native English speakers, with a user query and a compositional meaning representation (with discourse relations and dialog acts). All of these are made available in our dataset. See our linked paper for more details.

Data Statistics

| Dataset | Train | Val | Test | Disc_Test |
|---|---|---|---|---|
| Weather | 25390 | 3078 | 3121 | 454 |
| Weather_Challenge | 32684 | 3397 | 3382 | - |
| E2E | 42061 | 4672 | 4693 | 230 |

Disc_Test is a more challenging subset of our test set that contains discourse relations; it is also the subset on which we report results in the Disc column of Table 7 in our paper. Note that there are some minor differences between these data statistics and those in the paper; please use the statistics above.

Note: Some responses in the Weather dataset are not provided with a user query (141/17/18/4 for train/val/test/disc_test, respectively). We simply use a "placeholder" token for those missing user queries.

Code

Computing tree accuracy:

python scripts/compute_tree_acc.py -tsv example/seq2seq_out.tsv

This should give you 94.65% tree accuracy. The output file should be tab-separated with columns id, input, and pred.
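As a rough illustration of what tree accuracy measures (this is a sketch of one plausible definition, not the repository's `compute_tree_acc.py` implementation): a prediction counts as correct when its bracketing structure, i.e. the opening nonterminal tokens and closing brackets, exactly matches the input's, ignoring the surface words.

```python
# Hedged sketch of a tree-accuracy metric over the id/input/pred TSV format.
# The bracket-token convention ("[" prefix opens a node, "]" closes) is an
# assumption based on the linearized trees shown in this README.
import csv
import io

def tree_tokens(seq):
    """Keep only structural tokens: opening nonterminals and closing brackets."""
    return [t for t in seq.split() if t.startswith("[") or t == "]"]

def tree_accuracy(tsv_text):
    """Percentage of rows whose pred has the same tree structure as input."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    correct = sum(tree_tokens(r["input"]) == tree_tokens(r["pred"]) for r in rows)
    return correct / len(rows) * 100

sample = (
    "id\tinput\tpred\n"
    "1\t[DG_INFORM [ARG_TEMP 33 ] ]\t[DG_INFORM [ARG_TEMP it is 33 ] ]\n"
    "2\t[DG_INFORM [ARG_TEMP 33 ] ]\t[DG_INFORM 33 ]\n"
)
print(tree_accuracy(sample))  # 50.0 -- row 1 matches structurally, row 2 does not
```

Note how row 1 counts as correct even though the surface words differ; only the tree skeleton is compared.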

Constrained Decoding

fairseq must be installed first; refer to the Requirements and Installation section of fairseq. The code is tested on commit 3822db3 of fairseq. Then get started with the following bash scripts.
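Pinning fairseq to the tested commit might look like the following (a setup sketch; the clone location is illustrative):

```shell
# Clone fairseq and check out the commit this repository was tested against.
git clone https://github.com/pytorch/fairseq.git
cd fairseq
git checkout 3822db3
pip install --editable .
cd ..
```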

bash scripts/prepare.weather.sh
bash scripts/train.weather.lstm.sh
bash scripts/generate.weather.lstm.sh

Results

We noticed that slightly higher numbers than those reported in our paper can be obtained by tuning hyper-parameters. We have therefore updated all the automatic numbers (BLEU and tree accuracy) here; please use the numbers below when citing our results. For tree accuracy, we report the number on the whole test set as well as on two disjoint subsets: the no-discourse subset, which contains examples without any discourse act, and the discourse subset, which contains examples with one or more discourse acts.

Note: The BLEU score is calculated on just the output text, without any of the tree information. We use the BLEU evaluation script provided for the E2E challenge here.
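Stripping the tree information before BLEU scoring can be sketched as follows (an assumed preprocessing step consistent with the note above, not the official evaluation pipeline; the bracket-token convention is inferred from the linearized trees in this README):

```python
# Hedged sketch: drop structural tokens from a linearized tree, keeping
# only the surface words that BLEU is computed over.
def strip_tree(seq):
    """Remove opening nonterminals ("[...") and closing brackets ("]")."""
    return " ".join(t for t in seq.split() if not t.startswith("[") and t != "]")

print(strip_tree("[DG_INFORM [ARG_TEMP it is 33 degrees ] ]"))
# it is 33 degrees
```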

Weather Dataset

| Model | BLEU | TreeAcc (whole) | TreeAcc (no-discourse) | TreeAcc (discourse) |
|---|---|---|---|---|
| S2S-Tree | 76.12 | 94.00 | 96.66 | 86.59 |
| S2S-Constr | 76.60 | 97.15 | 98.76 | 94.45 |

Weather Challenge Dataset

| Model | BLEU | TreeAcc (whole) | TreeAcc (no-discourse) | TreeAcc (discourse) |
|---|---|---|---|---|
| S2S-Tree | 76.75 | 91.10 | 96.62 | 83.3 |
| S2S-Constr | 77.45 | 95.74 | 98.52 | 91.61 |

E2E Dataset

| Model | BLEU | TreeAcc (whole) | TreeAcc (no-discourse) | TreeAcc (discourse) |
|---|---|---|---|---|
| S2S-Tree | 74.58 | 97.06 | 99.68 | 95.28 |
| S2S-Constr | 74.69 | 99.25 | 99.89 | 97.78 |

License

TreeNLG is released under CC-BY-NC-4.0, see LICENSE for details.

TreeNLG's People

Contributors

jinfengr · litesaber15 · mwhite14850 · znculee


TreeNLG's Issues

Evaluation Scripts not working on MacOS

I have trained a model and attempted to evaluate using bash scripts/generate.weather.lstm.sh, but I get various errors, since it seems this was tested on a GNU/Linux system. My first error was that readlink does not have the -f option. This was solved by using greadlink, and re-running now shows the tree score but throws new grep errors:

Tree accuracy: 99.84 (3116 / 3121)                                                                                                                                                      
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
        [-e pattern] [-f file] [--binary-files=value] [--color=when]
        [--context[=num]] [--directories=action] [--label] [--line-buffered]
        [--null] [pattern] [file ...]
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
        [-e pattern] [-f file] [--binary-files=value] [--color=when]
        [--context[=num]] [--directories=action] [--label] [--line-buffered]
        [--null] [pattern] [file ...]
EOF encountered in a comment.
Failure rate: 0.00 (0 / 0)

replacing failures from /Users/nguyen/src/npp/TreeNLG/checkpoints/weather.lstm/gen.txt
/Users/nguyen/src/npp/TreeNLG/checkpoints/weather.lstm/gen.txt does not exist


Any advice or help solving this issue so evaluation works on macOS would be great.

Issue when generating samples

When I run the command below:
bash scripts/generate.weather.lstm.sh

I get this error:

 File "/home/mrigank/miniconda3/bin/fairseq-generate", line 33, in <module>
   sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')())
 File "/home/mrigank/code/TreeNLG/fairseq/fairseq_cli/generate.py", line 285, in cli_main
   main(args)
 File "/home/mrigank/code/TreeNLG/fairseq/fairseq_cli/generate.py", line 38, in main
   return _main(args, sys.stdout)
 File "/home/mrigank/code/TreeNLG/fairseq/fairseq_cli/generate.py", line 126, in _main
   generator = task.build_generator(models, args)
TypeError: build_generator() takes 2 positional arguments but 3 were given
Args ::  Namespace(order_constr=False, tsv='scripts/tmp/tsv')                                                                                                                                                
Number of lines:  []
Traceback (most recent call last):
 File "compute_tree_acc.py", line 31, in <module>
   correct / len(lines) * 100, correct, len(lines)
ZeroDivisionError: division by zero
Runtime error (func=(main), adr=3): Divide by zero
Failure rate: 0.00 (0 / 0)

replacing failures from /home/mrigank/code/TreeNLG/checkpoints/weather.lstm/gen.txt
/home/mrigank/code/TreeNLG/checkpoints/weather.lstm/gen.txt does not exist

Generating BLEU Score

Are there any instructions on generating the BLEU score for the weather task? After generating, my output from running the commands in the readme is the tree scores, but I don't get any .tsv with which to run the BLEU evaluation.

I am also only getting gen.constr.txt as output, rather than a tsv file or the expected gen.txt. Any guidance would be great; until then I will try to find my own solution as well.

Two functions, 'sequence_to_tree' and 'scenario_to_tree', seem inconsistent with their comments

sequence_to_tree's comment says:
[DG_INFORM_2 supposed to ARG:CONDITION_NOT ]
=>
NLGNode("root", children={
NLGNode("[DG_INFORM_2", children={NLGNode("ARG:CONDITION_NOT")})
})

but actual result is:
NLGNode("root", children={
NLGNode("[DG_INFORM_2")
})

scenario_to_tree's comment says:
[DG_INFORM: [ARG_TASK: get_forecast , ARG_TEMP_HIGH: 33 ] ]
=>
NLGNode("root", children={
NLGNode("[1_DG_INFORM", children={
NLGNode("ARG_TEMP_HIGH", children={
NLGNode("33")
}),
NLGNode("ARG_TASK", children={
NLGNode("get_forecast")
})
})
})

but actual result is:
NLGNode("root", children={
NLGNode("[1_DG_INFORM", children={
NLGNode("ARG_TASK", children={
NLGNode("get_forecast , ARG_TEMP_HIGH: 33")
})
})
})
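For reference, the linearized-tree format discussed in this issue can be parsed with a small stack-based sketch. NLGNode below is a hypothetical stand-in for the repository's class, and the bracket-token convention ("[" prefix opens a node, "]" closes) is an assumption based on the examples above:

```python
# Hedged sketch of parsing "[DG_INFORM [ARG_TEMP 33 ] ]" into nested nodes.
from dataclasses import dataclass, field

@dataclass
class NLGNode:
    label: str
    children: list = field(default_factory=list)

def parse(seq):
    """Build a tree under a synthetic "root" node using an open-node stack."""
    root = NLGNode("root")
    stack = [root]
    for tok in seq.split():
        if tok.startswith("["):          # opening nonterminal: descend
            node = NLGNode(tok)
            stack[-1].children.append(node)
            stack.append(node)
        elif tok == "]":                 # closing bracket: ascend
            stack.pop()
        else:                            # surface word: leaf under current node
            stack[-1].children.append(NLGNode(tok))
    return root

tree = parse("[DG_INFORM [ARG_TEMP 33 ] ]")
print(tree.children[0].label)                # [DG_INFORM
print(tree.children[0].children[0].label)    # [ARG_TEMP
```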

Thanks!

Availability of the user raw query

Hello,

Thanks for providing these very useful resources. Is there any chance that the users' raw queries would be available as part of the datasets (or a subset of the datasets)? I've seen that the seq2seq_out.tsv example has such questions similar to Table 6 in the paper. So I was wondering if this would be possible to include. That would be highly useful.

Thanks!
Hamza

how to change a sentence to a tree?

Hello,
I need to use my own dataset to train the model or run compute_tree_acc.py, so my question is how to convert the data into the tree format. Also, how many attribute types are there? I know query annotations are parsed automatically by rules; can I get these rules?

Using the data for commercial purposes

Hello,
Thanks for the paper and the code.
I would like to know if it is really possible to use just the data for commercial purposes.
If this means I'd have to buy a license, how much would it cost, and how would I go about buying it?

Thanks
