google-research-datasets / dstc8-schema-guided-dialogue

The Schema-Guided Dialogue Dataset

License: Creative Commons Attribution Share Alike 4.0 International

Python 100.00%
dataset dialogue dialogue-systems assistant nlp-machine-learning

dstc8-schema-guided-dialogue's Introduction

The Schema-Guided Dialogue Dataset

Contact - [email protected]

Overview

The Schema-Guided Dialogue (SGD) dataset consists of over 20k annotated multi-domain, task-oriented conversations between a human and a virtual assistant. These conversations involve interactions with services and APIs spanning 20 domains, such as banks, events, media, calendar, travel, and weather. For most of these domains, the dataset contains multiple different APIs, many of which have overlapping functionalities but different interfaces, which reflects common real-world scenarios. The wide range of available annotations can be used for intent prediction, slot filling, dialogue state tracking, policy imitation learning, language generation, and user simulation learning, among other tasks for developing large-scale virtual assistants. Additionally, the dataset contains unseen domains and services in the evaluation set to quantify the performance in zero-shot or few-shot settings.

Schema-Guided Dialogue - eXtended (SGD-X) is a benchmark for measuring the robustness of dialogue systems to linguistic variations in schemas. SGD-X extends the SGD dataset with 5 crowdsourced variants for every schema, where variants are semantically similar yet stylistically diverse. Models trained on SGD are evaluated on SGD-X to measure how well they can generalize in a real-world setting, where a large variety of linguistic styles exist.

The datasets are provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of this dataset.

Updates

10/19/2021 - SGD-X schemas for measuring robustness to linguistic variations in schemas released, along with a script to convert dialogue annotations according to the new schemas.

07/05/2020 - Test set annotations released. User actions and service calls made during the dialogue are also released for all dialogues.

10/14/2019 - DSTC8 challenge concluded. Details about the submissions to the challenge may be found in the DSTC8 overview paper.

10/07/2019 - Test dataset released without the dialogue state annotations.

07/23/2019 - Train and dev sets are publicly released as part of DSTC8 challenge.

Data

The SGD dataset consists of schemas outlining the interface of different APIs and annotated dialogues. The dialogues were generated with the help of a dialogue simulator and paid crowd-workers. The data collection approach is summarized in this paper.

The SGD-X dataset consists of 5 linguistic variants of every schema in the original SGD dataset. Linguistic variants were written by hundreds of paid crowd-workers. In the SGD-X directory, v1 represents the variant closest to the original schemas and v5 the farthest in terms of linguistic distance. To evaluate model performance on SGD-X schemas, dialogues must be converted using the script generate_sgdx_dialogues.py.

Schema Representation

A service or API is essentially a set of functions (called intents), each taking a set of parameters (called slots). A schema is a normalized representation of the interface exposed by a service/API. In addition, the schema also includes natural language descriptions of the included functions and their parameters to outline the semantics of each element. The SGD schemas were manually generated by the dataset creators, and SGD-X schema variants were created by having crowd-workers paraphrase the original schemas. Each schema is represented as a json object containing the following fields:

  • service_name* - A unique name for the service.
  • description - A natural language description of the tasks supported by the service.
  • slots - A list of slots/attributes corresponding to the entities present in the service. Each slot contains the following fields:
    • name - The name of the slot.
    • description - A natural language description of the slot.
    • is_categorical - A boolean value. If true, the slot has a fixed set of possible values.
    • possible_values - List of possible values the slot can take on. If the slot is categorical, this lists all the possible values. If the slot is not categorical, it is either an empty list or a small sample of all the values the slot can take on.
  • intents - The list of intents/tasks supported by the service. Each intent contains the following fields:
    • name - The name of the intent.
    • description - A natural language description of the intent.
    • is_transactional - A boolean value. If true, the underlying API call is transactional (e.g., a booking or a purchase), as opposed to a search call.
    • required_slots - A list of slot names whose values must be provided before executing an API call.
    • optional_slots - A dictionary mapping slot names to the default value taken by the slot. These slots are optionally specified by the user, and the user may override the default value. An empty default value allows that slot to take any value by default.
    • result_slots - A list of slot names which are present in the results returned by a call to the service or API.

*service_names follow the form "<domain name>_<number>" (e.g. Banks_2). The number is used to disambiguate services from the same domain. SGD-X variant schemas have two-digit numbers, where the first digit is copied from the original schema, and the second digit is the SGD-X variant number. For example, the v1 variant of Banks_2 is Banks_21.
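The schema fields and the naming convention above can be illustrated with a small sketch. The schema below is an invented, abbreviated example (its descriptions, slot names and default value are assumptions, not copied from the dataset), and `parse_service_name` and `undeclared_slots` are hypothetical helpers rather than part of the released code.

```python
import re

# Hypothetical, abbreviated schema for illustration only; the real schemas
# ship with the dataset.
EXAMPLE_SCHEMA = {
    "service_name": "Banks_2",
    "description": "Service to manage bank accounts",
    "slots": [
        {"name": "account_type", "description": "Type of account",
         "is_categorical": True, "possible_values": ["checking", "savings"]},
        {"name": "balance", "description": "Account balance",
         "is_categorical": False, "possible_values": []},
    ],
    "intents": [
        {"name": "CheckBalance", "description": "Check the account balance",
         "is_transactional": False,
         "required_slots": [],
         "optional_slots": {"account_type": "checking"},
         "result_slots": ["account_type", "balance"]},
    ],
}


def parse_service_name(service_name):
    """Split '<domain name>_<number>' into (domain, original number, variant).

    variant is None for an original SGD schema and the SGD-X variant number
    for a two-digit suffix, following the convention described above.
    """
    match = re.fullmatch(r"(.+)_(\d{1,2})", service_name)
    if match is None:
        raise ValueError(f"unexpected service name: {service_name!r}")
    domain, number = match.groups()
    if len(number) == 1:
        return domain, int(number), None
    return domain, int(number[0]), int(number[1])


def undeclared_slots(schema):
    """Return slot names used by intents but missing from the 'slots' list."""
    declared = {slot["name"] for slot in schema["slots"]}
    used = set()
    for intent in schema["intents"]:
        used.update(intent["required_slots"])
        used.update(intent["optional_slots"])  # dict keys are slot names
        used.update(intent["result_slots"])
    return sorted(used - declared)


print(parse_service_name("Banks_21"))
print(undeclared_slots(EXAMPLE_SCHEMA))
```

A check like `undeclared_slots` is only a sanity test on the schema structure; it says nothing about whether the natural language descriptions are accurate.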

Dialogue Representation

Dialogues are represented as a list of turns, where each turn contains either a user or system utterance. The annotations for a turn are grouped into frames, where each frame corresponds to a single service. Each turn in the single-domain dataset contains exactly one frame. In multi-domain datasets, some turns may have multiple frames.

Each dialogue is represented as a json object with the following fields:

  • dialogue_id - A unique identifier for a dialogue.
  • services - A list of services present in the dialogue.
  • turns - A list of annotated system or user utterances.

Each turn consists of the following fields:

  • speaker - The speaker for the turn. Possible values are "USER" or "SYSTEM".
  • utterance - A string containing the natural language utterance.
  • frames - A list of frames, where each frame contains annotations for a single service.

Each frame consists of the following fields:

  • service - The name of the service corresponding to the frame. The slots and intents used in the following fields are taken from the schema of this service.
  • slots - A list of slot spans in the utterance, only provided for non-categorical slots. Each slot span contains the following fields:
    • slot - The name of the slot.
    • start - The index of the starting character in the utterance corresponding to the slot value.
    • exclusive_end - The index of the character just after the last character corresponding to the slot value in the utterance. In python, utterance[start:exclusive_end] gives the slot value.
  • actions - A list of actions taken by the speaker of the turn. Each action has the following fields:
    • act - The type of action. The lists of all possible system and user acts are given below.
    • slot (optional) - A slot argument for some of the actions.
    • values (optional) - A list of values assigned to the slot. If the values list is non-empty, then the slot must be present.
    • canonical_values (optional) - The values in their canonicalized form as used by the service. It is a list of strings of the same length as values.
  • service_call (system turns only, optional) - The request sent to the service. It consists of the following fields:
    • method - The name of the intent or function of the service or API being executed.
    • parameters - A dictionary mapping slot name (all required slots and possibly some optional slots) to a value in its canonicalized form.
  • service_results (system turns only, optional) - A list of entities containing the results obtained from the service. It is only available for turns in which a service call is made. Each entity is represented as a dictionary mapping a slot name to a string containing its canonical value.
  • state (user turns only) - The dialogue state corresponding to the service. It consists of the following fields:
    • active_intent - The intent corresponding to the service of the frame which is currently being fulfilled by the system. It takes the value "NONE" if none of the intents are active.
    • requested_slots - A list of slots requested by the user in the current turn.
    • slot_values - A dictionary mapping slot name to a list of strings. For categorical slots, this list contains a single value assigned to the slot. For non-categorical slots, all the values in this list are spoken variations of each other and are equivalent (e.g., "6 pm", "six in the evening", "evening at 6" etc.).
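As a minimal sketch of how the turn, frame and slot span fields fit together, the snippet below recovers slot values from character spans with `utterance[start:exclusive_end]`. The turn is taken from dialogue 93_00003 as quoted in the issue list of this repository; `span_values` is a hypothetical helper name.

```python
# One user turn from dialogue 93_00003 (Travel_1 service).
turn = {
    "speaker": "USER",
    "utterance": "Please look around London, UK.",
    "frames": [
        {
            "service": "Travel_1",
            "slots": [{"slot": "location", "start": 19, "exclusive_end": 29}],
            "state": {
                "active_intent": "FindAttractions",
                "requested_slots": [],
                "slot_values": {"location": ["London, UK"]},
            },
        }
    ],
}


def span_values(turn):
    """Map (service, slot name) to the utterance text covered by each span."""
    utterance = turn["utterance"]
    values = {}
    for frame in turn["frames"]:
        for span in frame["slots"]:
            text = utterance[span["start"]:span["exclusive_end"]]
            values[(frame["service"], span["slot"])] = text
    return values


for (service, slot), text in span_values(turn).items():
    # Compare the span text with the values recorded in the dialogue state.
    state_values = turn["frames"][0]["state"]["slot_values"].get(slot, [])
    print(service, slot, repr(text), text in state_values)
```

Because spans are only provided for non-categorical slots, categorical values have to be read from `state` or `actions` instead.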

List of possible system acts:

  • INFORM - Inform the value for a slot to the user. The slot and values fields in the corresponding action are always non-empty.
  • REQUEST - Request the value of a slot from the user. The corresponding action always contains a slot, but values are optional. When values are present, they are used as examples for the user, e.g., "Would you like to eat indian or chinese food or something else?"
  • CONFIRM - Confirm the value of a slot before making a transactional service call.
  • OFFER - Offer a certain value for a slot to the user. The corresponding action always contains a slot and a list of values for that slot offered to the user.
  • NOTIFY_SUCCESS - Inform the user that their request was successful. Slot and values are always empty in the corresponding action.
  • NOTIFY_FAILURE - Inform the user that their request failed. Slot and values are always empty in the corresponding action.
  • INFORM_COUNT - Inform the number of items found that satisfy the user's request. The corresponding action always has "count" as the slot, and a single element in values for the number of results obtained by the system.
  • OFFER_INTENT - Offer a new intent to the user, e.g., "Would you like to reserve a table?". The corresponding action always has "intent" as the slot, and a single value containing the intent being offered. The offered intent belongs to the service corresponding to the frame.
  • REQ_MORE - Asking the user if they need anything else. Slot and values are always empty in the corresponding action.
  • GOODBYE - End the dialogue. Slot and values are always empty in the corresponding action.

List of possible user acts:

  • INFORM_INTENT - Express the desire to perform a certain task to the system. The action always has "intent" as the slot and a single value containing the intent being informed.
  • NEGATE_INTENT - Negate the intent which has been offered by the system.
  • AFFIRM_INTENT - Agree to the intent which has been offered by the system.
  • INFORM - Inform the value of a slot to the system. The slot and values fields in the corresponding action are always non-empty.
  • REQUEST - Request the value of a slot from the system. The corresponding action always contains a slot parameter. It may optionally contain a value, in which case, the user asks the system if the slot has the specified value.
  • AFFIRM - Agree to the system's proposition. Slot and values are always empty.
  • NEGATE - Deny the system's proposal. Slot and values are always empty.
  • SELECT - Select a result being offered by the system. The corresponding action may either contain no parameters, in which case all the values proposed by the system are being accepted, or it may contain a slot and value parameters, in which case the specified slot and value are being accepted.
  • REQUEST_ALTS - Ask for more results besides the ones offered by the system. Slot and values are always empty.
  • THANK_YOU - Thank the system. Slot and values are always empty.
  • GOODBYE - End the dialogue. Slot and values are always empty.
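The constraints stated in the two act lists above can be turned into a simple consistency check. This is a sketch under the assumption that an action is a dictionary with `act`, `slot` and `values` keys as described earlier; `check_action` and the constant names are hypothetical, and only the constraints that the lists state explicitly are encoded.

```python
# Acts whose slot and values are always empty, per the lists above.
EMPTY_ARG_ACTS = {
    "SYSTEM": {"NOTIFY_SUCCESS", "NOTIFY_FAILURE", "REQ_MORE", "GOODBYE"},
    "USER": {"AFFIRM", "NEGATE", "REQUEST_ALTS", "THANK_YOU", "GOODBYE"},
}
# Acts whose slot and values are always non-empty (system and user INFORM).
NON_EMPTY_ARG_ACTS = {"INFORM"}
# Acts that always contain a slot; values may or may not be present.
SLOT_REQUIRED_ACTS = {"REQUEST", "OFFER"}
# Acts that use a fixed slot name and carry exactly one value.
FIXED_SLOT_ACTS = {
    "INFORM_COUNT": "count",
    "OFFER_INTENT": "intent",
    "INFORM_INTENT": "intent",
}


def check_action(speaker, action):
    """Return a list of problems found in one action ([] if none)."""
    act = action["act"]
    slot = action.get("slot", "")
    values = action.get("values", [])
    problems = []
    if values and not slot:
        problems.append("values given without a slot")
    if act in EMPTY_ARG_ACTS[speaker] and (slot or values):
        problems.append(f"{act}: slot and values should be empty")
    if act in NON_EMPTY_ARG_ACTS and not (slot and values):
        problems.append(f"{act}: slot and values should be non-empty")
    if act in SLOT_REQUIRED_ACTS and not slot:
        problems.append(f"{act}: slot is required")
    expected_slot = FIXED_SLOT_ACTS.get(act)
    if expected_slot is not None and (slot != expected_slot or len(values) != 1):
        problems.append(f"{act}: expected slot '{expected_slot}' with one value")
    return problems


print(check_action("SYSTEM", {"act": "REQ_MORE", "slot": "", "values": []}))
print(check_action("USER", {"act": "INFORM", "slot": "", "values": []}))
```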

License

The SGD and SGD-X datasets are released under the CC BY-SA 4.0 license. For the full license, see LICENSE.txt. Please cite the following papers if you use the datasets in your work:

SGD

@inproceedings{rastogi2020towards,
  title={Towards scalable multi-domain conversational agents: The schema-guided dialogue dataset},
  author={Rastogi, Abhinav and Zang, Xiaoxue and Sunkara, Srinivas and Gupta, Raghav and Khaitan, Pranav},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8689--8696},
  year={2020}
}

SGD-X

@inproceedings{lee2022sgd,
  title={SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems},
  author={Lee, Harrison and Gupta, Raghav and Rastogi, Abhinav and Cao, Yuan and Zhang, Bin and Wu, Yonghui},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={36},
  number={10},
  pages={10938--10946},
  year={2022}
}

Dataset Metadata

The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.

property        value
name            Schema-Guided Dialogue Dataset
alternateName   SGD dataset
url
sameAs          https://github.com/google-research-datasets/dstc8-schema-guided-dialogue
description     The dataset consists of conversations between a virtual assistant and a user ranging over a variety of domains including Travel, Events, Payment, Media, Restaurants, Weather etc. Annotations for natural language understanding, dialogue state tracking, policy learning, natural language generation and user simulation learning are also included.
provider        name: Google
                sameAs: https://en.wikipedia.org/wiki/Google
citation        https://identifiers.org/arxiv:1909.05855

dstc8-schema-guided-dialogue's People

Contributors

abhirast, amolwankhede, chrisgorgo, harjlee, limiao06, ssunkara1, xiaoxuezang

dstc8-schema-guided-dialogue's Issues

Is ‘requested slots F1’ required?

I did not predict requested slots since this metric is optional to report in the final paper according to README.md. However, when I submit my results, I have to report this metric. So, is ‘requested slots F1’ required or optional?

Annotation errors for non-categorical slots in dev set

Hi, I found some annotation errors for non-categorical slots, mostly for "destination_city/origin_city". Some examples:

  • 1_00041 USER I want to fly from Mexico City to Seattle on the 11th of March. destination_city ['Mexico City']
  • 1_00108 USER I want to go from Los Angeles to Delhi origin_city ['Delhi']
  • 1_00116 USER I'm going to New York from Seattle. origin_city ['New York']
  • 1_00116 USER I'm going to New York from Seattle. destination_city ['Seattle']
  • 1_00041 USER I want to fly from Mexico City to Seattle on the 11th of March. origin_city ['Seattle']
  • 1_00108 USER I want to go from Los Angeles to Delhi destination_city ['Los Angeles']

There may also be annotation errors in the training set, but I did not check it.

Dialogue simulator release

Hi,

I think the dialogue simulator is one of the most valuable contributions of the dataset to dialogue system research.
Since the release of the dialogue simulator code seems to be on the roadmap, as mentioned before,
I believe many people are waiting for the code in order to reproduce the work and contribute to this important topic.

Any update on the release plan of dialogue simulator code?

Best

label error

Dear organizers,
In dialogue 93_00003, we cannot find any pets_welcome information in the utterances. Does this dialogue contain a labelling mistake?
{ "dialogue_id": "93_00003", "services": [ "Travel_1", "Hotels_3" ], "turns": [ { "frames": [ { "service": "Travel_1", "slots": [], "state": { "active_intent": "FindAttractions", "requested_slots": [], "slot_values": {} } } ], "speaker": "USER", "utterance": "I'd like to look up some attractions to see." }, { "frames": [ { "actions": [ { "act": "REQUEST", "slot": "location", "values": [] } ], "service": "Travel_1", "slots": [] } ], "speaker": "SYSTEM", "utterance": "Okay. Where should I search?" }, { "frames": [ { "service": "Travel_1", "slots": [ { "exclusive_end": 29, "slot": "location", "start": 19 } ], "state": { "active_intent": "FindAttractions", "requested_slots": [], "slot_values": { "location": [ "London, UK" ] } } } ], "speaker": "USER", "utterance": "Please look around London, UK." }, { "frames": [ { "actions": [ { "act": "OFFER", "slot": "attraction_name", "values": [ "30 St Mary Axe (The Gherkin)" ] }, { "act": "OFFER", "slot": "category", "values": [ "Historical Landmark" ] } ], "service": "Travel_1", "slots": [ { "exclusive_end": 57, "slot": "attraction_name", "start": 29 } ] } ], "speaker": "SYSTEM", "utterance": "I've heard good things about 30 St Mary Axe (The Gherkin). It is a Historical Landmark." }, { "frames": [ { "service": "Travel_1", "slots": [], "state": { "active_intent": "FindAttractions", "requested_slots": [], "slot_values": { "good_for_kids": [ "True" ], "location": [ "London, UK" ] } } } ], "speaker": "USER", "utterance": "That sounds nice. Is there anything else that's kid-friendly?" }, { "frames": [ { "actions": [ { "act": "OFFER", "slot": "attraction_name", "values": [ "Alexandra Palace" ] }, { "act": "OFFER", "slot": "category", "values": [ "Performing Arts Venue" ] }, { "act": "INFORM_COUNT", "slot": "count", "values": [ "9" ] } ], "service": "Travel_1", "slots": [ { "exclusive_end": 69, "slot": "attraction_name", "start": 53 } ] } ], "speaker": "SYSTEM", "utterance": "I've found 9 other attractions. 
First on the list is Alexandra Palace. It is a Performing Arts Venue." }, { "frames": [ { "service": "Travel_1", "slots": [], "state": { "active_intent": "FindAttractions", "requested_slots": [ "free_entry" ], "slot_values": { "good_for_kids": [ "True" ], "location": [ "London, UK" ] } } } ], "speaker": "USER", "utterance": "Is that a free attraction?" }, { "frames": [ { "actions": [ { "act": "INFORM", "slot": "free_entry", "values": [ "False" ] } ], "service": "Travel_1", "slots": [] } ], "speaker": "SYSTEM", "utterance": "No, there is an entry fee there." }, { "frames": [ { "service": "Hotels_3", "slots": [], "state": { "active_intent": "SearchHotel", "requested_slots": [], "slot_values": { "location": [ "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } }, { "service": "Travel_1", "slots": [], "state": { "active_intent": "FindAttractions", "requested_slots": [], "slot_values": { "good_for_kids": [ "True" ], "location": [ "London, UK" ] } } } ], "speaker": "USER", "utterance": "Oh, okay. I need to also find a hotel out there. I need to get 3 different rooms." }, { "frames": [ { "actions": [ { "act": "OFFER", "slot": "hotel_name", "values": [ "45 Park Lane" ] }, { "act": "OFFER", "slot": "average_rating", "values": [ "4.7" ] } ], "service": "Hotels_3", "slots": [ { "exclusive_end": 42, "slot": "hotel_name", "start": 30 }, { "exclusive_end": 66, "slot": "average_rating", "start": 63 } ] } ], "speaker": "SYSTEM", "utterance": "Sure. What do you think about 45 Park Lane. It has a rating of 4.7." }, { "frames": [ { "service": "Hotels_3", "slots": [], "state": { "active_intent": "SearchHotel", "requested_slots": [], "slot_values": { "hotel_name": [ "45 Park Lane" ], "location": [ "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "That sounds just fine." 
}, { "frames": [ { "actions": [ { "act": "REQ_MORE", "slot": "", "values": [] } ], "service": "Hotels_3", "slots": [] } ], "speaker": "SYSTEM", "utterance": "Is there anything else I can do to help?" }, { "frames": [ { "service": "Hotels_3", "slots": [ { "exclusive_end": 46, "slot": "check_out_date", "start": 33 } ], "state": { "active_intent": "ReserveHotel", "requested_slots": [], "slot_values": { "check_out_date": [ "11th of March" ], "hotel_name": [ "45 Park Lane" ], "location": [ "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "I want to book a hotel until the 11th of March." }, { "frames": [ { "actions": [ { "act": "REQUEST", "slot": "check_in_date", "values": [] } ], "service": "Hotels_3", "slots": [] } ], "speaker": "SYSTEM", "utterance": "When do you expect to check in?" }, { "frames": [ { "service": "Hotels_3", "slots": [ { "exclusive_end": 38, "slot": "check_in_date", "start": 22 } ], "state": { "active_intent": "ReserveHotel", "requested_slots": [], "slot_values": { "check_in_date": [ "Monday next week" ], "check_out_date": [ "11th of March" ], "hotel_name": [ "45 Park Lane" ], "location": [ "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "I'll need it starting Monday next week." 
}, { "frames": [ { "actions": [ { "act": "CONFIRM", "slot": "hotel_name", "values": [ "45 Park Lane" ] }, { "act": "CONFIRM", "slot": "location", "values": [ "London" ] }, { "act": "CONFIRM", "slot": "check_in_date", "values": [ "March 4th" ] }, { "act": "CONFIRM", "slot": "check_out_date", "values": [ "March 11th" ] }, { "act": "CONFIRM", "slot": "number_of_rooms", "values": [ "3" ] } ], "service": "Hotels_3", "slots": [ { "exclusive_end": 64, "slot": "hotel_name", "start": 52 }, { "exclusive_end": 74, "slot": "location", "start": 68 }, { "exclusive_end": 93, "slot": "check_in_date", "start": 84 }, { "exclusive_end": 115, "slot": "check_out_date", "start": 105 } ] } ], "speaker": "SYSTEM", "utterance": "Okay, let me make sure. You want to book 3 rooms at 45 Park Lane in London starting March 4th and ending March 11th." }, { "frames": [ { "service": "Hotels_3", "slots": [], "state": { "active_intent": "ReserveHotel", "requested_slots": [ "phone_number" ], "slot_values": { "check_in_date": [ "March 4th", "Monday next week" ], "check_out_date": [ "11th of March", "March 11th" ], "hotel_name": [ "45 Park Lane" ], "location": [ "London", "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "That is correct. What's the phone number for the hotel?" }, { "frames": [ { "actions": [ { "act": "INFORM", "slot": "phone_number", "values": [ "+44 20 7493 4545" ] }, { "act": "NOTIFY_SUCCESS", "slot": "", "values": [] } ], "service": "Hotels_3", "slots": [ { "exclusive_end": 58, "slot": "phone_number", "start": 42 } ] } ], "speaker": "SYSTEM", "utterance": "I made your reservation and the number is +44 20 7493 4545." 
}, { "frames": [ { "service": "Hotels_3", "slots": [], "state": { "active_intent": "ReserveHotel", "requested_slots": [], "slot_values": { "check_in_date": [ "March 4th", "Monday next week" ], "check_out_date": [ "11th of March", "March 11th" ], "hotel_name": [ "45 Park Lane" ], "location": [ "London", "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "Thanks for the help." }, { "frames": [ { "actions": [ { "act": "REQ_MORE", "slot": "", "values": [] } ], "service": "Hotels_3", "slots": [] } ], "speaker": "SYSTEM", "utterance": "Is there anything else you need?" }, { "frames": [ { "service": "Hotels_3", "slots": [], "state": { "active_intent": "NONE", "requested_slots": [], "slot_values": { "check_in_date": [ "March 4th", "Monday next week" ], "check_out_date": [ "11th of March", "March 11th" ], "hotel_name": [ "45 Park Lane" ], "location": [ "London", "London, UK" ], "number_of_rooms": [ "3" ], "pets_welcome": [ "True" ] } } } ], "speaker": "USER", "utterance": "No. That will be all." }, { "frames": [ { "actions": [ { "act": "GOODBYE", "slot": "", "values": [] } ], "service": "Hotels_3", "slots": [] } ], "speaker": "SYSTEM", "utterance": "Have a wonderful day!" } ] },

Couldn't reproduce the MultiWOZ 2.1 results

Hello, I ran the baseline code you provided. The joint goal accuracy only reaches 0.12 at the 50000 checkpoint. When I trained further, to 100000 or even 200000, the joint goal accuracy dropped to near zero. I evaluated the predicted state using your evaluation script. Can you give me some hints about how you trained the model (for example, which global step reproduces the result)?
I followed the instructions in the README.md directly.

Slots not in required/optional list do appear in states

Hi, I found that some slots that are not in the required/optional lists also appear in states, for example (Buses_1, FindBus, leaving_time in dial_2_00079). I hope you can publish a cleaned version of the dataset soon (this should be easy to do). Thanks!
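A check of the kind described in this report could be sketched as follows. The schema and frame below are abbreviated, hypothetical stand-ins (the slot lists and values are assumptions, not copied from the dataset), and `slots_outside_intent` is an illustrative helper, not part of the released scripts.

```python
# Hypothetical, abbreviated schema: only the fields the check needs.
EXAMPLE_SCHEMA = {
    "service_name": "Buses_1",
    "intents": [
        {
            "name": "FindBus",
            "required_slots": ["from_location", "to_location", "leaving_date"],
            "optional_slots": {"travelers": "1"},
        }
    ],
}

# Hypothetical user-turn frame whose state contains an extra slot.
EXAMPLE_FRAME = {
    "service": "Buses_1",
    "slots": [],
    "state": {
        "active_intent": "FindBus",
        "requested_slots": [],
        "slot_values": {
            "from_location": ["Sacramento"],
            "to_location": ["Fresno"],
            "leaving_time": ["8:30 am"],
        },
    },
}


def slots_outside_intent(schema, frame):
    """Return state slots not listed as required/optional for the active intent."""
    state = frame["state"]
    intents = {intent["name"]: intent for intent in schema["intents"]}
    intent = intents.get(state["active_intent"])
    if intent is None:  # "NONE" or an unknown intent: nothing to compare with
        return []
    allowed = set(intent["required_slots"]) | set(intent["optional_slots"])
    return sorted(slot for slot in state["slot_values"] if slot not in allowed)


print(slots_outside_intent(EXAMPLE_SCHEMA, EXAMPLE_FRAME))
```

Whether such a slot is an annotation error or an intended carry-over is not something this check can decide; it only lists candidates for review.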

training and eval dataset setting

1. Rule 3: Participants are allowed to use any external datasets, resources or pre-trained models.
2. The FILE_RANGES:

{
    "dstc8_single_domain": {
        "train": range(1, 44),
        "dev": range(1, 8),
        "test": range(1, 8)
    },
    "dstc8_multi_domain": {
        "train": range(44, 128),
        "dev": range(8, 21),
        "test": range(8, 21)
    },
    "dstc8_all": {
        "train": range(1, 128),
        "dev": range(1, 21),
        "test": range(1, 21)
    }
}

Question: Can I use dstc8_all[train] (range(1, 128)) for training, and evaluate on dstc8_single_domain[test] or dstc8_multi_domain[test]?
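Mechanically, each range selects a set of numbered dialogue files, and the single-domain and multi-domain ranges together cover the "all" range. The sketch below makes that explicit; the file name pattern `dialogues_NNN.json` is an assumption inferred from file names mentioned elsewhere in these issues, and `dialogue_files` is a hypothetical helper. Whether a given train/test combination is permitted is a question of the challenge rules, which this code does not address.

```python
FILE_RANGES = {
    "dstc8_single_domain": {
        "train": range(1, 44), "dev": range(1, 8), "test": range(1, 8)},
    "dstc8_multi_domain": {
        "train": range(44, 128), "dev": range(8, 21), "test": range(8, 21)},
    "dstc8_all": {
        "train": range(1, 128), "dev": range(1, 21), "test": range(1, 21)},
}


def dialogue_files(dataset, split):
    """List the dialogue file names selected by a dataset/split pair."""
    # Assumed naming scheme: three-digit, zero-padded file index.
    return [f"dialogues_{i:03d}.json" for i in FILE_RANGES[dataset][split]]


train_files = dialogue_files("dstc8_all", "train")
single_test = dialogue_files("dstc8_single_domain", "test")
multi_test = dialogue_files("dstc8_multi_domain", "test")

print(len(train_files), train_files[0], train_files[-1])
# The single- and multi-domain test files partition the "all" test files.
print(sorted(single_test + multi_test) == dialogue_files("dstc8_all", "test"))
```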

No license

Hello,

Can you please provide a license for this dataset? I looked everywhere online and I couldn't find anything.

Best,
Hadrien

Couldn't reproduce results

I was able to get 0.486 Joint GA with SGD-S on the dev set stated in the original version of the paper. But the updated paper version now states 0.356 on the test set, while I'm getting only about 0.14.

Did you get 0.356 with the original checkpoint?
Any changes you made to the model/hyperparams?

Thanks.

Joint Goal evaluation

There are situations where the joint goal at turn t is predicted incorrectly, yet the joint goal at turn t+1 can still be predicted correctly. I believe that, in joint goal accuracy, all turns after the first wrong prediction should be treated as wrong predictions. I notice that the evaluation script does not do this. Is this intended?
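The difference between the two readings can be made concrete with a small sketch. It is not the official evaluation script; `joint_goal_accuracy` and its `strict` flag are hypothetical, and the input is assumed to be one boolean per user turn saying whether the full predicted state matched the ground truth.

```python
def joint_goal_accuracy(correct_per_turn, strict=False):
    """Fraction of turns counted as correct within one dialogue.

    strict=False scores every turn independently.
    strict=True counts every turn after the first wrong one as wrong.
    """
    if not correct_per_turn:
        return 0.0
    if strict:
        flags = []
        seen_error = False
        for ok in correct_per_turn:
            seen_error = seen_error or not ok
            flags.append(not seen_error)
    else:
        flags = list(correct_per_turn)
    return sum(flags) / len(flags)


# Turn 2 is wrong, but the model recovers on turns 3 and 4.
turns = [True, False, True, True]
print(joint_goal_accuracy(turns))               # independent scoring
print(joint_goal_accuracy(turns, strict=True))  # errors propagate
```

With independent scoring a model gets credit for recovering from an earlier mistake; the strict variant does not give that credit.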

Unclear state updates for Flight services

Hi!

I have found several dialogues, in which there is an unclear (moreover, inconsistent) update of slot values in the dialogue state.

  1. In dialogue train/15_00068 (service Flights_1), turn 5, the System offers a flight and provides values for two slots, outbound_departure_time and price. In turn 6, the user confirms, but only the slot outbound_departure_time gets filled in the dialogue state.

  2. In dialogue dev/1_00068 (service Flights_3), the same situation happens for turns 3-4. However, neither of the slots outbound_departure_time and price get filled.

It seems that it would be natural for both slots to get filled. What is the reasoning here?

How non-categorical slots are checked during evaluation

Hi, I have run your published evaluation scripts. When evaluating non-categorical slots, the scripts seem to use both the predicted "start" and "end" positions in the 'slots' results and the textual values in the "state" results. Why not use only the predicted textual values in the "state" predictions?

Belief state errors for categorical slots

When reviewing bad cases (model prediction vs. ground truth), I found a number of dialogues with belief state errors: 1_00105, 4_00020, 4_00000, 3_00112, 1_00029, 6_00062, 3_00115, etc., mainly in categorical slots.
Take 3_00115 for example (a fragment is shown below): the number_of_baths slot should be "1", but it is missing from the belief state.
My concern is that a certain percentage of such errors may affect the effectiveness of the model.

Can you check this?

{
        "frames": [
          {
            "slots": [
              {
                "exclusive_end": 51,
                "slot": "property_name",
                "start": 39
              },
              {
                "exclusive_end": 93,
                "slot": "address",
                "start": 70
              },
              {
                "exclusive_end": 154,
                "slot": "rent",
                "start": 148
              }
            ],
            "actions": [
              {
                "slot": "property_name",
                "act": "OFFER",
                "values": [
                  "Fremont Arms"
                ]
              },
              {
                "slot": "address",
                "act": "OFFER",
                "values": [
                  "37811 Fremont Boulevard"
                ]
              },
              {
                "slot": "number_of_beds",
                "act": "OFFER",
                "values": [
                  "1"
                ]
              },
              {
                "slot": "number_of_baths",
                "act": "OFFER",
                "values": [
                  "1"
                ]
              },
              {
                "slot": "rent",
                "act": "OFFER",
                "values": [
                  "$2,000"
                ]
              },
              {
                "slot": "count",
                "act": "INFORM_COUNT",
                "values": [
                  "7"
                ]
              }
            ],
            "service": "Homes_1"
          }
        ],
        "utterance": "I've found 7 apartments. One is called Fremont Arms and is located at 37811 Fremont Boulevard. The apartment has 1 bed room, 1 bath room, and costs $2,000 per month.",
        "speaker": "SYSTEM"
      },
      {
        "frames": [
          {
            "state": {
              "active_intent": "FindApartment",
              "slot_values": {
                "pets_allowed": [
                  "True"
                ],
                "number_of_beds": [
                  "1"
                ],
                "area": [
                  "Fremont"
                ],
                "property_name": [
                  "Fremont Arms"
                ]
              },
              "requested_slots": []
            },
            "slots": [],
            "service": "Homes_1"
          }
        ],
        "utterance": "That one sounds good.",
        "speaker": "USER"
      }

Slot values don't exactly map to utterance words

File: train/dialogues_059.json
Dialogue ID: 59_00125
Turn: 14

The slot pickup_location contains a value Hartsfield\u2013Jackson.

Utterances in this dialogue use the ASCII hyphen - for the slot value, but the annotation has the Unicode en dash \u2013.
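One possible workaround on the consumer side is to normalize dash characters before matching slot values against utterances. The sketch below assumes that replacing Unicode dashes with the ASCII hyphen is acceptable for the task; the utterance is invented for illustration, and `normalize_dashes` and `find_value` are hypothetical helpers.

```python
# Map common Unicode dash characters to the ASCII hyphen. Each replacement
# is one character long, so character offsets are preserved.
_DASH_TABLE = str.maketrans({
    "\u2010": "-",  # hyphen
    "\u2011": "-",  # non-breaking hyphen
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})


def normalize_dashes(text):
    """Replace Unicode dashes with the ASCII hyphen."""
    return text.translate(_DASH_TABLE)


def find_value(utterance, value):
    """Return (start, exclusive_end) of value in utterance, or None.

    Matching is done on dash-normalized copies of both strings.
    """
    start = normalize_dashes(utterance).find(normalize_dashes(value))
    if start < 0:
        return None
    return start, start + len(value)


# Invented utterance; the annotated value uses an en dash.
utterance = "Pick me up at Hartsfield-Jackson."
print(find_value(utterance, "Hartsfield\u2013Jackson"))
```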

Release of the backend database

With the challenge now closed, it would be great to get a copy of the database used for the API calls. This would greatly extend the utility of this dataset allowing, for example, its use for reinforcement learning (RL) of the dialogue policy.

empty utterance

Files train/dialogues_21, 28, 29, and 30.json contain dialogues with empty user utterances. Files train/dialogues_21, 26, 28, 29, 30, and 31.json contain dialogues with empty system utterances.
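
A scan along these lines could reproduce the report. It is a minimal sketch that assumes each dialogues_*.json file holds a list of dialogues with the dialogue_id, turns, speaker, and utterance fields shown elsewhere in this thread; the function names and the directory argument are hypothetical.

```python
import json
from pathlib import Path


def find_empty_utterances(dialogues):
    """Yield (dialogue_id, turn_index, speaker) for turns whose utterance is blank."""
    for dialogue in dialogues:
        for idx, turn in enumerate(dialogue["turns"]):
            if not turn.get("utterance", "").strip():
                yield dialogue["dialogue_id"], idx, turn["speaker"]


def scan_directory(data_dir):
    """Scan every dialogues_*.json file in data_dir (the path is an assumption)."""
    hits = []
    for path in sorted(Path(data_dir).glob("dialogues_*.json")):
        with open(path, encoding="utf-8") as f:
            for hit in find_empty_utterances(json.load(f)):
                hits.append((path.name, *hit))
    return hits


# Tiny made-up example in the dataset's turn format.
sample = [{
    "dialogue_id": "demo_0",
    "turns": [
        {"speaker": "USER", "utterance": ""},
        {"speaker": "SYSTEM", "utterance": "How can I help?"},
    ],
}]
print(list(find_empty_utterances(sample)))
```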

how to train on different domains?

Hi @abhirast ,

Thanks for your nice dataset and baseline model! Could you please tell me how to reproduce the single-domain experiments (e.g., Table 5 in the paper)? Setting FILE_RANGES in data_utils.py might work, but it is not very convenient.
Thanks!
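
One possible alternative to editing FILE_RANGES is to filter dialogues by their services field after loading them. The sketch below is not part of the baseline code: filter_single_domain and domain_of are hypothetical helpers, and the rule that the domain is the service name without its numeric suffix (Movies_1 belongs to Movies) is an assumption.

```python
def domain_of(service: str) -> str:
    """Assumed convention: 'Movies_1' belongs to the domain 'Movies'."""
    return service.rsplit("_", 1)[0]


def filter_single_domain(dialogues, domain):
    """Keep dialogues that involve exactly one service, from the given domain."""
    return [
        d for d in dialogues
        if len(d["services"]) == 1 and domain_of(d["services"][0]) == domain
    ]


# Made-up dialogue stubs; only the fields used by the filter are present.
dialogues = [
    {"dialogue_id": "a", "services": ["Movies_1"]},
    {"dialogue_id": "b", "services": ["Buses_2", "Travel_1"]},
    {"dialogue_id": "c", "services": ["Homes_1"]},
]
print([d["dialogue_id"] for d in filter_single_domain(dialogues, "Movies")])
```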

Testing process

Is there any update on the test phase, such as registration, rules, and submission methods?

Is there a database (DB) for learning the dialog policy?

Hi @abhirast ,

Thanks for the valuable dataset. I wonder whether there is a database (DB) that the agent can use for dialog policy learning. If not, could you please tell me how the actions (including act/slot/values) are annotated? It would be highly desirable to evaluate policy optimization on the SGD dataset if possible.

Thanks.

label error

In 12_00115, the theater name should be labelled "THE LOT City Center" rather than "dontcare".
{
"dialogue_id": "12_00115",
"services": [
"Movies_1"
],
"turns": [
{
"frames": [
{
"service": "Movies_1",
"slots": [],
"state": {
"active_intent": "FindMovies",
"requested_slots": [],
"slot_values": {}
}
}
],
"speaker": "USER",
"utterance": "I want to find a movie to watch"
},
{
"frames": [
{
"actions": [
{
"act": "REQUEST",
"slot": "location",
"values": []
}
],
"service": "Movies_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Would you prefer a certain location?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [
{
"exclusive_end": 27,
"slot": "location",
"start": 18
}
],
"state": {
"active_intent": "FindMovies",
"requested_slots": [],
"slot_values": {
"location": [
"Brentwood"
]
}
}
}
],
"speaker": "USER",
"utterance": "Can you search in Brentwood."
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "movie_name",
"values": [
"Breakthrough",
"Captain Marvel",
"Dumbo"
]
}
],
"service": "Movies_1",
"slots": [
{
"exclusive_end": 22,
"slot": "movie_name",
"start": 10
},
{
"exclusive_end": 38,
"slot": "movie_name",
"start": 24
},
{
"exclusive_end": 47,
"slot": "movie_name",
"start": 42
}
]
}
],
"speaker": "SYSTEM",
"utterance": "How about Breakthrough, Captain Marvel or Dumbo?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [
{
"exclusive_end": 42,
"slot": "location",
"start": 33
}
],
"state": {
"active_intent": "FindMovies",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "Please find some other movies in San Ramon. Any theatre is fine."
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "movie_name",
"values": [
"Hellboy",
"Little",
"Missing Link"
]
}
],
"service": "Movies_1",
"slots": [
{
"exclusive_end": 17,
"slot": "movie_name",
"start": 10
},
{
"exclusive_end": 25,
"slot": "movie_name",
"start": 19
},
{
"exclusive_end": 41,
"slot": "movie_name",
"start": 29
}
]
}
],
"speaker": "SYSTEM",
"utterance": "How about Hellboy, Little or Missing Link?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [],
"state": {
"active_intent": "FindMovies",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"show_type": [
"3d"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "I want to watch 3d show, pls find other movies"
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "movie_name",
"values": [
"Shazam!",
"Teen Spirit",
"The Curse of La Llorona"
]
},
{
"act": "INFORM_COUNT",
"slot": "count",
"values": [
"3"
]
}
],
"service": "Movies_1",
"slots": [
{
"exclusive_end": 35,
"slot": "movie_name",
"start": 28
},
{
"exclusive_end": 48,
"slot": "movie_name",
"start": 37
},
{
"exclusive_end": 75,
"slot": "movie_name",
"start": 52
}
]
}
],
"speaker": "SYSTEM",
"utterance": "I found 3 movies. How about Shazam!, Teen Spirit or The Curse of La Llorona?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [
{
"exclusive_end": 11,
"slot": "movie_name",
"start": 0
}
],
"state": {
"active_intent": "FindMovies",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"movie_name": [
"Teen Spirit"
],
"show_type": [
"3d"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "Teen Spirit is perfect."
},
{
"frames": [
{
"actions": [
{
"act": "OFFER_INTENT",
"slot": "intent",
"values": [
"BuyMovieTickets"
]
}
],
"service": "Movies_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Would you like to purchase the tickets?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [
{
"exclusive_end": 35,
"slot": "show_date",
"start": 23
}
],
"state": {
"active_intent": "GetTimesForMovie",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"movie_name": [
"Teen Spirit"
],
"show_date": [
"2nd of March"
],
"show_type": [
"3d"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "No. I want to watch on 2nd of March. What time can i watch?"
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "show_time",
"values": [
"5 pm"
]
},
{
"act": "OFFER",
"slot": "theater_name",
"values": [
"THE LOT City Center"
]
},
{
"act": "INFORM_COUNT",
"slot": "count",
"values": [
"1"
]
}
],
"service": "Movies_1",
"slots": [
{
"exclusive_end": 22,
"slot": "show_time",
"start": 18
},
{
"exclusive_end": 45,
"slot": "theater_name",
"start": 26
}
]
}
],
"speaker": "SYSTEM",
"utterance": "I found 1 show at 5 pm in THE LOT City Center."
},
{
"frames": [
{
"service": "Movies_1",
"slots": [],
"state": {
"active_intent": "GetTimesForMovie",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"movie_name": [
"Teen Spirit"
],
"show_date": [
"2nd of March"
],
"show_time": [
"5 pm"
],
"show_type": [
"3d"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "Yes. Good"
},
{
"frames": [
{
"actions": [
{
"act": "REQ_MORE",
"slot": "",
"values": []
}
],
"service": "Movies_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Can i help you with anything else?"
},
{
"frames": [
{
"service": "Movies_1",
"slots": [],
"state": {
"active_intent": "NONE",
"requested_slots": [],
"slot_values": {
"location": [
"San Ramon"
],
"movie_name": [
"Teen Spirit"
],
"show_date": [
"2nd of March"
],
"show_time": [
"5 pm"
],
"show_type": [
"3d"
],
"theater_name": [
"dontcare"
]
}
}
}
],
"speaker": "USER",
"utterance": "No. Thanks a lot"
},
{
"frames": [
{
"actions": [
{
"act": "GOODBYE",
"slot": "",
"values": []
}
],
"service": "Movies_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Have a great day"
}

Which evaluation metric is used on MultiWOZ 2.0 and MultiWOZ 2.1

According to the paper linked in the README, the baseline model obtains state-of-the-art joint goal accuracies of 0.516 on MultiWOZ 2.0 and 0.489 on MultiWOZ 2.1 test sets respectively, exceeding the best-known results of 0.486 and 0.456 on these datasets as reported in (Eric et al. 2019).

In the MultiWOZ dataset, joint accuracy checks whether all slot values exactly match the ground truth at every turn, i.e., the joint accuracy of a turn is 1.0 if and only if all slot values are predicted correctly, and 0 otherwise.
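
The strict metric described above can be sketched as follows. This illustrates exact-match joint accuracy only; it is not the evaluation code used by the paper or by the MultiWOZ scripts, and the function name, the flat slot-to-value dict format, and the slot names in the example are assumptions.

```python
def joint_goal_accuracy(predictions, references):
    """Fraction of turns whose predicted state equals the reference exactly.

    Each state is a dict mapping slot name -> value. Comparison is strict
    equality, with no fuzzy matching.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align turn by turn")
    if not references:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Two made-up turns: the second prediction gets one slot wrong,
# so the whole turn counts as incorrect.
references = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
predictions = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "5"},
]
print(joint_goal_accuracy(predictions, references))
```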

However, the evaluation metric (including fuzzy matching score) provided by SGD is quite different from that used on the MultiWOZ dataset.

Which evaluation metric did the paper or the baseline model use to evaluate on the MultiWOZ dataset?

Are the service labels in frames available for multi-domain test set?

According to the README here, the service labels in frames are available at test time, but the task proposal document says that models need to select the relevant APIs for multi-domain dialogues. Are the service (API) labels in frames available for the multi-domain test set?

Fatal Python error: _PySys_BeginInit: <stdin> is a directory, cannot continue

I wanted to test the baseline model SG-DST but when I tried to train the model using this command:

python -m schema_guided_dst.baseline.train_and_predict \
  --bert_ckpt_dir <downloaded_bert_ckpt_dir> \
  --dstc8_data_dir <downloaded_dstc8_data_dir> \
  --dialogues_example_dir <output_example_dir> \
  --schema_embedding_dir <output_schema_embedding_dir> \
  --output_dir <output_ckpt_dir> --dataset_split train --run_mode train \
  --task_name dstc8_single_domain

Where:
downloaded_bert_ckpt_dir is cased_L-12_H-768_A-12
and downloaded_dstc8_data_dir is the train directory containing the dialogue and schema JSON files.

I got this error:
Fatal Python error: _PySys_BeginInit: <stdin> is a directory, cannot continue
Has anyone else run into the same problem?

slots belief state error

In the following dialogue, the origin and destination cities are swapped in the annotation.
"dialogue_id": "89_00000",
"services": [
"Buses_2",
"Travel_1"
],
"turns": [
{
"frames": [
{
"service": "Buses_2",
"slots": [],
"state": {
"active_intent": "FindBus",
"requested_slots": [],
"slot_values": {}
}
}
],
"speaker": "USER",
"utterance": "Please help me find a bus."
},
{
"frames": [
{
"actions": [
{
"act": "REQUEST",
"slot": "destination",
"values": []
},
{
"act": "REQUEST",
"slot": "origin",
"values": []
},
{
"act": "REQUEST",
"slot": "departure_date",
"values": []
}
],
"service": "Buses_2",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Sure, no problem. I'll need the following information: Your departure date, as well as your origin and destination cities."
},
{
"frames": [
{
"service": "Buses_2",
"slots": [
{
"exclusive_end": 32,
"slot": "destination",
"start": 24
},
{
"exclusive_end": 41,
"slot": "origin",
"start": 36
},
{
"exclusive_end": 57,
"slot": "departure_date",
"start": 45
}
],
"state": {
"active_intent": "FindBus",
"requested_slots": [],
"slot_values": {
"departure_date": [
"6th of March"
],
"destination": [
"San Fran"
],
"origin": [
"Vegas"
]
}
}
}
],
"speaker": "USER",
"utterance": "I'd like to travel from San Fran to Vegas on 6th of March."
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "departure_time",
"values": [
"10:50 am"
]
},
{
"act": "OFFER",
"slot": "price",
"values": [
"$50"
]
},
{
"act": "OFFER",
"slot": "fare_type",
"values": [
"Economy"
]
},
{
"act": "INFORM_COUNT",
"slot": "count",
"values": [
"6"
]
}
],
"service": "Buses_2",
"slots": [
{
"exclusive_end": 65,
"slot": "departure_time",
"start": 57
},
{
"exclusive_end": 76,
"slot": "price",
"start": 73
}
]
}
],
"speaker": "SYSTEM",
"utterance": "I've found 6 buses that might interest you. How does the 10:50 am bus at $50 per Economy ticket sound?"
},
{
"frames": [
{
"service": "Buses_2",
"slots": [],
"state": {
"active_intent": "BuyBusTicket",
"requested_slots": [],
"slot_values": {
"departure_date": [
"6th of March"
],
"departure_time": [
"10:50 am"
],
"destination": [
"San Fran"
],
"fare_type": [
"Economy"
],
"group_size": [
"2"
],
"origin": [
"Vegas"
]
}
}
}
],
"speaker": "USER",
"utterance": "I like that. Please get me 2 tickets on that bus."
},
{
"frames": [
{
"actions": [
{
"act": "CONFIRM",
"slot": "origin",
"values": [
"Las Vegas"
]
},
{
"act": "CONFIRM",
"slot": "destination",
"values": [
"San Francisco"
]
},
{
"act": "CONFIRM",
"slot": "departure_date",
"values": [
"March 6th"
]
},
{
"act": "CONFIRM",
"slot": "departure_time",
"values": [
"10:50 am"
]
},
{
"act": "CONFIRM",
"slot": "group_size",
"values": [
"2"
]
},
{
"act": "CONFIRM",
"slot": "fare_type",
"values": [
"Economy"
]
}
],
"service": "Buses_2",
"slots": [
{
"exclusive_end": 101,
"slot": "origin",
"start": 92
},
{
"exclusive_end": 118,
"slot": "destination",
"start": 105
},
{
"exclusive_end": 72,
"slot": "departure_date",
"start": 63
},
{
"exclusive_end": 82,
"slot": "departure_time",
"start": 74
}
]
}
],
"speaker": "SYSTEM",
"utterance": "Sure. Please confirm that you'd like 2 Economy tickets for the March 6th, 10:50 am bus from Las Vegas to San Francisco."
},
{
"frames": [
{
"service": "Buses_2",
"slots": [],
"state": {
"active_intent": "BuyBusTicket",
"requested_slots": [],
"slot_values": {
"departure_date": [
"6th of March",
"March 6th"
],
"departure_time": [
"10:50 am"
],
"destination": [
"San Fran",
"San Francisco"
],
"fare_type": [
"Economy"
],
"group_size": [
"1"
],
"origin": [
"Las Vegas",
"Vegas"
]
}
}
}
],
"speaker": "USER",
"utterance": "No, my travel partner has just cancelled. Please make that 1 ticket instead."
},
{
"frames": [
{
"actions": [
{
"act": "CONFIRM",
"slot": "departure_time",
"values": [
"10:50 am"
]
},
{
"act": "CONFIRM",
"slot": "group_size",
"values": [
"1"
]
}
],
"service": "Buses_2",
"slots": [
{
"exclusive_end": 89,
"slot": "departure_time",
"start": 81
}
]
}
],
"speaker": "SYSTEM",
"utterance": "Okay, no problem. Please confirm that you'd like 1 ticket for the bus leaving at 10:50 am."
},
{
"frames": [
{
"service": "Buses_2",
"slots": [],
"state": {
"active_intent": "BuyBusTicket",
"requested_slots": [],
"slot_values": {
"departure_date": [
"6th of March",
"March 6th"
],
"departure_time": [
"10:50 am"
],
"destination": [
"San Fran",
"San Francisco"
],
"fare_type": [
"Economy"
],
"group_size": [
"1"
],
"origin": [
"Las Vegas",
"Vegas"
]
}
}
}
],
"speaker": "USER",
"utterance": "That is correct."
},
{
"frames": [
{
"actions": [
{
"act": "NOTIFY_SUCCESS",
"slot": "",
"values": []
}
],
"service": "Buses_2",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "I have successfully confirmed your ticket."
},
{
"frames": [
{
"service": "Buses_2",
"slots": [],
"state": {
"active_intent": "BuyBusTicket",
"requested_slots": [],
"slot_values": {
"departure_date": [
"6th of March",
"March 6th"
],
"departure_time": [
"10:50 am"
],
"destination": [
"San Fran",
"San Francisco"
],
"fare_type": [
"Economy"
],
"group_size": [
"1"
],
"origin": [
"Las Vegas",
"Vegas"
]
}
}
},
{
"service": "Travel_1",
"slots": [],
"state": {
"active_intent": "FindAttractions",
"requested_slots": [],
"slot_values": {
"category": [
"dontcare"
],
"good_for_kids": [
"True"
],
"location": [
"San Fran",
"San Francisco"
]
}
}
}
],
"speaker": "USER",
"utterance": "Thank you very much. You've been very helpful. Since my travel partner cancelled, I'd like to take my nephew living in Vegas to visit some attractions. Any child-friendly attractions you can recommend?"
},
{
"frames": [
{
"actions": [
{
"act": "OFFER",
"slot": "attraction_name",
"values": [
"Alcatraz Island"
]
},
{
"act": "OFFER",
"slot": "category",
"values": [
"Historical Landmark"
]
}
],
"service": "Travel_1",
"slots": [
{
"exclusive_end": 29,
"slot": "attraction_name",
"start": 14
}
]
}
],
"speaker": "SYSTEM",
"utterance": "May I suggest Alcatraz Island, a Historical Landmark?"
},
{
"frames": [
{
"service": "Travel_1",
"slots": [],
"state": {
"active_intent": "FindAttractions",
"requested_slots": [],
"slot_values": {
"attraction_name": [
"Alcatraz Island"
],
"category": [
"dontcare"
],
"good_for_kids": [
"True"
],
"location": [
"San Fran",
"San Francisco"
]
}
}
}
],
"speaker": "USER",
"utterance": "Okay, cool. Thanks."
},
{
"frames": [
{
"actions": [
{
"act": "REQ_MORE",
"slot": "",
"values": []
}
],
"service": "Travel_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Would you like my help with anything else?"
},
{
"frames": [
{
"service": "Travel_1",
"slots": [],
"state": {
"active_intent": "NONE",
"requested_slots": [],
"slot_values": {
"attraction_name": [
"Alcatraz Island"
],
"category": [
"dontcare"
],
"good_for_kids": [
"True"
],
"location": [
"San Fran",
"San Francisco"
]
}
}
}
],
"speaker": "USER",
"utterance": "No. Thank you very much. You've been very helpful."
},
{
"frames": [
{
"actions": [
{
"act": "GOODBYE",
"slot": "",
"values": []
}
],
"service": "Travel_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "Enjoy yourselves!"
}
]
},
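
A rough way to look for similar cases is to compare each slot span with the word that precedes it in the utterance. The heuristic below is an assumption made for illustration, not a validated check: it flags a destination span that directly follows "from" and an origin span that directly follows "to". The function name is hypothetical, and the turn is copied from the dialogue above.

```python
def suspicious_direction_spans(turn):
    """Flag spans where the preceding word contradicts the slot name.

    Heuristic assumption: a 'destination' span right after 'from ' or an
    'origin' span right after 'to ' is likely mislabelled.
    """
    cues = {"destination": "from ", "origin": "to "}
    utterance = turn["utterance"]
    flagged = []
    for frame in turn["frames"]:
        for span in frame.get("slots", []):
            cue = cues.get(span["slot"])
            prefix = utterance[:span["start"]].lower()
            if cue and prefix.endswith(cue):
                flagged.append(
                    (span["slot"], utterance[span["start"]:span["exclusive_end"]]))
    return flagged


turn = {
    "utterance": "I'd like to travel from San Fran to Vegas on 6th of March.",
    "frames": [{
        "service": "Buses_2",
        "slots": [
            {"exclusive_end": 32, "slot": "destination", "start": 24},
            {"exclusive_end": 41, "slot": "origin", "start": 36},
            {"exclusive_end": 57, "slot": "departure_date", "start": 45},
        ],
    }],
}
print(suspicious_direction_spans(turn))
```

Hits from a check like this still need manual review, since phrases such as "I want to go to X from Y" are not covered by the two cues.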

Explanation about user acts

Please update the README. The REQUEST act for the user may also carry optional values, just like the system REQUEST act. Here is an example from dev/dialogues_018.json:

{
        "frames": [
          {
            "actions": [
              {
                "act": "REQUEST",
                "canonical_values": [
                  "2016"
                ],
                "slot": "year",
                "values": [
                  "2016"
                ]
              },
              {
                "act": "REQUEST",
                "canonical_values": [],
                "slot": "genre",
                "values": []
              }
            ],
            "service": "Music_1",
            "slots": [],
            "state": {
              "active_intent": "LookupSong",
              "requested_slots": [
                "genre",
                "year"
              ],
              "slot_values": {}
            }
          }
        ],
        "speaker": "USER",
        "utterance": "What genre is it and was it released three years back?"
      }
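
To see how often this happens, user turns can be scanned for REQUEST actions that carry values. This is a minimal sketch based on the field names in the example above; the function name and the dialogue_id in the sample are made up.

```python
def user_requests_with_values(dialogues):
    """Yield (dialogue_id, turn_index, slot, values) for user REQUEST acts with values."""
    for dialogue in dialogues:
        for idx, turn in enumerate(dialogue["turns"]):
            if turn["speaker"] != "USER":
                continue
            for frame in turn["frames"]:
                for action in frame.get("actions", []):
                    if action["act"] == "REQUEST" and action.get("values"):
                        yield (dialogue["dialogue_id"], idx,
                               action["slot"], action["values"])


# Abbreviated copy of the turn above, wrapped in a made-up dialogue.
sample = [{
    "dialogue_id": "demo_0",
    "turns": [{
        "speaker": "USER",
        "utterance": "What genre is it and was it released three years back?",
        "frames": [{
            "service": "Music_1",
            "actions": [
                {"act": "REQUEST", "slot": "year", "values": ["2016"]},
                {"act": "REQUEST", "slot": "genre", "values": []},
            ],
        }],
    }],
}]
print(list(user_requests_with_values(sample)))
```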

Release of actual database data for each schema

As described in the paper, a SQL backend with a subset of real-world values is used for the dataset. Is there any plan to release the database? That information would be useful for data synthesis approaches.
(I can extract rows from the dialogue training data, but the result might not be complete.)

Thanks.

"intent" as a slot

Hi team,
While working with SGD-X, I noticed that in the test set of the original SGD, the service "Homes_2" has a slot called "intent", whose purpose is to express whether the user wants to buy or rent a home. However, the current SGD-X generation code treats any slot named "intent" as an intent, so in V1 to V5 the slot cases are missed entirely.
I would love to open a PR, but I am not currently a Google contributor, so I am asking whether you could make a commit to fix this bug.

Unseen labels in dev dataset (which does not exist in train dataset)

Some slot and intent labels in the dev set never occur in the train set, so a trained model has to predict labels it has never seen.

Here is the list of unseen labels in the dev set:

Unseen intent labels:
{'GetAlarms', 'RentMovie', 'FindAttractions', 'GetWeather', 'AddAlarm'}

Unseen slot labels:
{'director', 'recipient_name', 'starring', 'actors', 'stay_length', 'new_alarm_name', 'transfer_amount', 'new_alarm_time'}
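
Lists like these can be derived from the two schema files with a set difference. The sketch assumes each schema is a parsed list of services with intents and slots entries that each have a name field; the function names and the abbreviated schemas in the example are made up.

```python
def collect_labels(schema):
    """Return (intent_names, slot_names) from a parsed schema list."""
    intents, slots = set(), set()
    for service in schema:
        intents.update(intent["name"] for intent in service["intents"])
        slots.update(slot["name"] for slot in service["slots"])
    return intents, slots


def unseen_labels(train_schema, dev_schema):
    """Return the intents and slots that appear in dev but not in train."""
    train_intents, train_slots = collect_labels(train_schema)
    dev_intents, dev_slots = collect_labels(dev_schema)
    return dev_intents - train_intents, dev_slots - train_slots


# Abbreviated, made-up schemas for illustration.
train_schema = [{
    "service_name": "Hotels_1",
    "intents": [{"name": "SearchHotel"}],
    "slots": [{"name": "destination"}],
}]
dev_schema = train_schema + [{
    "service_name": "Alarm_1",
    "intents": [{"name": "GetAlarms"}],
    "slots": [{"name": "new_alarm_name"}],
}]
print(unseen_labels(train_schema, dev_schema))
```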

Is this eval result normal?

I just ran the evaluation script and got this result.

{
  "15_00000": {
    "dialogue_id": "15_00000",
    "services": [
      "RentalCars_1",
      "Buses_1",
      "Flights_3"
    ],
    "turns": [
      {
        "frames": [
          {
            "service": "RentalCars_1",
            "slots": [
              {
                "exclusive_end": 78,
                "slot": "pickup_date",
                "start": 68
              },
              {
                "exclusive_end": 53,
                "slot": "pickup_city",
                "start": 40
              },
              {
                "exclusive_end": 36,
                "slot": "dropoff_date",
                "start": 26
              }
            ],
            "state": {
              "active_intent": "GetCarsAvailable",
              "requested_slots": [],
              "slot_values": {
                "dropoff_date": "march 12th",
                "pickup_city": "san francisco",
                "pickup_date": "march 11th"
              }
            },
            "metrics": {
              "active_intent_accuracy": 1.0,
              "requested_slots_f1": 1.0,
              "requested_slots_precision": 1.0,
              "requested_slots_recall": 1.0,
              "slot_tagging_f1": 1.0,
              "slot_tagging_precision": 1.0,
              "slot_tagging_recall": 1.0,
              "average_goal_accuracy": 0.16666666666666666,
              "average_cat_accuracy": "NA",
              "average_noncat_accuracy": 0.16666666666666666,
              "joint_goal_accuracy": 0.004536,
              "joint_cat_accuracy": 1.0,
              "joint_noncat_accuracy": 0.004536
            }
          }
        ],
        "speaker": "USER",
        "utterance": "I need a rental car until March 12th in San Francisco. I want it on March 11th."
      }
    ]
  }
}

All slot information matches turn 1 of 15_00000 in the dev data, and the slot tagging F1 is 1.0, yet joint_noncat_accuracy is only 0.004536. Is this result normal?
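
One detail visible in the output above is that the slot_values entries are plain strings, whereas the annotated dialogues quoted elsewhere in this thread store each slot's values as a list of strings. Whether that difference accounts for the low score has not been verified here. The sketch below only checks the format; the function name is hypothetical and the sample is an abbreviated copy of the posted turn.

```python
def non_list_slot_values(dialogue):
    """Return (turn_index, service, slot) triples whose value is not a list."""
    problems = []
    for idx, turn in enumerate(dialogue["turns"]):
        for frame in turn["frames"]:
            state = frame.get("state", {})
            for slot, value in state.get("slot_values", {}).items():
                if not isinstance(value, list):
                    problems.append((idx, frame["service"], slot))
    return problems


predicted = {
    "dialogue_id": "15_00000",
    "turns": [{
        "speaker": "USER",
        "frames": [{
            "service": "RentalCars_1",
            "state": {
                "active_intent": "GetCarsAvailable",
                "requested_slots": [],
                "slot_values": {
                    "dropoff_date": "march 12th",
                    "pickup_city": "san francisco",
                    "pickup_date": "march 11th",
                },
            },
        }],
    }],
}
print(non_list_slot_values(predicted))
```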

Is there code for processing Taskmaster?

The paper mentions using Taskmaster as another dataset for pretraining. Is there code for preprocessing Taskmaster for training as well? I think there's only code for processing SGD.

Utterance typos

Many utterance substrings (on more than 1000 lines) match the pattern "[a-z0-9].[A-Z]", e.g.:

"No.Now I don't want.Thank you.That's all." (typos)

"I have 9 hotels for you.First with rating 4.4 is Anaheim Rv Park.Tell me your opinion." (typos)

"Arc The.Hotel" (this is a hotel name, so this should be the correct format)

Is this normal?
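
The search can be reproduced with a short regular expression. In this sketch the dot is escaped, on the assumption that the reported pattern means a literal period; the function names are hypothetical, and the repair step would also alter legitimate names such as "Arc The.Hotel", so hits need manual review.

```python
import re

# Assumption: the dot in the reported pattern is meant literally, so it is escaped.
MISSING_SPACE = re.compile(r"[a-z0-9]\.[A-Z]")


def find_missing_spaces(utterance: str):
    """Return the positions of periods that are directly followed by a capital."""
    return [m.start() + 1 for m in MISSING_SPACE.finditer(utterance)]


def add_missing_spaces(utterance: str) -> str:
    """Insert a space after such periods; proper names are changed too."""
    return re.sub(r"([a-z0-9]\.)(?=[A-Z])", r"\1 ", utterance)


text = "No.Now I don't want.Thank you.That's all."
print(find_missing_spaces(text))
print(add_missing_spaces(text))
```

A decimal such as "4.4" is left alone because the character after the period is a digit, not a capital letter.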

errors of annotation

Hi,
We found annotation errors in "83_00069" in the training set:
{
"frames": [
{
"service": "Restaurants_1",
"slots": [],
"state": {
"active_intent": "ReserveRestaurant",
"requested_slots": [
"serves_alcohol"
],
"slot_values": {
"city": [
"san Francisco",
"san francisco",
"sfo"
],
"cuisine": [
"szcheuan"
],
"date": [
"march 6th",
"the 6th"
],
"party_size": [
"1"
],
"restaurant_name": [
"bamboo restaurant"
],
"time": [
"6:30 in the evening",
"6:30 pm"
]
}
}
}
],
"speaker": "USER",
"utterance": "yes that works"
},
{
"frames": [
{
"actions": [
{
"act": "INFORM",
"slot": "serves_alcohol",
"values": [
"True"
]
},
{
"act": "NOTIFY_SUCCESS",
"slot": "",
"values": []
}
],
"service": "Restaurants_1",
"slots": []
}
],
"speaker": "SYSTEM",
"utterance": "it was succesful"
},
We think "requested_slots" should be "[]", and consequently there should be no action {"act": "INFORM", "slot": "serves_alcohol", "values": ["True"]}.
Thank you for the dataset.

Overlapping APIs between train and dev set

Hi, I noticed that some APIs appear in both the train and dev sets, such as Hotels_1, Music_1, and Events_1. Does this conflict with the zero-shot setting (scalability to unseen APIs)?

bug in sgd-x transformation

In the original SGD, the Homes service has a slot called intent, but generate_sgdx_dialogues.py does not handle it correctly:

for action in frame.get('actions', []):
  # Replace values if slot is intent.
  if 'slot' in action:
    if action['slot'] == 'intent':
      utils.replace_list_elements_with_mapping(
          action.get('canonical_values', []), intent_to_name)
      utils.replace_list_elements_with_mapping(
          action.get('values', []), intent_to_name)
    else:
      utils.replace_dict_value_with_mapping(action, 'slot',
                                            slot_to_name)

This means that the intent slot is not transformed for Homes (e.g., to purpose in sgd-v1).
I think this can be fixed by changing if action['slot'] == 'intent': to if action['slot'] == 'intent' and action['act'] in ['OFFER_INTENT', 'INFORM_INTENT']:
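
The proposed condition can be tried out in isolation. This is a self-contained sketch, not the repository's code: the helper below is a simplified stand-in for the utils function of the same name, rename_action is a hypothetical wrapper, the slot rename is written inline, and the mappings in the example are made up apart from the intent-to-purpose rename mentioned above.

```python
INTENT_ACTS = ("OFFER_INTENT", "INFORM_INTENT")


def replace_list_elements_with_mapping(values, mapping):
    """Simplified stand-in for the utils helper: replace list items in place."""
    for i, value in enumerate(values):
        values[i] = mapping.get(value, value)


def rename_action(action, intent_to_name, slot_to_name):
    """Rename intent values only for intent acts; otherwise rename the slot."""
    if "slot" not in action:
        return
    if action["slot"] == "intent" and action["act"] in INTENT_ACTS:
        replace_list_elements_with_mapping(
            action.get("canonical_values", []), intent_to_name)
        replace_list_elements_with_mapping(
            action.get("values", []), intent_to_name)
    else:
        action["slot"] = slot_to_name.get(action["slot"], action["slot"])


intent_to_name = {"BuyMovieTickets": "PurchaseTickets"}  # made-up mapping
slot_to_name = {"intent": "purpose"}

offer = {"act": "OFFER_INTENT", "slot": "intent", "values": ["BuyMovieTickets"]}
inform = {"act": "INFORM", "slot": "intent", "values": ["rent"]}
for action in (offer, inform):
    rename_action(action, intent_to_name, slot_to_name)
print(offer)
print(inform)
```

With the extra act check, the OFFER_INTENT action keeps its slot and gets renamed intent values, while the Homes-style INFORM action has its slot renamed.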

Non-categorical slot with possible values

Hi!

In the train split, the service Restaurants_1 has a slot "cuisine", which is not categorical and at the same time has a list of possible values:

{
  "name": "cuisine",
  "description": "Cuisine of food served in the restaurant",
  "is_categorical": false,
  "possible_values": [
    "Mexican",
    "Chinese",
    "Indian",
    "American",
    "Italian"
  ]
}

I believe this should be fixed.
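
Other slots with the same combination can be listed with a short scan over a schema file. The sketch assumes the schema is a parsed list of services, each with a service_name and a slots list in the format shown above; the function name is hypothetical and the second slot in the example is made up.

```python
def noncategorical_with_values(schema):
    """Yield (service_name, slot_name) for non-categorical slots that list possible values."""
    for service in schema:
        for slot in service["slots"]:
            if not slot["is_categorical"] and slot.get("possible_values"):
                yield service["service_name"], slot["name"]


schema = [{
    "service_name": "Restaurants_1",
    "slots": [
        {
            "name": "cuisine",
            "is_categorical": False,
            "possible_values": ["Mexican", "Chinese", "Indian", "American", "Italian"],
        },
        # Made-up slot without listed values, for contrast.
        {"name": "city", "is_categorical": False, "possible_values": []},
    ],
}]
print(list(noncategorical_with_values(schema)))
```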
