monologg / jointbert

Pytorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"

License: Apache License 2.0

Python 100.00%

bert transformers slot-filling pytorch intent-classification slu joint-bert

jointbert's Introduction

🚀 Things I do

NLP Engineer, contributing on Korean NLP with Open Source!

jointbert's Issues

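How do I handle an English word being split into multiple pieces so that the tokens no longer line up with the label sequence?

In a sequence labeling task, a single word can be split into several subword tokens, which makes the input token sequence longer than the label sequence. How can this be resolved?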
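One common way to deal with this is to give the real label only to the first subword of each word and mark the remaining pieces with an id the loss ignores. The sketch below illustrates that idea under stated assumptions: `toy_tokenize` is a hypothetical stand-in for a WordPiece tokenizer, and `PAD_LABEL_ID = -100` is an assumed value chosen to match the default `ignore_index` of PyTorch's `CrossEntropyLoss`. It is not taken from this repository's code.

```python
from typing import Callable, List, Tuple

# Assumption: -100 is the default ignore_index of torch.nn.CrossEntropyLoss.
PAD_LABEL_ID = -100


def align_labels(
    words: List[str],
    labels: List[int],
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[int]]:
    """Expand word-level labels so they line up with subword tokens."""
    tokens: List[str] = []
    label_ids: List[int] = []
    for word, label in zip(words, labels):
        pieces = tokenize(word) or ["[UNK]"]
        tokens.extend(pieces)
        # Only the first piece keeps the real label; the rest are ignored by the loss.
        label_ids.extend([label] + [PAD_LABEL_ID] * (len(pieces) - 1))
    return tokens, label_ids


def toy_tokenize(word: str) -> List[str]:
    """Hypothetical tokenizer: splits any word longer than 4 characters in two."""
    if len(word) <= 4:
        return [word]
    return [word[:4], "##" + word[4:]]


tokens, label_ids = align_labels(["play", "radiohead"], [0, 5], toy_tokenize)
print(tokens)     # ['play', 'radi', '##ohead']
print(label_ids)  # [0, 5, -100]
```

After this step the token sequence and the label sequence have the same length, and at prediction time only the positions whose label id is not the ignored value need to be read back.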

Results on SNIPS

Hello! I can't replicate your results on SNIPS when using BERT + CRF. Is the baseline bert-base-uncased, or are there any tricks in the training implementation?

"JointBERT" vs. "BertForMaskedLM" in model config json file

Hi,

This is a great repo! I was using your code to train BERT on my own data, but I noticed that the architecture field in the model config JSON file changed. The code and data are exactly the same, but the architecture changed, which is weird.

Please see the attached screenshots.
Do you know why this changed over time?
Thanks!

Issue using shared softmax layer.

There is an issue with using a shared softmax layer: the paper mentions multiplying the slot-filling softmax outputs.

JointBERT/model.py

Line 52 in d6e06f2

slot_logits = self.slot_classifier(sequence_output)

The original paper code:
https://github.com/MahmoudWahdan/dialog-nlu/blob/7e34ebb3c370abe1044464b9a15c0b445a0a2384/models/joint_bert.py#L65

Is it possible to map a slot to a lookup of different values?

I am facing an issue because my slot has a large number of values. I am looking for a way to map the slot to a lookup so that we need not list all the categories in the slot during training.

If there is a way, please let me know. Thanks in advance.

Any plans to publish the model?

Hello @monologg!

Thanks for releasing this!
Are there any plans to release the trained model on https://huggingface.co/?

Are the results in the tables from the dev set?

I ran your models but cannot reproduce your results on the test set.

Learning objective

Hi @monologg! Just a theoretical question about what the BERT for Joint Intent Classification and Slot Filling publication says here:

The learning objective is to maximize the conditional probability p(y^i, y^s|x). The model is finetuned end-to-end via minimizing the cross-entropy loss.

If I understand correctly, this is not to sum the intent and slot losses as you have in your models (total_loss = intent_loss + self.args.slot_loss_coef * slot_loss). If that part of the paper is correct, you should first multiply the probabilities calculated from both logits and then use the CrossEntropyLoss over these probabilities.
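A quick numerical check may help here. It assumes the joint probability factorizes as p(y^i|x) times the product of the per-token slot probabilities; that factorization is an assumption of this sketch and is not stated in the quote above. Under it, taking the negative log of the multiplied probabilities gives the same number as summing the per-task cross-entropy terms. All logits below are made-up values.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


# Hypothetical logits for one utterance: 3 intents, 2 tokens with 4 slot labels each.
intent_logits = [2.0, 0.5, -1.0]
slot_logits = [[0.1, 1.2, -0.3, 0.0], [1.5, -0.5, 0.2, 0.3]]
intent_target, slot_targets = 0, [1, 0]

# Route 1: sum the per-task cross-entropy losses.
summed_loss = cross_entropy(intent_logits, intent_target) + sum(
    cross_entropy(logits, target) for logits, target in zip(slot_logits, slot_targets)
)

# Route 2: multiply the probabilities first, then take the negative log.
joint_prob = softmax(intent_logits)[intent_target]
for logits, target in zip(slot_logits, slot_targets):
    joint_prob *= softmax(logits)[target]
joint_loss = -math.log(joint_prob)

print(abs(summed_loss - joint_loss) < 1e-9)  # True
```

The equality is exact only when the slot term is summed over tokens with a coefficient of 1. With mean reduction over tokens, or a `slot_loss_coef` other than 1, the slot term is rescaled, so the two routes then differ by that weighting rather than in kind.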

Weights of JointBERT not initialized from pretrained model: "camembert-base"

I would like to use "camembert-base" instead of "bert-base-uncased" to train my model, but the weights of the pretrained "camembert-base" model are not used to initialize the JointBERT model:

Do you have any idea how I can solve this problem?

How to label your own data set in doccano and use this code

Does JointBERT save the model that achieves the best results on dev?

I would like to know whether JointBERT saves the model that achieves the best results on the dev set.

Issue in training

It looks like I am able to train, but the model is not being saved, so evaluation cannot run.
Training command:

python3 main.py --task atis --model_type albert --model_dir atis_out --do_train --do_eval

model_dir is empty after training, so the model cannot be found in it.

03/20/2020 21:35:14 - INFO - trainer -   ***** Running training *****
03/20/2020 21:35:14 - INFO - trainer -     Num examples = 4478
03/20/2020 21:35:14 - INFO - trainer -     Num Epochs = 1
03/20/2020 21:35:14 - INFO - trainer -     Total train batch size = 64
03/20/2020 21:35:14 - INFO - trainer -     Gradient Accumulation steps = 1
03/20/2020 21:35:14 - INFO - trainer -     Total optimization steps = 70
Iteration: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 70/70 [00:51<00:00,  1.36it/s]
Epoch: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:51<00:00, 51.52s/it]
Traceback (most recent call last):
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 221, in get_config_dict
    resume_download=resume_download,
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/file_utils.py", line 245, in cached_path
    raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file atis_out/config.json not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 228, in load_model
    self.bert_config = self.config_class.from_pretrained(self.args.model_dir)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 176, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 241, in get_config_dict
    raise EnvironmentError(msg)
OSError: Model name 'atis_out' was not found in model name list. We assumed 'atis_out/config.json' was a path, a model identifier, or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 81, in <module>
    main(args)
  File "main.py", line 22, in main
    trainer.load_model()
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 236, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...

Finetuning a trained JointBERT model on other dataset

Hi,
Is there any way to take a trained JointBERT model on SNIPS and finetune it for some other task similar to snips with 7 classes? How can that be implemented in the present code?

Target out of bounds

My dataset has 3 intents, and I am now getting this error:

ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 12 is out of bounds.

Any help is appreciated.
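This error is raised when a target id is not smaller than the number of classes the classifier outputs, so one thing worth checking is whether every label id in the data fits the label list the model was built with. The helper below is a hypothetical sketch for that check (the function name and the `ignore_index` default are assumptions, not part of this repository). A mismatch could come from, for example, a label file or cached features that belong to a different dataset.

```python
from typing import Iterable, List


def out_of_range_labels(
    label_ids: Iterable[int], num_labels: int, ignore_index: int = -100
) -> List[int]:
    """Return the sorted label ids that a classifier with num_labels outputs cannot represent."""
    return sorted(
        {i for i in label_ids if i != ignore_index and not 0 <= i < num_labels}
    )


# Hypothetical intent ids found in a dataset whose model was built with 3 intents.
print(out_of_range_labels([0, 2, 12, 1, 12], num_labels=3))  # [12]
```

An empty result means every id is representable; any id that shows up points at the examples or label file to inspect.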

How to calculate sent?

I have tried hard, but I still don't know how to calculate it. My accuracy is low, which is discouraging.
I hope you can help me, thanks.

Exception: Some model files might be missing...

Hi,
I got this error when training this model on ATIS: python3 main.py --task atis --num_train_epochs 10 --model_type bert --model_dir atis_model --do_train --do_eval --use_crf

Do you have any idea how I can solve this problem?

INFO - transformers.modeling_utils - loading weights file atis_model/pytorch_model.bin
Traceback (most recent call last):
  File "tools/git/JointBERT/trainer.py", line 233, in load_model
    self.model = self.model_class.from_pretrained(self.args.model_dir)
  File "miniconda2/envs/condpy36/lib/python3.6/site-packages/transformers/modeling_utils.py", line 512, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() missing 3 required positional arguments: 'args', 'intent_label_lst', and 'slot_label_lst'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 71, in <module>
    main(args)
  File "main.py", line 22, in main
    trainer.load_model()
  File "tools/git/JointBERT/trainer.py", line 237, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...

Not getting performance improvement for multitasking of intent classification and slot filling

When the model is trained on intent detection and slot filling separately, performance is higher than when it is trained on both tasks together. In other words, multitask training gives no performance gain. Any comments or help on how to get a gain over the single-task models would be highly appreciated.

Is something wrong with loss?

I don't know if you've ever encountered this in your training: the loss increased but all other metrics decreased. It's weird, and I don't know if it affects convergence.

KeyError during training

Hi, thanks for sharing this code repository! I just have a slight issue with KeyErrors during model training using custom data:

06/22/2020 22:16:45 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/msdfamily/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_train_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_dev_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_test_snips_bert-base-uncased_50
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/msdfamily/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - Model config BertConfig {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"finetuning_task": "snips",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}

06/22/2020 22:16:46 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/bert-base-uncased-pytorch_model.bin from cache at /home/msdfamily/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights of JointBERT not initialized from pretrained model: ['intent_classifier.linear.weight', 'intent_classifier.linear.bias', 'slot_classifier.linear.weight', 'slot_classifier.linear.bias']
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights from pretrained model not used in JointBERT: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
06/22/2020 22:16:50 - INFO - trainer - ***** Running training *****
06/22/2020 22:16:50 - INFO - trainer - Num examples = 13084
06/22/2020 22:16:50 - INFO - trainer - Num Epochs = 10
06/22/2020 22:16:50 - INFO - trainer - Total train batch size = 32
06/22/2020 22:16:50 - INFO - trainer - Gradient Accumulation steps = 1
06/22/2020 22:16:50 - INFO - trainer - Total optimization steps = 4090
06/22/2020 22:16:50 - INFO - trainer - Logging steps = 200
06/22/2020 22:16:50 - INFO - trainer - Save steps = 200
Epoch: 0%| | 0/10 [00:00<?, ?it/s/pytorch/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of add_ is deprecated: | 0/409 [00:00<?, ?it/s]
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha)
06/22/2020 22:21:18 - INFO - trainer - ***** Running evaluation on dev dataset ***** | 199/409 [04:27<05:33, 1.59s/it]
06/22/2020 22:21:18 - INFO - trainer - Num examples = 700
06/22/2020 22:21:18 - INFO - trainer - Batch size = 64
Evaluating: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:09<00:00, 1.14it/s]
Iteration: 49%|██████████████████████████████████████████████████████████████▊ | 199/409 [04:38<04:53, 1.40s/it]
Epoch: 0%| | 0/10 [04:38<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 72, in <module>
    main(args)
  File "main.py", line 20, in main
    trainer.train()
  File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 105, in train
    self.evaluate("dev")
  File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 209, in evaluate
    out_slot_label_list[i].append(slot_label_map[out_slot_labels_ids[i][j]])
KeyError: 32

Would you be able to help?

Can anyone give an example how to run predict.py?

Sorry I really don't know how to run it :-(
Could anyone give a hand?

Getting exception model doesn't exist

Hi Team,

While training JointBERT on my dataset using the command:

!python main.py --task ss --model_dir ss_model --do_train --do_eval --use_crf

I am getting exception:

raise Exception("Model doesn't exists! Train first!")
Exception: Model doesn't exists! Train first!

To resolve this issue, I made a change in trainer.py wherein I commented

if self.args.save_steps > 0 and global_step % self.args.save_steps == 0:

and replaced it with only

if self.args.save_steps > 0: #dip

Is this the right fix? Hoping for a reply.

Thanks
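The quoted condition writes a checkpoint only when `global_step` is an exact multiple of `save_steps`, so a run with fewer total optimization steps than `save_steps` never saves anything. The sketch below lists the steps at which that condition holds. The numbers are taken from logs quoted elsewhere on this page (70 and 4090 total steps, `Save steps = 200`), and `checkpoint_steps` is a hypothetical helper, not a function from the repository.

```python
from typing import List


def checkpoint_steps(total_steps: int, save_steps: int) -> List[int]:
    """Global steps at which `global_step % save_steps == 0` holds during training."""
    if save_steps <= 0:
        return []
    return [step for step in range(1, total_steps + 1) if step % save_steps == 0]


print(checkpoint_steps(total_steps=70, save_steps=200))        # []
print(checkpoint_steps(total_steps=4090, save_steps=200)[:3])  # [200, 400, 600]
print(len(checkpoint_steps(total_steps=70, save_steps=1)))     # 70
```

Dropping the modulo test, as in the change above, behaves like the last line: the model is written after every step, which works but is slow. Setting `save_steps` to a value no larger than the total number of optimization steps would be a less drastic alternative.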

Potential pull request for model deployment

I implemented the code for model deployment with torchserve
https://github.com/ZeweiChu/JointBERT/tree/master/torchserve

Should I send a pull request to this repo?

How to get confidence for each token in prediction

First of all, good job on this repo, @monologg!
I am having difficulty getting the confidence of each predicted entity. I can get the confidence for each intent, but I cannot figure out how to obtain the confidence of each predicted entity.
Can you help here?

Maybe a bug when predicting with a CRF model

The number of output lines does not equal the number of input lines.
I checked predict.py, and I think line 188 should be removed.

How to calculate confidence score for intent and slots while predicting

Is there a way to calculate a prediction/confidence score for intent and slot identification?

I tried to use intent_logits and slot_logits, but I didn't succeed.

Can you help me here?

Thanks
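One common way to turn logits into a confidence score is to apply a softmax and read off the probability of the predicted class, once for the intent and once per token for the slots. The sketch below does this in plain Python with made-up logits. With PyTorch tensors the same computation would use `torch.softmax(logits, dim=-1)`. This is an assumption-laden illustration rather than the repository's method, and with `--use_crf` the per-token softmax is only a rough proxy, because the CRF scores whole label sequences.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits: List[float]) -> Tuple[int, float]:
    """Return (predicted class index, softmax probability of that class)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Hypothetical logits: one utterance, 3 intents, 2 tokens with 2 slot labels each.
intent_logits = [0.0, 0.0, math.log(2.0)]
slot_logits = [[math.log(3.0), 0.0], [0.0, 0.0]]

intent_id, intent_conf = top_prediction(intent_logits)
slot_preds = [top_prediction(token_logits) for token_logits in slot_logits]

print(intent_id, round(intent_conf, 2))                   # 2 0.5
print([(idx, round(conf, 2)) for idx, conf in slot_preds])  # [(0, 0.75), (0, 0.5)]
```

For a multiword entity, the per-token confidences of its tokens can be combined afterwards, for example by taking their minimum or their product; which one fits best depends on the application.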

IndexError: Target 10 is out of bounds.

I am testing JointBERT for my own dataset. I face the below issue.

Can someone help me in resolving this? What might be the reason for the error?

Traceback (most recent call last):
  File "main.py", line 72, in <module>
    main(args)
  File "main.py", line 20, in main
    trainer.train()
  File "/home/ubuntu/JointBERT/trainer.py", line 87, in train
    outputs = self.model(**inputs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/JointBERT/model/modeling_jointbert.py", line 44, in forward
    intent_loss = intent_loss_fct(intent_logits.view(-1, self.num_intent_labels), intent_label_ids.view(-1))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 916, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1838, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 10 is out of bounds.

How to handle multiword slots

Hi team, I have trained JointBERT on my dataset. I have one doubt: how does the model handle multiword slot values, such as depart.city = los angeles or depart.time = 7:00 am? How should I create the slot type for each word separated by a space? Please help.
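Multiword values are usually handled with BIO tagging: the first word of a value gets a `B-` tag, the following words get `I-` tags with the same slot name, and everything else is `O`. The sketch below shows both directions, building the per-word tags for training data and merging predicted tags back into values. The helper names and the sentence are hypothetical. The slot names follow the example in the question.

```python
from typing import List, Tuple


def bio_tags(words: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Tag words[start:end] with B-/I- prefixed slot names; all other words get 'O'."""
    tags = ["O"] * len(words)
    for start, end, slot in spans:
        tags[start] = f"B-{slot}"
        for i in range(start + 1, end):
            tags[i] = f"I-{slot}"
    return tags


def collect_slots(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive B-/I- tagged words back into (slot name, value) pairs."""
    slots: List[Tuple[str, str]] = []
    open_slot = None  # name of the slot currently being extended, if any
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            open_slot = tag[2:]
            slots.append((open_slot, word))
        elif tag.startswith("I-") and tag[2:] == open_slot:
            name, value = slots[-1]
            slots[-1] = (name, f"{value} {word}")
        else:
            open_slot = None
    return slots


words = "fly from los angeles at 7:00 am".split()
tags = bio_tags(words, [(2, 4, "depart.city"), (5, 7, "depart.time")])
print(tags)
# ['O', 'O', 'B-depart.city', 'I-depart.city', 'O', 'B-depart.time', 'I-depart.time']
print(collect_slots(words, tags))
# [('depart.city', 'los angeles'), ('depart.time', '7:00 am')]
```

With this scheme each word separated by a space has its own tag in the training file, and both the `B-` and `I-` variants of every slot name need to appear in the slot label list.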