monologg / jointbert

Pytorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"

License: Apache License 2.0

Python 100.00%
bert transformers slot-filling pytorch intent-classification slu joint-bert

jointbert's Introduction

🚀 Things I do

  • NLP Engineer, contributing to Korean NLP with open source!

jointbert's Issues

How to solve the problem that an English word is split into multiple parts, so the tokens no longer line up with the label sequence?

For a sequence labeling task, a single word is split into multiple parts, which makes the input token sequence longer than the label sequence. How can this be solved?
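A common way to handle this, sketched here under stated assumptions, is to keep the word's label on its first sub-word piece and give every following piece a padding label that the loss later ignores. The names `align_labels`, `PAD_LABEL`, and `toy_tokenize` are hypothetical, and `toy_tokenize` only stands in for a real WordPiece tokenizer.

```python
from typing import Callable, List, Tuple

PAD_LABEL = "PAD"  # hypothetical marker for sub-word pieces that carry no label


def align_labels(
    words: List[str],
    labels: List[str],
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[str]]:
    """Expand word-level labels so they match the sub-word token sequence."""
    tokens: List[str] = []
    aligned: List[str] = []
    for word, label in zip(words, labels):
        pieces = tokenize(word) or ["[UNK]"]  # guard against empty output
        tokens.extend(pieces)
        # First piece keeps the real label, the rest get the padding label.
        aligned.extend([label] + [PAD_LABEL] * (len(pieces) - 1))
    return tokens, aligned


def toy_tokenize(word: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits off an "ing" suffix only."""
    if word.endswith("ing") and len(word) > 3:
        return [word[:-3], "##ing"]
    return [word]


tokens, aligned = align_labels(["playing", "jazz"], ["O", "B-genre"], toy_tokenize)
print(tokens)   # ['play', '##ing', 'jazz']
print(aligned)  # ['O', 'PAD', 'B-genre']
```

With this layout the token and label sequences always have the same length, and predictions on padded positions can be skipped at evaluation time.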

result on snips

Hello! I can't replicate your result on snips when using BERT + CRF. Is the baseline bert-base-uncased, or are there any tricks in the training implementation?

"JointBERT" vs. "BertForMaskedLM" in model config json file

Hi,

This is a great repo! I was using your code to train BERT on my own data, but I noticed that the architecture field in the model config json file changed. The code and data are exactly the same, but the architecture changed, which is weird.

Please see the attached screenshots.
Do you know why this changed over time?
Thanks!

Screen Shot 2020-02-17 at 5 16 17 PM
Screen Shot 2020-02-17 at 5 16 09 PM

Learning objective

Hi @monologg! Just a theoretical question about what the BERT for Joint Intent Classification and Slot Filling publication says here:

The learning objective is to maximize the conditional probability p(y^i, y^s|x). The model is finetuned end-to-end via minimizing the cross-entropy loss.

If I understand correctly, this is not the same as summing the intent and slot losses as you do in your models (total_loss = intent_loss + self.args.slot_loss_coef * slot_loss). If that part of the paper is correct, you should first multiply the probabilities calculated from both logits and then use the CrossEntropyLoss over these probabilities.
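A small numeric check may help with this question. It assumes the joint probability factorizes as p(y^i|x) times the product of the per-token slot probabilities (an assumption about how the objective is defined, not something verified against the repository). Under that assumption, the negative log of the product equals the sum of the individual cross-entropy terms. All probabilities below are made-up illustration values.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy of one prediction: negative log-probability of the target."""
    return -math.log(probs[target])


# Illustrative (made-up) softmax outputs for one utterance with two tokens.
intent_probs = [0.7, 0.2, 0.1]
slot_probs = [[0.9, 0.1], [0.3, 0.7]]
intent_target = 0
slot_targets = [0, 1]

# Joint probability under the assumed factorization.
joint = intent_probs[intent_target]
for probs, target in zip(slot_probs, slot_targets):
    joint *= probs[target]

joint_nll = -math.log(joint)
summed_losses = cross_entropy(intent_probs, intent_target) + sum(
    cross_entropy(probs, target) for probs, target in zip(slot_probs, slot_targets)
)

print(math.isclose(joint_nll, summed_losses))  # True
```

So maximizing the product of probabilities and minimizing a sum of cross-entropy losses coincide in this setting when the slot coefficient is 1 and the slot terms are summed; averaging over tokens or using another coefficient rescales the slot term.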

issue in training

It looks like I am able to train, but the model is not being saved and I am not able to perform evaluation.
Training command:

python3 main.py --task atis   --model_type albert       --model_dir atis_out  --do_train --do_eval

model_dir is empty after training, and hence it can't find the model in it.

03/20/2020 21:35:14 - INFO - trainer -   ***** Running training *****
03/20/2020 21:35:14 - INFO - trainer -     Num examples = 4478
03/20/2020 21:35:14 - INFO - trainer -     Num Epochs = 1
03/20/2020 21:35:14 - INFO - trainer -     Total train batch size = 64
03/20/2020 21:35:14 - INFO - trainer -     Gradient Accumulation steps = 1
03/20/2020 21:35:14 - INFO - trainer -     Total optimization steps = 70
Iteration: 100%|██████████| 70/70 [00:51<00:00,  1.36it/s]
Epoch: 100%|██████████| 1/1 [00:51<00:00, 51.52s/it]
Traceback (most recent call last):
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 221, in get_config_dict
    resume_download=resume_download,
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/file_utils.py", line 245, in cached_path
    raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file atis_out/config.json not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 228, in load_model
    self.bert_config = self.config_class.from_pretrained(self.args.model_dir)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 176, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 241, in get_config_dict
    raise EnvironmentError(msg)
OSError: Model name 'atis_out' was not found in model name list. We assumed 'atis_out/config.json' was a path, a model identifier, or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 81, in <module>
    main(args)
  File "main.py", line 22, in main
    trainer.load_model()
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 236, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...
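One thing worth checking, as a sketch rather than a confirmed diagnosis: the log reports 70 total optimization steps, and the save condition quoted in another issue on this page is `global_step % self.args.save_steps == 0`. If the save interval is larger than 70, no checkpoint is ever written. The interval of 200 used below is an assumption taken from a different issue's log ("Save steps = 200"), and `save_steps_hit` is a hypothetical helper.

```python
def save_steps_hit(total_steps, save_steps):
    """Return the global steps at which a modulo-based save would trigger."""
    if save_steps <= 0:
        return []
    return [step for step in range(1, total_steps + 1) if step % save_steps == 0]


# 70 steps as in the log above, with an assumed save interval of 200.
print(save_steps_hit(70, 200))         # []
# A longer run with the same interval does reach save points.
print(save_steps_hit(4090, 200)[:3])   # [200, 400, 600]
```

If the first call returns an empty list for a given run, lowering the save interval or training for more steps would be the options to try.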

Target out of bounds

I have 3 intents, and now I am getting this error:

ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 12 is out of bounds.

Any help is appreciated.

How to calculate sent?

I have tried hard, but I still don't know how to calculate it. My accuracy is low, which is frustrating.
I hope you can help me, thanks.

Exception: Some model files might be missing...

Hi,
I got this error when training this model for ATIS: python3 main.py --task atis --num_train_epochs 10 --model_type bert --model_dir atis_model --do_train --do_eval --use_crf

Do you have any idea how I can solve this problem, please?

INFO - transformers.modeling_utils - loading weights file atis_model/pytorch_model.bin
Traceback (most recent call last):
  File "tools/git/JointBERT/trainer.py", line 233, in load_model
    self.model = self.model_class.from_pretrained(self.args.model_dir)
  File "miniconda2/envs/condpy36/lib/python3.6/site-packages/transformers/modeling_utils.py", line 512, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() missing 3 required positional arguments: 'args', 'intent_label_lst', and 'slot_label_lst'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 71, in <module>
    main(args)
  File "main.py", line 22, in main
    trainer.load_model()
  File "tools/git/JointBERT/trainer.py", line 237, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...
Exception: Some model files might be missing...

Is something wrong with loss?

I don't know if you've ever encountered this issue in your training: the loss increased, but all other metrics decreased. It's weird, and I don't know if it affects convergence.
image

KeyError during training

Hi, thanks for sharing this code repository! I just have a slight issue with KeyErrors during model training using custom data:

06/22/2020 22:16:45 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/msdfamily/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_train_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_dev_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_test_snips_bert-base-uncased_50
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/msdfamily/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - Model config BertConfig {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"finetuning_task": "snips",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}

06/22/2020 22:16:46 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/bert-base-uncased-pytorch_model.bin from cache at /home/msdfamily/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights of JointBERT not initialized from pretrained model: ['intent_classifier.linear.weight', 'intent_classifier.linear.bias', 'slot_classifier.linear.weight', 'slot_classifier.linear.bias']
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights from pretrained model not used in JointBERT: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
06/22/2020 22:16:50 - INFO - trainer - ***** Running training *****
06/22/2020 22:16:50 - INFO - trainer - Num examples = 13084
06/22/2020 22:16:50 - INFO - trainer - Num Epochs = 10
06/22/2020 22:16:50 - INFO - trainer - Total train batch size = 32
06/22/2020 22:16:50 - INFO - trainer - Gradient Accumulation steps = 1
06/22/2020 22:16:50 - INFO - trainer - Total optimization steps = 4090
06/22/2020 22:16:50 - INFO - trainer - Logging steps = 200
06/22/2020 22:16:50 - INFO - trainer - Save steps = 200
Epoch: 0%| | 0/10 [00:00<?, ?it/s/pytorch/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of add_ is deprecated: | 0/409 [00:00<?, ?it/s]
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha)
06/22/2020 22:21:18 - INFO - trainer - ***** Running evaluation on dev dataset ***** | 199/409 [04:27<05:33, 1.59s/it]
06/22/2020 22:21:18 - INFO - trainer - Num examples = 700
06/22/2020 22:21:18 - INFO - trainer - Batch size = 64
Evaluating: 100%|██████████| 11/11 [00:09<00:00, 1.14it/s]
Iteration: 49%|████▊     | 199/409 [04:38<04:53, 1.40s/it]
Epoch: 0%| | 0/10 [04:38<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 72, in <module>
    main(args)
  File "main.py", line 20, in main
    trainer.train()
  File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 105, in train
    self.evaluate("dev")
  File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 209, in evaluate
    out_slot_label_list[i].append(slot_label_map[out_slot_labels_ids[i][j]])
KeyError: 32

Would you be able to help?
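One possible cause, offered as an assumption rather than a diagnosis: the log above loads features from cached files under ./data, so if the custom data or its label file changed after those caches were written, the cached label ids may no longer match slot_label_map. Below is a minimal sketch for listing such cache files so they can be inspected or deleted before retraining. The helper name `find_cached_features` and the demo files are hypothetical; only the `cached_` prefix comes from the log.

```python
import tempfile
from pathlib import Path


def find_cached_features(data_dir):
    """List file names in data_dir that start with the 'cached_' prefix."""
    return sorted(path.name for path in Path(data_dir).glob("cached_*"))


# Demo on a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in (
        "cached_train_snips_bert-base-uncased_50",
        "cached_dev_snips_bert-base-uncased_50",
        "notes.txt",  # unrelated file, should not be listed
    ):
        (Path(demo_dir) / name).touch()
    found = find_cached_features(demo_dir)

print(found)
# ['cached_dev_snips_bert-base-uncased_50', 'cached_train_snips_bert-base-uncased_50']
```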

Getting exception model doesn't exist

Hi Team,

While training JointBERT on my dataset using the command:

!python main.py --task ss --model_dir ss_model --do_train --do_eval --use_crf

I am getting this exception:

raise Exception("Model doesn't exists! Train first!")
Exception: Model doesn't exists! Train first!

To resolve this issue, I made a change in trainer.py in which I commented out

if self.args.save_steps > 0 and global_step % self.args.save_steps == 0:

and replaced it with only

if self.args.save_steps > 0: #dip

Have I done this right? Hoping for a reply.

Thanks
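To make the effect of that change concrete, here is a small sketch that counts how many checkpoints each condition would write. The run length of 70 steps and the interval of 200 are illustrative values, and `with_final` is a hypothetical alternative that also saves on the last step; it is not code from the repository.

```python
def original(step, every, total):
    """Condition quoted above: save only on multiples of the interval."""
    return every > 0 and step % every == 0


def modified(step, every, total):
    """Condition after the change: true on every step when interval > 0."""
    return every > 0


def with_final(step, every, total):
    """Hypothetical alternative: multiples of the interval, plus the last step."""
    return every > 0 and (step % every == 0 or step == total)


def count_saves(total_steps, save_steps, condition):
    """Count the steps at which the given condition would trigger a save."""
    return sum(
        1
        for step in range(1, total_steps + 1)
        if condition(step, save_steps, total_steps)
    )


print(count_saves(70, 200, original))    # 0
print(count_saves(70, 200, modified))    # 70
print(count_saves(70, 200, with_final))  # 1
```

The modified condition does produce a model, but it writes a checkpoint on every step, which slows training; saving on the final step or lowering the interval gives the same result with far fewer writes.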

How to get confidence for each token in prediction

First of all, good job on this repo, @monologg!
I am facing difficulty getting the confidence of each predicted entity. I am able to get the confidence for each intent, but I cannot figure out how to obtain the confidence of each predicted entity.
Can you help here?
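One way to approach this, sketched under the assumption that per-token slot logits are available and that no CRF layer is used, is to apply a softmax over each token's logits and report the highest probability alongside the predicted label. The function names, the label list, and the logit values are all hypothetical; with a CRF the raw scores are emissions and would not be directly comparable.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def token_confidences(slot_logits, slot_labels):
    """Return (label, probability) for the best slot label of each token."""
    results = []
    for logits in slot_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((slot_labels[best], probs[best]))
    return results


labels = ["O", "B-city", "I-city"]           # illustrative label set
logits = [                                    # one row of logits per token
    [4.0, 0.5, 0.1],
    [0.2, 3.0, 0.3],
    [0.1, 0.4, 2.5],
]
confidences = token_confidences(logits, labels)
for label, prob in confidences:
    print(label, round(prob, 3))
```

For an entity that spans several tokens, the per-token values could be combined, for example by taking the minimum or the mean, depending on how conservative the score should be.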

IndexError: Target 10 is out of bounds.

I am testing JointBERT on my own dataset and face the issue below.

Can someone help me in resolving this? What might be the reason for the error?

Traceback (most recent call last):
  File "main.py", line 72, in <module>
    main(args)
  File "main.py", line 20, in main
    trainer.train()
  File "/home/ubuntu/JointBERT/trainer.py", line 87, in train
    outputs = self.model(**inputs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/JointBERT/model/modeling_jointbert.py", line 44, in forward
    intent_loss = intent_loss_fct(intent_logits.view(-1, self.num_intent_labels), intent_label_ids.view(-1))
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 916, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1838, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 10 is out of bounds.
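This error is typically raised when a target label id is not smaller than the number of classes the classifier outputs, which can happen when the label file lists fewer labels than the data actually uses. A minimal sketch for finding the offending ids before training follows; the helper `out_of_range_targets` and the sample values are hypothetical, and -100 is the default `ignore_index` of PyTorch's CrossEntropyLoss.

```python
def out_of_range_targets(label_ids, num_labels, ignore_index=-100):
    """Return the distinct label ids that fall outside [0, num_labels)."""
    return sorted(
        {
            label_id
            for label_id in label_ids
            if label_id != ignore_index and not 0 <= label_id < num_labels
        }
    )


# Illustrative values: three known classes, but id 10 appears in the data.
bad_ids = out_of_range_targets([0, 2, 10, 1, 10, -100], num_labels=3)
print(bad_ids)  # [10]
```

An empty result means every target fits the classifier's output size; any id that is listed points at labels missing from the label vocabulary.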

How to handle multi-word slots

Hi team, I have trained JointBERT on my dataset. I have one doubt: is the model handling multi-word slot values, such as depart.city = los angeles or depart.time = 7:00 am? How do I create a slot type for each word separated by a space? Please help.
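A common convention for multi-word values is BIO tagging: the first word of a value gets a `B-` tag, each following word gets an `I-` tag, and words outside any slot get `O`. The sketch below assumes the annotations are available as word-index spans; the function `bio_tags`, the sentence, and the span format are hypothetical.

```python
def bio_tags(words, spans):
    """Build one BIO tag per word from (start, end, slot) spans, end exclusive."""
    tags = ["O"] * len(words)
    for start, end, slot in spans:
        tags[start] = "B-" + slot          # first word of the slot value
        for index in range(start + 1, end):
            tags[index] = "I-" + slot      # remaining words of the same value
    return tags


words = "leave los angeles at 7:00 am".split()
spans = [(1, 3, "depart.city"), (4, 6, "depart.time")]
tags = bio_tags(words, spans)

for word, tag in zip(words, tags):
    print(word, tag)
# leave O
# los B-depart.city
# angeles I-depart.city
# at O
# 7:00 B-depart.time
# am I-depart.time
```

At prediction time, a `B-` tag followed by matching `I-` tags can be merged back into a single slot value such as "los angeles".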
