Please help me? (pytorch_neural_crf issue, closed)

allanj avatar allanj commented on July 29, 2024
Please help me?

from pytorch_neural_crf.

Comments (7)

allanj avatar allanj commented on July 29, 2024
  1. {YOUR_OTHER_ARGUMENTS} can be left empty. Or you can refer to all the arguments here: https://github.com/allanj/pytorch_neural_crf/blob/master/transformers_trainer.py#L29-L61
  2. Please try to pull the latest version. It is fixed now.

Deerzh avatar Deerzh commented on July 29, 2024

I updated the code, but the errors still exist.
Error 1: when I run the command `python trainer.py --embedder_type=bert-large-cased`,
I get the following error:
usage: trainer.py [-h] [--device {cpu,cuda:0,cuda:1,cuda:2}] [--seed SEED]
[--dataset DATASET] [--embedding_file EMBEDDING_FILE]
[--embedding_dim EMBEDDING_DIM] [--optimizer OPTIMIZER]
[--learning_rate LEARNING_RATE] [--l2 L2]
[--lr_decay LR_DECAY] [--batch_size BATCH_SIZE]
[--num_epochs NUM_EPOCHS] [--train_num TRAIN_NUM]
[--dev_num DEV_NUM] [--test_num TEST_NUM]
[--max_no_incre MAX_NO_INCRE] [--model_folder MODEL_FOLDER]
[--hidden_dim HIDDEN_DIM] [--dropout DROPOUT]
[--use_char_rnn {0,1}] [--static_context_emb {none,elmo}]
[--add_iobes_constraint {0,1}]
trainer.py: error: unrecognized arguments: --embedder_type=bert-large-cased

Error 2: if I leave {YOUR_OTHER_ARGUMENTS} empty, an error still occurs:
Traceback (most recent call last):
File "transformers_trainer_ddp.py", line 22, in <module>
import datasets
ModuleNotFoundError: No module named 'datasets'
Traceback (most recent call last):
File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
args.func(args)
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
simple_launcher(args)
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.

allanj avatar allanj commented on July 29, 2024

Following the README, you should run transformers_trainer.py rather than trainer.py. The --embedder_type argument is not defined in trainer.py.
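To illustrate why the first command fails, here is a minimal sketch of how argparse treats a flag that a script's parser never defines. The two parsers below are hypothetical stand-ins, not the actual parsers in trainer.py or transformers_trainer.py; only the flag names `--batch_size` and `--embedder_type` come from the output in this thread.

```python
import argparse


def build_parser(with_embedder: bool) -> argparse.ArgumentParser:
    # Hypothetical stand-in parser; the real scripts define many more options.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--batch_size", type=int, default=30)
    if with_embedder:
        parser.add_argument("--embedder_type", type=str, default="roberta-base")
    return parser


argv = ["--embedder_type=bert-large-cased"]

# A parser without --embedder_type does not recognise the flag.
# parse_known_args collects such flags instead of exiting with an error.
_, unknown = build_parser(with_embedder=False).parse_known_args(argv)
print(unknown)  # ['--embedder_type=bert-large-cased']

# A parser that defines the flag accepts it.
args = build_parser(with_embedder=True).parse_args(argv)
print(args.embedder_type)  # bert-large-cased
```

With plain `parse_args`, the first parser would exit with "unrecognized arguments", which matches the message shown above for trainer.py.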

allanj avatar allanj commented on July 29, 2024

For the second one, you need to install the datasets package:

pip install datasets

I just updated the README to include that. Thanks
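Before launching a long run, it can help to check that the required packages are importable in the active environment. The sketch below is one possible way to do that; `missing_modules` is a hypothetical helper written for this example, and the list of names passed to it is an assumption based only on the packages mentioned in this thread.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names taken from the thread; extend the list as needed.
required = ["datasets", "accelerate"]
not_found = missing_modules(required)
if not_found:
    print("Missing packages, install with: pip install " + " ".join(not_found))
else:
    print("All required packages are importable.")
```

Run it with the same interpreter that `accelerate launch` uses (here, the one in the conda environment), since a package installed in a different environment will not be visible.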

Deerzh avatar Deerzh commented on July 29, 2024

Quoting allanj: "Following the README, you should run transformers_trainer.py rather than trainer.py"

Thank you for your reply, but I still have some questions about this.
Q1: Should I first run transformers_trainer.py and then trainer.py, or should I just run transformers_trainer.py? I don't understand what you mean.
If I run trainer.py with the '--embedder_type=bert-large-cased' argument, it raises an error. If I run trainer.py without arguments, will it run successfully?

Q2: I have run pip install datasets, but when I run `accelerate launch transformers_trainer_ddp.py --batch_size=30`, an error still occurs:
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--num_cpu_threads_per_process was set to 52 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
09/02/2022 16:16:35 - INFO - main - seed: 42
09/02/2022 16:16:35 - INFO - main - dataset: conll2003
09/02/2022 16:16:35 - INFO - main - optimizer: adamw
09/02/2022 16:16:35 - INFO - main - learning_rate: 2e-05
09/02/2022 16:16:35 - INFO - main - momentum: 0.0
09/02/2022 16:16:35 - INFO - main - l2: 1e-08
09/02/2022 16:16:35 - INFO - main - lr_decay: 0
09/02/2022 16:16:35 - INFO - main - batch_size: 30
09/02/2022 16:16:35 - INFO - main - num_epochs: 1
09/02/2022 16:16:35 - INFO - main - train_num: -1
09/02/2022 16:16:35 - INFO - main - dev_num: -1
09/02/2022 16:16:35 - INFO - main - test_num: -1
09/02/2022 16:16:35 - INFO - main - max_no_incre: 80
09/02/2022 16:16:35 - INFO - main - max_grad_norm: 1.0
09/02/2022 16:16:35 - INFO - main - fp16: 1
09/02/2022 16:16:35 - INFO - main - model_folder: english_model
09/02/2022 16:16:35 - INFO - main - hidden_dim: 0
09/02/2022 16:16:35 - INFO - main - dropout: 0.5
09/02/2022 16:16:35 - INFO - main - embedder_type: roberta-base
09/02/2022 16:16:35 - INFO - main - add_iobes_constraint: 0
09/02/2022 16:16:35 - INFO - main - print_detail_f1: 0
09/02/2022 16:16:35 - INFO - main - earlystop_atr: micro
09/02/2022 16:16:35 - INFO - main - mode: train
09/02/2022 16:16:35 - INFO - main - test_file: data/conll2003/test.txt
Downloading builder script: 6.33kB [00:00, 2.49MB/s]
09/02/2022 16:16:45 - INFO - main - [Data Info] Tokenizing the instances using 'roberta-base' tokenizer
09/02/2022 16:16:55 - INFO - main - [Data Info] Reading dataset from:
data/conll2003/train.txt
data/conll2003/dev.txt
data/conll2003/test.txt
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/train.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████████| 300/300 [00:00<00:00, 855980.41it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 14
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Using the training set to build label index
09/02/2022 16:16:55 - INFO - src.data.data_utils - #labels: 16
09/02/2022 16:16:55 - INFO - src.data.data_utils - label 2idx: {'': 0, 'O': 1, 'S-ORG': 2, 'S-MISC': 3, 'B-PER': 4, 'E-PER': 5, 'S-LOC': 6, 'B-ORG': 7, 'E-ORG': 8, 'I-PER': 9, 'S-PER': 10, 'B-MISC': 11, 'I-MISC': 12, 'E-MISC': 13, '': 14, '': 15}
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/dev.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|█████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 213995.10it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 2
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/test.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████| 50350/50350 [00:00<00:00, 895523.33it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 3684
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
Traceback (most recent call last):
File "transformers_trainer_ddp.py", line 284, in <module>
main()
File "transformers_trainer_ddp.py", line 252, in main
test_dataset = TransformersNERDataset(conf.test_file, tokenizer, number=conf.test_num, label2idx=train_dataset.label2idx, is_train=False)
File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 94, in __init__
self.insts_ids = convert_instances_to_feature_tensors(insts, tokenizer, label2idx)
File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in convert_instances_to_feature_tensors
label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in <listcomp>
label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
KeyError: 'B-LOC'
Traceback (most recent call last):
File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
args.func(args)
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
simple_launcher(args)
File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.

allanj avatar allanj commented on July 29, 2024

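Your test set contains the label 'B-LOC', which does not exist in your training set. The label index is built from the training set only, so looking up 'B-LOC' raises the KeyError.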
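A quick way to confirm this kind of mismatch is to compare the labels in the evaluation data against the keys of the label index. The sketch below is illustrative only: `missing_labels` is a hypothetical helper, the mapping reuses the non-empty keys printed in the log above, and the test sequence is made-up example data rather than a sentence from the actual file.

```python
from typing import Dict, Iterable, List, Sequence


def missing_labels(label2idx: Dict[str, int],
                   label_sequences: Iterable[Sequence[str]]) -> List[str]:
    """Return the labels used in label_sequences that have no entry in label2idx."""
    seen = {label for seq in label_sequences for label in seq}
    return sorted(seen - set(label2idx))


# Non-empty keys copied from the 'label 2idx' line in the log above.
label2idx = {
    "O": 1, "S-ORG": 2, "S-MISC": 3, "B-PER": 4, "E-PER": 5, "S-LOC": 6,
    "B-ORG": 7, "E-ORG": 8, "I-PER": 9, "S-PER": 10,
    "B-MISC": 11, "I-MISC": 12, "E-MISC": 13,
}

# Made-up IOBES-encoded test sentence containing a two-token location.
test_sequences = [["B-LOC", "E-LOC", "O"]]

print(missing_labels(label2idx, test_sequences))  # ['B-LOC', 'E-LOC']
```

In the log above, the training file yields only 14 sentences while the test file yields 3684, so the training data may simply be too small to cover every label. If that is the case, training on the full training file is likely the simplest fix.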

allanj avatar allanj commented on July 29, 2024

Feel free to reopen the issue if the problem persists.
