Comments (7)
{YOUR_OTHER_ARGUMENTS} can be left empty, or you can refer to the full list of arguments here: https://github.com/allanj/pytorch_neural_crf/blob/master/transformers_trainer.py#L29-L61
Please try to pull the latest version. It is fixed now.
from pytorch_neural_crf.
I updated the code, but errors still occur.
Error 1: When I run this command: python trainer.py --embedder_type=bert-large-cased
I get this error:
usage: trainer.py [-h] [--device {cpu,cuda:0,cuda:1,cuda:2}] [--seed SEED]
[--dataset DATASET] [--embedding_file EMBEDDING_FILE]
[--embedding_dim EMBEDDING_DIM] [--optimizer OPTIMIZER]
[--learning_rate LEARNING_RATE] [--l2 L2]
[--lr_decay LR_DECAY] [--batch_size BATCH_SIZE]
[--num_epochs NUM_EPOCHS] [--train_num TRAIN_NUM]
[--dev_num DEV_NUM] [--test_num TEST_NUM]
[--max_no_incre MAX_NO_INCRE] [--model_folder MODEL_FOLDER]
[--hidden_dim HIDDEN_DIM] [--dropout DROPOUT]
[--use_char_rnn {0,1}] [--static_context_emb {none,elmo}]
[--add_iobes_constraint {0,1}]
trainer.py: error: unrecognized arguments: --embedder_type=bert-large-cased
Error 2: If I leave {YOUR_OTHER_ARGUMENTS} empty, an error still occurs:
Traceback (most recent call last):
  File "transformers_trainer_ddp.py", line 22, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
Traceback (most recent call last):
  File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.
from pytorch_neural_crf.
Following the README, you should run transformers_trainer.py rather than trainer.py.
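The "unrecognized arguments" message in Error 1 is ordinary argparse behaviour: each script builds its own parser, so a flag registered in one script is unknown to the other. A minimal sketch, assuming two hypothetical parsers that stand in for the two scripts (the flag names come from this thread; the defaults and the helper names are illustrative and not taken from the repository):

```python
import argparse
from typing import List


def build_legacy_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for trainer.py: it does not register --embedder_type.
    parser = argparse.ArgumentParser(prog="trainer.py")
    parser.add_argument("--dataset", type=str, default="conll2003")
    parser.add_argument("--batch_size", type=int, default=10)  # illustrative default
    return parser


def build_transformers_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for transformers_trainer.py: it registers --embedder_type.
    parser = argparse.ArgumentParser(prog="transformers_trainer.py")
    parser.add_argument("--dataset", type=str, default="conll2003")
    parser.add_argument("--batch_size", type=int, default=30)  # illustrative default
    parser.add_argument("--embedder_type", type=str, default="roberta-base")
    return parser


def unknown_flags(parser: argparse.ArgumentParser, argv: List[str]) -> List[str]:
    # parse_known_args returns (namespace, leftovers) instead of exiting
    # with "unrecognized arguments" the way parse_args does.
    _, leftovers = parser.parse_known_args(argv)
    return leftovers


argv = ["--embedder_type=bert-large-cased"]
print(unknown_flags(build_legacy_parser(), argv))        # flag is left over
print(unknown_flags(build_transformers_parser(), argv))  # nothing is left over
```

The first call reports the flag as a leftover, which is what parse_args turns into the usage error shown above; the second parser accepts it.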
from pytorch_neural_crf.
For the second one, you need to run:
pip install datasets
I just updated the README to include that. Thanks.
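To confirm that the install landed in the same environment that accelerate launches, a quick check along these lines may help. This is only a sketch: missing_modules is a hypothetical helper, and the list of module names is taken from the traceback and the launch command in this thread rather than from the project's requirements.

```python
import importlib.util
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'datasets' is the module named in the traceback; 'accelerate' is the launcher used here.
required = ["datasets", "accelerate"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("All required modules are importable.")
```

Run it with the interpreter shown in the traceback (the python inside the conda environment), since a package installed into a different environment would still be reported as missing.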
from pytorch_neural_crf.
"Following the README, you should run transformer_trainer rather than trainer.py"
Thank you for your reply, but I still have some questions.
Q1: Should I run transformer_trainer first and then trainer.py, or just run transformer_trainer.py? I don't understand what you mean.
If I run trainer.py with the '--embedder_type=bert-large-cased' argument, it raises an error. If I run trainer.py without arguments, will it succeed?
Q2: I have run pip install datasets, but when I run accelerate launch transformers_trainer_ddp.py --batch_size=30, an error still occurs:
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--num_processes` was set to a value of `1`
    `--num_machines` was set to a value of `1`
    `--mixed_precision` was set to a value of `'no'`
    `--num_cpu_threads_per_process` was set to `52` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
09/02/2022 16:16:35 - INFO - main - seed: 42
09/02/2022 16:16:35 - INFO - main - dataset: conll2003
09/02/2022 16:16:35 - INFO - main - optimizer: adamw
09/02/2022 16:16:35 - INFO - main - learning_rate: 2e-05
09/02/2022 16:16:35 - INFO - main - momentum: 0.0
09/02/2022 16:16:35 - INFO - main - l2: 1e-08
09/02/2022 16:16:35 - INFO - main - lr_decay: 0
09/02/2022 16:16:35 - INFO - main - batch_size: 30
09/02/2022 16:16:35 - INFO - main - num_epochs: 1
09/02/2022 16:16:35 - INFO - main - train_num: -1
09/02/2022 16:16:35 - INFO - main - dev_num: -1
09/02/2022 16:16:35 - INFO - main - test_num: -1
09/02/2022 16:16:35 - INFO - main - max_no_incre: 80
09/02/2022 16:16:35 - INFO - main - max_grad_norm: 1.0
09/02/2022 16:16:35 - INFO - main - fp16: 1
09/02/2022 16:16:35 - INFO - main - model_folder: english_model
09/02/2022 16:16:35 - INFO - main - hidden_dim: 0
09/02/2022 16:16:35 - INFO - main - dropout: 0.5
09/02/2022 16:16:35 - INFO - main - embedder_type: roberta-base
09/02/2022 16:16:35 - INFO - main - add_iobes_constraint: 0
09/02/2022 16:16:35 - INFO - main - print_detail_f1: 0
09/02/2022 16:16:35 - INFO - main - earlystop_atr: micro
09/02/2022 16:16:35 - INFO - main - mode: train
09/02/2022 16:16:35 - INFO - main - test_file: data/conll2003/test.txt
Downloading builder script: 6.33kB [00:00, 2.49MB/s]
09/02/2022 16:16:45 - INFO - main - [Data Info] Tokenizing the instances using 'roberta-base' tokenizer
09/02/2022 16:16:55 - INFO - main - [Data Info] Reading dataset from:
data/conll2003/train.txt
data/conll2003/dev.txt
data/conll2003/test.txt
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/train.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████████| 300/300 [00:00<00:00, 855980.41it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 14
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Using the training set to build label index
09/02/2022 16:16:55 - INFO - src.data.data_utils - #labels: 16
09/02/2022 16:16:55 - INFO - src.data.data_utils - label 2idx: {'': 0, 'O': 1, 'S-ORG': 2, 'S-MISC': 3, 'B-PER': 4, 'E-PER': 5, 'S-LOC': 6, 'B-ORG': 7, 'E-ORG': 8, 'I-PER': 9, 'S-PER': 10, 'B-MISC': 11, 'I-MISC': 12, 'E-MISC': 13, '': 14, '': 15}
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/dev.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|█████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 213995.10it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 2
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/test.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████| 50350/50350 [00:00<00:00, 895523.33it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 3684
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
Traceback (most recent call last):
  File "transformers_trainer_ddp.py", line 284, in <module>
    main()
  File "transformers_trainer_ddp.py", line 252, in main
    test_dataset = TransformersNERDataset(conf.test_file, tokenizer, number=conf.test_num, label2idx=train_dataset.label2idx, is_train=False)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 94, in __init__
    self.insts_ids = convert_instances_to_feature_tensors(insts, tokenizer, label2idx)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in convert_instances_to_feature_tensors
    label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in <listcomp>
    label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
KeyError: 'B-LOC'
Traceback (most recent call last):
  File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.
from pytorch_neural_crf.
You have a label 'B-LOC' that does not exist in your training set.
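The log above is consistent with this: the training file yielded only 14 sentences and a label index with no 'B-LOC' entry, while the test file has 3684 sentences. A minimal sketch of a pre-flight check, assuming the labels are already available as lists of tag sequences (build_label2idx and unseen_labels are hypothetical helpers, not functions from the repository):

```python
from typing import Dict, List, Set


def build_label2idx(train_labels: List[List[str]]) -> Dict[str, int]:
    """Index every label seen in the training sentences, in first-seen order."""
    label2idx: Dict[str, int] = {}
    for sentence in train_labels:
        for label in sentence:
            if label not in label2idx:
                label2idx[label] = len(label2idx)
    return label2idx


def unseen_labels(label2idx: Dict[str, int], eval_labels: List[List[str]]) -> Set[str]:
    """Return the labels in the evaluation sentences that the index does not contain."""
    return {
        label
        for sentence in eval_labels
        for label in sentence
        if label not in label2idx
    }


# Toy data: the training sample never contains a multi-token LOC entity.
train = [["S-ORG", "O", "S-MISC"], ["B-PER", "E-PER", "O", "S-LOC"]]
test = [["B-LOC", "E-LOC", "O"], ["S-PER", "O"]]

label2idx = build_label2idx(train)
print(sorted(unseen_labels(label2idx, test)))  # ['B-LOC', 'E-LOC', 'S-PER']
```

Any label reported here would raise the same KeyError during the label2idx lookup. Since the log shows train_num: -1 but only 300 lines read from train.txt, the local training file may be a small sample; training on the full file, or on data that covers every label in dev and test, should avoid the error.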
from pytorch_neural_crf.
Feel free to reopen the issue.
from pytorch_neural_crf.