jidasheng / bi-lstm-crf
A PyTorch implementation of the BI-LSTM-CRF model.
License: MIT License
Hello, it's me again.
I trained my model and I want to calculate its performance using the F-Score metric. How could I calculate it using your project? Thanks in advance. :)
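One possible way to get an F-score, sketched here without relying on any evaluation helper from the project: collect the gold tag lists from a held-out set and the predicted tag lists (for instance the `tags` value returned by the tagger call shown in other posts), then compare the word spans they encode. The function names `spans` and `span_f1` are made up for this illustration, and the sketch assumes BMES-style tags.

```python
def spans(tags):
    """Return the set of (start, end) word spans encoded by a BMES tag list."""
    found, start = set(), None
    for i, tag in enumerate(tags):
        if tag == "S":
            found.add((i, i))
            start = None
        elif tag == "B":
            start = i
        elif tag == "E" and start is not None:
            found.add((start, i))
            start = None
        # "M" (or any other tag) leaves the current span open
    return found


def span_f1(gold_sentences, pred_sentences):
    """Micro-averaged precision, recall and F1 over all sentences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        g, p = spans(gold), spans(pred)
        tp += len(g & p)       # spans with identical boundaries
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


gold = [["B", "E", "S", "B", "M", "E"]]
pred = [["B", "E", "S", "S", "B", "E"]]
precision, recall, f1 = span_f1(gold, pred)
```

A span only counts as correct when both boundaries match, so this is stricter than per-character accuracy.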
After training my model, I tried to test it and ran into a problem.
My code:
from bi_lstm_crf.app import WordsTagger
model_dir= './model'
model = WordsTagger(model_dir=model_dir)
sentence='國家外匯管理局公佈,截至2019年11月末,中國外匯儲備規模為30,956億美元'
tags, sequences = model([sentence]) # CHAR-based model
print(tags)
print(sequences)
The result:
Traceback (most recent call last):
File "/Users/marcusau/PycharmProjects/jidasheng/test.py", line 10, in <module>
tags, sequences = model([sentence]) # CHAR-based model
ValueError: not enough values to unpack (expected 2, got 1)
Note: I trained a char-based model.
For example, jieba POS tags: 'nr' = person name, 'nt' = organization name, 'w' = punctuation.
Interesting. If I change the tags, can I use it for finding technical attributes?
Also, can it predict unseen words and sentences? Or does it only predict the words in its vocabulary?
I see you have said that "chars/words that not in the vocabulary will be replaced by UNKNOWN". But does that only apply for training or for predicting too?
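To make the question concrete, here is a minimal sketch of how an UNKNOWN fallback usually works in taggers of this kind. It is not taken from the project's code: the placeholder name `<UNK>` and the helper `encode` are assumptions for illustration. The idea is that any token missing from the vocabulary built at training time is mapped to a single reserved index, and the same lookup can be reused whenever text is turned into indices, at training or at prediction.

```python
UNKNOWN = "<UNK>"  # hypothetical placeholder name, not confirmed by the project


def encode(tokens, vocab):
    """Map tokens to vocabulary indices, falling back to the UNKNOWN index."""
    unk_id = vocab[UNKNOWN]
    return [vocab.get(token, unk_id) for token in tokens]


vocab = {UNKNOWN: 0, "a": 1, "b": 2}
ids = encode(list("abz"), vocab)  # "z" was never seen, so it gets index 0
```

With such a scheme an unseen word can still receive a tag, because the model relies on the surrounding context; the quality of that tag depends on how often the UNKNOWN index occurred during training.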
Very good and user-friendly.
I trained my model on a PC without a GPU, and training on a corpus of 660,000 sentences took less than 30 minutes.
Amazing.
My only concern is accuracy: 20 epochs is not enough for my corpus, maybe because of its huge size.
The loss and val_loss are both around 50.
I am now increasing the epochs from 20 to 100 to see the result.
Thanks for your nice work.
Example: in my dataset.txt, the first row is:
作者根據和比利案件有關的醫院人員、律師、警員提供的第一手資料,和比利偷偷寫下的內心筆記,揭露了醫院內對待精神病人的黑幕,和比利既要面對人格融合,及醫院內精神及肉體的不人道對待的矛盾。 ["B","E","B","E","S","B","E","B","E","B","E","S","B","E","B","E","S","B","E","S","B","E","B","E","S","B","M","M","M","E","S","S","B","E","B","E","B","E","S","B","E","B","E","S","B","E","S","B","E","S","B","E","B","M","M","E","S","B","E","S","S","B","E","B","E","B","E","B","E","B","E","S","S","B","E","S","B","E","S","B","E","S","B","M","E","B","E","S","B","E","S"]
The first part (the sentence) is a string.
The second part (the BMES labels) is a list.
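For readers unfamiliar with the labelling, the following sketch shows how a BMES label list like the one above can be derived from an already segmented sentence. The helper name `words_to_bmes` is hypothetical and not part of the project: a single-character word gets `S`, and a longer word gets `B`, zero or more `M`, then `E`.

```python
def words_to_bmes(words):
    """Turn a list of segmented words into one BMES tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty strings left over from splitting
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


# The first four words of the row above: 作者 / 根據 / 和 / 比利
example_tags = words_to_bmes(["作者", "根據", "和", "比利"])
```

The number of tags always equals the number of characters, which is a useful sanity check before training.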
config ./data/vocab.json loaded
config ./data/tags.json loaded
tag dict file => ./model/tags.json
tag dict file => ./model/vocab.json
parsing ./data/dataset.txt: 37271it [00:01, 29734.05it/s]
Traceback (most recent call last):
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/app/preprocessing/preprocess.py", line 123, in __build_corpus
sentence = json.loads(sentence)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/__main__.py", line 3, in <module>
main()
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/app/train.py", line 119, in main
train(args)
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/app/train.py", line 46, in train
args.corpus_dir, args.val_split, args.test_split, max_seq_len=args.max_seq_len)
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/app/preprocessing/preprocess.py", line 69, in load_dataset
xs, ys = self.__build_corpus(corpus_dir, max_seq_len)
File "/Users/marcusau/jidasheng/lib/python3.6/site-packages/bi_lstm_crf/app/preprocessing/preprocess.py", line 131, in __build_corpus
raise ValueError("exception raised when parsing line {}\n {}".format(idx + 1, e))
ValueError: exception raised when parsing line 37514
Expecting value: line 1 column 2 (char 1)
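The traceback shows that the sentence field itself goes through `json.loads`, so a raw, unquoted string in that position would fail to parse. Below is a small sketch for locating such rows before training. It assumes each row holds two fields separated by a tab, which is a guess about the file layout rather than something stated in this thread, and the helper name `find_bad_lines` is made up.

```python
import json


def find_bad_lines(lines, sep="\t"):
    """Return (line_number, reason) pairs for rows that would not parse."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        parts = line.split(sep)
        if len(parts) != 2:
            problems.append((number, "expected 2 fields, got {}".format(len(parts))))
            continue
        try:
            sentence, tags = json.loads(parts[0]), json.loads(parts[1])
        except json.JSONDecodeError as err:
            problems.append((number, "invalid JSON: {}".format(err)))
            continue
        if not isinstance(sentence, (str, list)) or not isinstance(tags, list):
            problems.append((number, "unexpected field types"))
        elif len(sentence) != len(tags):
            problems.append((number, "{} tokens but {} tags".format(len(sentence), len(tags))))
    return problems


good = json.dumps("作者", ensure_ascii=False) + "\t" + json.dumps(["B", "E"])
bad = "作者" + "\t" + json.dumps(["B", "E"])  # sentence is not JSON-encoded
report = find_bad_lines([good, bad, ""])
```

On a real file the same function can be called with the open file object, for example `find_bad_lines(open("dataset.txt", encoding="utf-8"))`.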
Hi @jidasheng
I have been trying to run your implementation of the BiLSTM-CRF a couple of times but I keep getting an error.
On my terminal I executed
>$ python -m bi_lstm_crf corpus_dir --model_dir "model xxx"
and got the following error (See attached image below for full details)
ValueError: "corpus_dir/vocab.json" file does not exist
I've also tried >$ python3 -m bi_lstm_crf corpus_dir --model_dir "model xxx" but the same error persists.
I have not modified anything in the code and when I check inside "sample_corpus" the file "vocab.json" is there.
I am new to BiLSTM-CRF code, but I have read the paper [2] and a few others on NER using BiLSTM-CRF. Is there something I'm doing wrong? Please advise. Thanks!
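One thing worth checking is whether the directory passed on the command line is the one that actually holds the files: the command above passes the literal name `corpus_dir`, while the file is reported to be inside `sample_corpus`. The sketch below lists which expected files are missing from a directory. The three file names are taken from the training log quoted in another post (`vocab.json`, `tags.json`, `dataset.txt`); the helper name `missing_corpus_files` is hypothetical.

```python
from pathlib import Path

# File names as they appear in the training log quoted elsewhere in this thread.
REQUIRED_FILES = ("vocab.json", "tags.json", "dataset.txt")


def missing_corpus_files(corpus_dir):
    """Return the required files that are absent from corpus_dir."""
    root = Path(corpus_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_corpus_files("corpus_dir")
if missing:
    print("missing from {}: {}".format(Path("corpus_dir").resolve(), missing))
```

Printing the resolved path makes it obvious when the script is started from a different working directory than expected.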
When instantiating WordsTagger with device='cpu', the class throws an error:
WordsTagger( basepath, device='cpu')
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\models.py", line 613, in __init__
self.tagger = WordsTagger(
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\bi_lstm_crf\app\predict.py", line 15, in __init__
self.model = build_model(self.args, self.preprocessor, load=True, verbose=False)
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\bi_lstm_crf\app\utils.py", line 24, in build_model
state_dict = torch.load(model_path)
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 1131, in _load
result = unpickler.load()
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
wrap_storage=restore_location(storage, location),
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
result = fn(storage, location)
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "C:\Users\MarcoOdore\agilelab\MultiLegalSBD-master\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
The reason is that the build_model function in app/utils.py does not take the device passed as input into account:
class WordsTagger:
    def __init__(self, model_dir, device=None):
        args_ = load_json_file(arguments_filepath(model_dir))
        args = argparse.Namespace(**args_)
        args.model_dir = model_dir
        self.args = args
        self.preprocessor = Preprocessor(config_dir=model_dir, verbose=False)
        self.model = build_model(self.args, self.preprocessor, load=True, verbose=False)  # here
        self.device = running_device(device)
        self.model.to(self.device)
        self.model.eval()

def build_model(args, processor, load=True, verbose=False):
    model = BiRnnCrf(len(processor.vocab), len(processor.tags),
                     embedding_dim=args.embedding_dim, hidden_dim=args.hidden_dim, num_rnn_layers=args.num_rnn_layers)
    # weights
    model_path = model_filepath(args.model_dir)
    if exists(model_path) and load:
        state_dict = torch.load(model_path)  # here
        model.load_state_dict(state_dict)
        if verbose:
            print("load model weights from {}".format(model_path))
    return model
I think the problem could be solved by also passing the device to build_model and giving torch.load the desired device through map_location:
def build_model(args, processor, load=True, verbose=False, device='gpu'):
    model = BiRnnCrf(len(processor.vocab), len(processor.tags),
                     embedding_dim=args.embedding_dim, hidden_dim=args.hidden_dim, num_rnn_layers=args.num_rnn_layers)
    # weights
    model_path = model_filepath(args.model_dir)
    if exists(model_path) and load:
        if device == 'cpu':
            state_dict = torch.load(model_path, map_location=torch.device('cpu'))
        else:
            state_dict = torch.load(model_path)
        model.load_state_dict(state_dict)
        if verbose:
            print("load model weights from {}".format(model_path))
    return model
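A slightly more general variant of the same idea, offered as a sketch rather than as a patch to the library: decide on a `map_location` value once and always pass it to `torch.load`, which is what the error message above recommends for CPU-only machines. The helper names `pick_map_location` and `load_state_dict_on` are hypothetical. PyTorch is imported lazily so the device-selection logic can be read and tested on its own.

```python
def pick_map_location(requested, cuda_available):
    """Choose a device string that is safe to pass as map_location."""
    if requested is None:
        return "cuda" if cuda_available else "cpu"
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"  # weights saved on a GPU are remapped to the CPU
    return requested


def load_state_dict_on(model_path, requested=None):
    """Load saved weights onto the requested device (assumes PyTorch is installed)."""
    import torch  # imported here so the helper above works without PyTorch

    location = pick_map_location(requested, torch.cuda.is_available())
    return torch.load(model_path, map_location=location)
```

The returned state dict can then be passed to `model.load_state_dict(...)` as in the code above, before the model is moved with `model.to(device)`.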
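Hello, I got an error when trying to predict several sentences in one call. The code is:
tags, sequences = model(["sentence 1","sentence 2",...])
Is this way of calling the model supported, and would it be any faster?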
Hello, have you run this code on a dataset? On Microsoft's public dataset I can only reach an F1 of just over 60%, and nothing I try improves it.
model.h5?
or
model.model
or
model.pkl??
Thanks for your fantastic piece of work.
May I ask whether this library can run on a CPU-only computer?
I would like to train my model on my office desktop, which has no GPU.
Thanks a lot.
Hello,
My code raises a 'list index out of range' error when I try to predict a sentence with the model. I analyzed your code but couldn't figure out what the cause could be.
Here is my code:
model = WordsTagger(model_dir='name_model')
tags, sequences = model([["meu", "amigo", "e", "senhor", "."]])
print(tags)
print(sequences)
Here is the log:
Traceback (most recent call last):
File "C:\Users\guilh\OneDrive\Textos e Documentação\workspace\BI-LSTM-CRF Cartas\bilstmprocessor.py", line 30, in <module>
predict_text()
File "C:\Users\guilh\OneDrive\Textos e Documentação\workspace\BI-LSTM-CRF Cartas\bilstmprocessor.py", line 26, in predict_text
tags, sequences = model([["meu", "amigo", "e", "senhor", "."]])
File "C:\Users\guilh\anaconda3\envs\bilstm_cartas\lib\site-packages\bi_lstm_crf\app\predict.py", line 40, in __call__
return tags, self.tokens_from_tags(sentences, tags, begin_tags=begin_tags)
File "C:\Users\guilh\anaconda3\envs\bilstm_cartas\lib\site-packages\bi_lstm_crf\app\predict.py", line 64, in tokens_from_tags
tokens_list = [_tokens(sentence, ts) for sentence, ts in zip(sentences, tags_list)]
File "C:\Users\guilh\anaconda3\envs\bilstm_cartas\lib\site-packages\bi_lstm_crf\app\predict.py", line 64, in <listcomp>
tokens_list = [_tokens(sentence, ts) for sentence, ts in zip(sentences, tags_list)]
File "C:\Users\guilh\anaconda3\envs\bilstm_cartas\lib\site-packages\bi_lstm_crf\app\predict.py", line 56, in _tokens
begins = [b for idx, b in enumerate(begins) if idx == 0 or ts[idx] != "O" or ts[idx - 1] != "O"]
File "C:\Users\guilh\anaconda3\envs\bilstm_cartas\lib\site-packages\bi_lstm_crf\app\predict.py", line 56, in <listcomp>
begins = [b for idx, b in enumerate(begins) if idx == 0 or ts[idx] != "O" or ts[idx - 1] != "O"]
IndexError: list index out of range
I'd appreciate any help. Thanks in advance. Best regards.