guillaumegenthial / tf_ner
Simple and Efficient Tensorflow implementations of NER models with tf.estimator and tf.data
License: Apache License 2.0
Is it possible to use another word embedding than GloVe (for other languages)?
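A rough sketch of how this might be done, assuming the model only needs a NumPy matrix whose rows follow the order of vocab.words.txt. The function name `build_embedding_matrix`, the toy vectors, and the output file and key names are illustrative assumptions, not taken from the repo. Any pretrained vectors stored in the same plain-text "word v1 v2 ..." layout as GloVe (for instance fastText .vec files) could probably be loaded this way:

```python
import numpy as np


def build_embedding_matrix(vocab_words, vector_lines, dim):
    """Return a (len(vocab_words), dim) matrix; words without a vector keep a zero row."""
    word_to_idx = {word: idx for idx, word in enumerate(vocab_words)}
    embeddings = np.zeros((len(vocab_words), dim), dtype=np.float32)
    found = 0
    for line in vector_lines:
        parts = line.rstrip().split(' ')
        # Some formats start with a "<count> <dim>" header; skip lines of the wrong width.
        if len(parts) != dim + 1:
            continue
        word, values = parts[0], parts[1:]
        if word in word_to_idx:
            embeddings[word_to_idx[word]] = np.asarray(values, dtype=np.float32)
            found += 1
    return embeddings, found


# Toy data standing in for vocab.words.txt and a pretrained vector file.
vocab = ['chat', 'chien', 'inconnu']
lines = ['2 3', 'chat 0.1 0.2 0.3', 'chien 0.4 0.5 0.6']
matrix, n_found = build_embedding_matrix(vocab, lines, dim=3)
# np.savez_compressed('glove.npz', embeddings=matrix)  # assumed file and key names
```

The `dim` hyper-parameter would then have to match the dimension of the new vectors.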
runfile('D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py', wdir='D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf')
Using config: {'_model_dir': 'results/model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001698906BD30>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Traceback (most recent call last):
File "", line 1, in
runfile('D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py', wdir='D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf')
File "D:\ComputerSoftwares\Anaconda\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)
File "D:\ComputerSoftwares\Anaconda\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py", line 173, in <module>
estimator, 'f1', 500, min_steps=8000, run_every_secs=120)
File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 266, in stop_if_no_increase_hook
run_every_steps=run_every_steps)
File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 422, in _stop_if_no_metric_improvement_hook
run_every_steps=run_every_steps)
File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 88, in make_early_stopping_hook
'Got: {}'.format(type(estimator)))
TypeError: `estimator` must have type `tf.estimator.Estimator`. Got: <class 'tensorflow.python.estimator.estimator.Estimator'>
I'm getting the following error when running make build
IsADirectoryError: [Errno 21] Is a directory: 'glove.840B.300d.txt'
make: *** [Makefile:8: build] Error 1
Hi,
Thanks for making a new version with tf.data and I was wondering if your reported performance is evaluated with the entity-level (i.e. span-level) P/R/F1. It looks like you are using the token-level F1, which can be different from the mainstream span-level metric in the papers.
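To make the difference concrete, here is a small sketch of span-level scoring, assuming BIO tags written as "B-TYPE" / "I-TYPE". The helper names are hypothetical and this is not the repo's evaluation code:

```python
def extract_spans(tags):
    """Return the set of (start, end, type) spans encoded by a BIO tag sequence."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ['O']):  # sentinel closes a trailing span
        inside = tag.startswith('I-') and kind == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith('B-'):
            start, kind = i, tag[2:]
    return spans


def span_f1(gold_seqs, pred_seqs):
    """Micro-averaged precision, recall and F1 over exactly matching spans."""
    n_gold = n_pred = n_correct = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
        n_correct += len(gold_spans & pred_spans)
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [['B-PER', 'I-PER', 'O', 'B-LOC']]
pred = [['B-PER', 'O', 'O', 'B-LOC']]
precision, recall, f1 = span_f1(gold, pred)
```

In this toy case three of the four tokens are tagged correctly, yet only one of the two gold spans is matched exactly, so the span-level F1 is 0.5.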
trying to use model on custom data
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Assign requires shapes of both tensors to match. lhs shape= [106,300] rhs shape= [23,300]
[[node save/Assign (defined at /home/anujc/.conda/envs/py37/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:1403) ]]
can you please guide
Consider the example,
Travel to dummylocation BIO annotation would be O O B_LOC
How to make the neural network to understand that the content inside the is dummy ?
Is it possible ?
Currently, the learnt models require the nwords parameter, along with the main words input, when serving the models with tensorflow server.
It would be very helpful if the requirement for the nwords parameter could be dropped, so that only the words array can be streamed directly from database sources to the tensorflow server running the learnt models.
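Until the model derives the lengths itself, one possible workaround is a thin client-side wrapper that computes nwords from the words. This is only a sketch: the helper name `to_request`, the whitespace tokenization and the '<pad>' token are assumptions, and the exact payload layout expected by the server is not confirmed here.

```python
def to_request(sentences, pad='<pad>'):
    """Tokenize on whitespace, pad to the longest sentence, and record the true lengths."""
    tokenized = [sentence.split() for sentence in sentences]
    max_len = max(len(tokens) for tokens in tokenized)
    words = [tokens + [pad] * (max_len - len(tokens)) for tokens in tokenized]
    nwords = [len(tokens) for tokens in tokenized]
    return {'words': words, 'nwords': nwords}


payload = to_request(['John lives in New York', 'Hello world'])
```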
In tf.estimator.ModeKeys.EVAL mode, the F1 is character-level. Is it possible to implement the entity-level F1 in eval_metric_ops?
Hi, thank you for your code. I would like to know whether the evaluation script 'conlleval' applies to BIO, BIOES, or both. Are there any limitations to this evaluation script? Thanks.
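If it is unclear whether a given scorer handles a tagging scheme, one conservative option is to convert both gold and predicted tags to plain BIO before scoring. A minimal sketch, assuming tags of the form "S-TYPE", "B-TYPE", "I-TYPE", "E-TYPE" and "O" (the function name is hypothetical):

```python
def bioes_to_bio(tags):
    """Map BIOES tags to BIO: S- becomes B-, E- becomes I-, everything else is unchanged."""
    converted = []
    for tag in tags:
        if tag.startswith('S-'):
            converted.append('B-' + tag[2:])
        elif tag.startswith('E-'):
            converted.append('I-' + tag[2:])
        else:
            converted.append(tag)
    return converted


bio_tags = bioes_to_bio(['S-LOC', 'O', 'B-PER', 'I-PER', 'E-PER'])
```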
I am getting this error in main.py when I try to run the program. Has anyone else faced the same issue?
As you said, "Note that the example dataset is here for debugging purposes only and won't be of much use to train an actual model". However, I really need your training data to use the NER model.
Although I can get them from some datasets like CONLL, I think it is helpful (at least time-saving) not just for me to share the datasets which can be used in your code directly.
When estimator.predict() is used, the pred_ids of <pad> is always zero.
However, if serve.py is used to predict one sample, the pred_ids of <pad> will be the correct value.
Can you or someone explain this problem? thx
After training a model, how can I avoid retraining it the next time?
tf_ner/models/chars_conv_lstm_crf/main.py
Line 96 in c3284f0
The comment says "Char LSTM" whereas it should have been "Char CNN". A minor issue, but a confusing one for newbies.
/Users/me/tf_ner/venv/bin/python /Users/bxr8516/tf_ner/models/lstm_crf/main.py
Using config: {'_model_dir': 'results/model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1183a3400>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Traceback (most recent call last):
File "/Users/me/tf_ner/models/lstm_crf/main.py", line 166, in <module>
Path(estimator.eval_dir()).mkdir(parents=True, exist_ok=True)
AttributeError: 'Estimator' object has no attribute 'eval_dir'
Process finished with exit code 1
I am getting the above error. I have installed tensorflow==1.6.
What can be the problem?
Did anyone have something like this?
Saving checkpoints for 0 into results/model/model.ckpt.
Traceback (most recent call last): . . . .
InvalidArgumentError (see above for traceback): assertion failed: [`labels` contains negative values] [Condition x >= 0 did not hold element-wise:] [x (Reshape_5:0) = ] [8 12 -1...]
[[{{node confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert}} = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_0, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_1, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_2, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch_1)]]
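One possible cause, offered as an assumption since the post does not show the data: if tags are converted to ids through a vocabulary lookup that returns -1 for unknown entries, any tag that occurs in the data files but is missing from the tag vocabulary would produce a negative label. A quick check along those lines, with toy lists standing in for the real tag file and vocab.tags.txt (the function name is hypothetical):

```python
def find_unknown_tags(tag_lines, vocab_tags):
    """Return {tag: count} for tags in the data that are absent from the vocabulary."""
    known = set(vocab_tags)
    unknown = {}
    for line in tag_lines:
        for tag in line.split():
            if tag not in known:
                unknown[tag] = unknown.get(tag, 0) + 1
    return unknown


vocab_tags = ['O', 'B-PER', 'I-PER']
tag_lines = ['B-PER I-PER O', 'O B-LOC O', 'B-LOC O']
missing = find_unknown_tags(tag_lines, vocab_tags)
```

An empty result would suggest that the cause lies elsewhere.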
Thanks for this amazing work!
Recently FLAIR (PyTorch) has released state-of-the-art models for NER in several languages. It would be worthwhile to provide pre-trained models here as well!
Thank you
While running this model on the CONLL2003 data (TF 1.11), I am getting this error:
global_step/sec: 3.00248
loss = 5.6152534, step = 101 (33.300 sec)
global_step/sec: 2.88663
loss = 4.2953496, step = 201 (34.641 sec)
global_step/sec: 3.08292
loss = 3.5392659, step = 301 (32.438 sec)
Saving checkpoints for 344 into results\model\model.ckpt.
Estimator is not trained yet. Will start an evaluation when a checkpoint is ready.
RuntimeError: There was no new checkpoint after the training. Eval status: missing checkpoint
Could you please paste a picture of your model weights directory?
thanks
I got a prediction like '[["1", "B_t_h"], ["9", "M_t_h"], ["4", "M_t_h"], ["0", "M_t_h"]]'. The tag of the number 0 should be 'E_t_h'. Is there any explanation for this problem, and how can I fix the prediction?
I also have the problem that almost all predictions come out as 't_h'. Is there a good way to adjust this?
Looking forward to your reply, thanks!
Hi, how can I load this model in Android Studio? It always shows this error: "buffer with XX elements is not compatible with a Tensor with shape[0, 0, 0]"
Hi,
Thanks for sharing this great implementation.
I know it is possible to get the label probabilities using forward backward algorithm in CRFs. I am finding some difficulties in implementing/modifying the default CRF implementation in tensorflow. For calculation of the partition function, they have only used the forward (message passing) algorithm. Do you have any experience or ideas about how the forward-backward algorithm could be implemented in tf?
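For reference, the forward-backward recursions can be written compactly in NumPy before porting them to TensorFlow ops. This is a sketch for a single unpadded sequence, assuming unary scores of shape (T, K) and a transition matrix where trans[i, j] scores tag i followed by tag j. The function names are illustrative and this is not the TensorFlow CRF implementation.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def crf_marginals(unary, trans):
    """Return per-position tag marginals of shape (T, K) and the log partition function."""
    T, K = unary.shape
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))  # beta[T-1] stays at log(1) = 0
    alpha[0] = unary[0]
    for t in range(1, T):
        # alpha[t, j] = unary[t, j] + logsumexp_i(alpha[t-1, i] + trans[i, j])
        alpha[t] = unary[t] + logsumexp(alpha[t - 1][:, None] + trans, axis=0)
    for t in range(T - 2, -1, -1):
        # beta[t, i] = logsumexp_j(trans[i, j] + unary[t+1, j] + beta[t+1, j])
        beta[t] = logsumexp(trans + (unary[t + 1] + beta[t + 1])[None, :], axis=1)
    log_z = logsumexp(alpha[-1], axis=0)
    return np.exp(alpha + beta - log_z), log_z


unary = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
trans = np.array([[0.5, -0.5], [-0.5, 0.5]])
marginals, log_z = crf_marginals(unary, trans)
```

Each row of `marginals` sums to one. A batched TensorFlow version would additionally need to mask padded positions using the sequence lengths.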
I created my own vocabulary and tags, and ran the code with these parameters:
{
  "batch_size": 2,
  "buffer": 15000,
  "chars": "vocab.chars.txt",
  "dim": 300,
  "dim_chars": 100,
  "dropout": 0.3,
  "epochs": 25,
  "filters": 50,
  "glove": "glove.npz",
  "kernel_size": 3,
  "lstm_size": 100,
  "num_oov_buckets": 1,
  "tags": "vocab.tags.txt",
  "words": "vocab.words.txt"
}
After 8 hours of training I got my results:
Saving dict for global step 9534: acc = 0.96229786, f1 = 0.8532131, global_step = 9534, loss = 36.943344, precision = 0.8494137$
Saving 'checkpoint_path' summary for global step 9534: results/model/model.ckpt-9534
But when I tried to predict on the same test dataset, the model did not predict the entities. For example, in the test set we have Bora B-NAME B-NAME, but when I pass the same sentence, the algorithm does not predict that entity. The evaluation reports an accuracy of 0.962297 and an F1 of 0.8532131, but in reality the model detects only 38% of the entities, so the effective score is 0.38 rather than 0.85.
What could be the problem?
Do you have an idea?
Thank you!
Hi, thanks for the great work! I have a question: does the code also work for chunking, or does it need extra effort to adapt it to the chunking task?
Thanks
I have over 60,000,000 sequences to analyze. When I predict this way, it takes more than 1 second per sequence. I want to speed it up; is that possible?
I am trying to run the code without making any changes and am hitting the error below (from chars_conv_lstm_crf/main.py):
OutOfRangeError (see above for traceback): End of sequence
[[{{node IteratorGetNext}} = IteratorGetNext_class=["loc:@cond/sub/Switch"], output_shapes=[[?,?], [?], [?,?,?], [?,?], [?,?]], output_types=[DT_STRING, DT_INT32, DT_STRING, DT_INT32, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
TF version : 1.11.0rc2
Do you know why this error appears?
thanks
I can't find the solution to the above issue anywhere, and I'm not sure what exactly the problem is. Could anyone help me with this?
I would be grateful if you could help me with this problem.
I tried to execute model training with example data and received the error:
Traceback (most recent call last):
File "/opt/project/models/lstm_crf/main.py", line 171, in <module>
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 610, in run
return self.run_local()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 711, in run_local
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 356, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/opt/project/models/lstm_crf/main.py", line 117, in model_fn
'precision': tf.metrics.precision(tags, pred_ids, num_tags, indices, weights),
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/metrics_impl.py", line 2024, in precision
if updates_collections:
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 669, in bool
raise TypeError("Using a `tf.Tensor` as a Python `bool` is not allowed. "
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
I'm using Docker with tensorflow:latest-gpu-py3.
I have this conceptual doubt in the part where we are obtaining word level representations from characters using the final output of BiLSTM network. We are initializing the character embeddings using xavier_initialization which just ensures that the cells do not saturate. So, how do these random embeddings capture any meaningful information? And how is this network trained or is it unsupervised?
@guillaumegenthial
The training dataset was in B-I-O (Begin, Inside, Outside) format,
but some of the predictions show I-LOC without a preceding B-LOC.
If I understand correctly, Inside should always come after Begin. Any thoughts on where it is going wrong?
Is there any way to control these transitions?
I see the same problem with the sequence_tagging implementation and with this one.
Any suggestions on where to control these transitions?
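One common way to rule out such sequences is to add a large negative penalty to forbidden transitions before decoding. The sketch below assumes decoding uses a (K, K) transition matrix in which entry [i, j] scores tag i followed by tag j, and that tags are written "B-TYPE" / "I-TYPE". The function name and the penalty value are illustrative, not part of the repo.

```python
import numpy as np


def bio_transition_mask(tags, penalty=-1e4):
    """Return a (K, K) matrix holding `penalty` wherever tag j may not follow tag i."""
    mask = np.zeros((len(tags), len(tags)), dtype=np.float32)
    for i, prev in enumerate(tags):
        for j, curr in enumerate(tags):
            if curr.startswith('I-'):
                # I-X is only valid directly after B-X or I-X.
                if prev not in ('B-' + curr[2:], 'I-' + curr[2:]):
                    mask[i, j] = penalty
    return mask


tags = ['O', 'B-LOC', 'I-LOC', 'B-PER', 'I-PER']
mask = bio_transition_mask(tags)
# constrained = transition_params + mask  # hypothetical: apply before Viterbi decoding
```

This mask does not stop a sequence from starting with an I- tag; that case would need a separate penalty on the first position's scores.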
tf_ner/models/chars_conv_lstm_crf/main.py
Line 96 in c3284f0
The comment should have been 'Char CNN' and not 'Char LSTM'. Not a big issue, but can be confusing for the newbies.
how to get top-k best candidate sequences when using CRF for decoding in tensorflow?
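A k-best variant of Viterbi keeps the k highest-scoring partial paths per tag at every step instead of only the best one. Below is a NumPy sketch for one unpadded sequence, assuming unary scores of shape (T, K) and transitions where trans[i, j] scores tag i followed by tag j. It is not a TensorFlow API, and the function name is hypothetical.

```python
import heapq

import numpy as np


def kbest_viterbi(unary, trans, k):
    """Return up to k (score, tag_sequence) pairs, best first."""
    T, K = unary.shape
    # beams[t][j]: up to k tuples (score, prev_tag, prev_rank) for paths ending in tag j.
    beams = [[[(float(unary[0, j]), -1, -1)] for j in range(K)]]
    for t in range(1, T):
        layer = []
        for j in range(K):
            candidates = [
                (score + float(trans[i, j]) + float(unary[t, j]), i, rank)
                for i in range(K)
                for rank, (score, _, _) in enumerate(beams[t - 1][i])
            ]
            layer.append(heapq.nlargest(k, candidates))
        beams.append(layer)
    finals = [
        (score, j, rank)
        for j in range(K)
        for rank, (score, _, _) in enumerate(beams[-1][j])
    ]
    results = []
    for score, tag, rank in heapq.nlargest(k, finals):
        path = []
        # Follow the (prev_tag, prev_rank) back-pointers from the last step to the first.
        for t in range(T - 1, -1, -1):
            path.append(tag)
            _, tag, rank = beams[t][tag][rank]
        results.append((score, path[::-1]))
    return results


unary = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
trans = np.array([[0.5, -0.5], [-0.5, 0.5]])
top3 = kbest_viterbi(unary, trans, k=3)
```

The cost grows roughly with T * K * K * k, so this is best suited to small tag sets and small k.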
Hey Guillaume, really excellent repo! I came across a minor issue with your code on macOS using Python 3.6.1 and the most recent version of GloVe 840B 300d (as of today).
In build_glove.py, the line with Path('glove.840B.300d.txt').open() as f: implicitly reads in the file as ASCII encoded, which apparently doesn't play nice with however my stuff is set up. It can be remedied with the following code:
with open(Path('glove.840B.300d.txt'), 'rb') as f:
    for line_idx, line in enumerate(f):
        line = line.decode('utf-8')
        ...
Happy to submit a PR for this or else maybe you can just shove it in at your leisure. Thanks again for all your hard work
If a word is not found in the pre-trained embeddings used for word embeddings, would it have a bad effect on the model performance? I try to generate my embeddings as well, but I feel they don't help the model cluster similar words.
Restoring a saved model using estimator seems messy to me (models/lstm_crf). Can you help me build an interactive-mode NER?
Running training and evaluation locally (non-distributed).
Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 120.
Calling model_fn.
TypeError Traceback (most recent call last)
----> 1 tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
Did anyone get this error?
PS: Using TF 1.11
How to train custom entity labels other than PER, LOC, ORG & MISC?
I need entities like "total amount" from a document.
The README has no example showing how to use https://github.com/davidsbatista/NER-datasets with tf_ner. Can you provide an example of using one of these datasets with your TensorFlow model?
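A possible starting point is the sketch below. It assumes that tf_ner reads one sentence per line with space-separated tokens from a words file, plus a parallel tags file with one tag per token, and that the source dataset is in CoNLL column format with the token in the first column and the tag in the last. Both are assumptions to verify against the repo and the chosen dataset; the function name is hypothetical.

```python
def conll_to_parallel(conll_lines):
    """Convert CoNLL-style lines into parallel lists of word lines and tag lines."""
    word_lines, tag_lines = [], []
    words, tags = [], []
    for line in list(conll_lines) + ['']:  # trailing blank flushes the last sentence
        line = line.strip()
        if not line or line.startswith('-DOCSTART-'):
            if words:
                word_lines.append(' '.join(words))
                tag_lines.append(' '.join(tags))
                words, tags = [], []
            continue
        columns = line.split()
        words.append(columns[0])
        tags.append(columns[-1])
    return word_lines, tag_lines


sample = [
    '-DOCSTART- -X- -X- O',
    '',
    'John NNP B-NP B-PER',
    'lives VBZ B-VP O',
    'in IN B-PP O',
    'Paris NNP B-NP B-LOC',
    '',
    'Hello UH B-INTJ O',
]
word_lines, tag_lines = conll_to_parallel(sample)
```

The two lists could then be written to files such as train.words.txt and train.tags.txt (assumed names). Tags are copied unchanged, so a dataset in another tagging scheme would still need converting.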
I got this error, caused by the argument in the 'RunConfig' function, when I ran the 'python main.py' command. However, it still shows that the 'Estimator' object has no attribute 'eval_dir'. What should I do next? Thank you in advance.
Hi Guillaume,
This is an amazing repo - great work!
found a minor error -
estimator.export_saved_model('saved_model', serving_input_receiver_fn) -> estimator.export_savedmodel
Keep up the great work! I'm a fan
If we really want to use export_savedmodel(serving_input_receiver_fn=?, export_dir_base=?) with a custom estimator (ignoring the checkpoints) for production, what should the arguments be? Here's an example of my scripts:
UPDATE:
#2
def serving_input_fn(hyperparameters=None):
    feature_spec = {
        'foo': tf.placeholder(dtype=tf.string, shape=[None, None]),
        'bar': tf.placeholder(tf.int32, [None]),
    }
    return tf.estimator.export.ServingInputReceiver(receiver_tensors=feature_spec,
                                                    features=feature_spec)
And adding one more line in model_fn for converting from dict to tensor.
This solution saves the model using estimator.export_savedmodel(export_dir_base='./Result/', serving_input_receiver_fn=serving_input_fn()).
Why does the local softmax formula in your blog (https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html) use a product, while the code just uses a normal softmax:
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=scores, labels=labels)
This sentence also confused me: "Eventually, the probability ℙ(y) of a sequence of tag y is the product pt". When does softmax need the product?
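A short derivation may reconcile the two. It assumes the local-softmax model defines the sequence probability as a product of independent per-token softmax probabilities; the symbols $s_t$ (score vector at position $t$) and $m$ (sentence length) are notation introduced here for illustration.

```latex
\mathbb{P}(y) = \prod_{t=1}^{m} p_t[y_t],
\qquad
p_t[i] = \frac{e^{s_t[i]}}{\sum_{j} e^{s_t[j]}}

-\log \mathbb{P}(y) = \sum_{t=1}^{m} -\log p_t[y_t]
```

Each term of the sum is the per-token cross entropy that sparse_softmax_cross_entropy_with_logits returns. Summing or averaging the losses over the time steps therefore minimizes the negative log of the product, and the product itself never has to be formed explicitly.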
I got a terrible result when I used the lstm_crf code for an NER task based on a BERT model: the loss stopped at 20 with batchsize=32, max_sequence_len=50 and num_tags=5. However, when I used another implementation based on the same theory, it worked well. The main difference is that the other code uses a basic LSTM cell, which I cannot understand. Could you help me? By the way, I have trained on the dataset using your lstm_crf code, and it worked well.
Update:
I have solved this problem, thank you.
Thanks for your clear code and instruction!
I ran the code on CONLL2003 without modifying any hyper-parameters or the structure. The F1 score on testb is about 90.6, which does not reach your reported 91.2 (for char CNN + LSTM + CRF).
Also, LSTM+CRF gives 89.xx~90.0x. I got slightly lower scores than your reports.
I was wondering whether the default hyper-parameters need to be changed.
Could you provide some insight into this result?