
guillaumegenthial / tf_ner

Stars: 927 · Watchers: 43 · Forks: 274 · Size: 147 KB

Simple and Efficient Tensorflow implementations of NER models with tf.estimator and tf.data

License: Apache License 2.0

Languages: Python 87.68%, Perl 12.13%, Makefile 0.20%
Topics: ner, named-entity-recognition, conll-2003, lstm-crf, state-of-the-art, tensorflow, bi-lstm-crf, exponential-moving-average, glove, character-embeddings

tf_ner's People

Contributors

guillaumegenthial, nh-99


tf_ner's Issues

ascii encoding reading glove

Hey Guillaume, really excellent repo! I came across a minor issue with your code on macOS using Python 3.6.1 and the most recent version of GloVe 840B 300d (as of today).

In build_glove.py, the line with Path('glove.840B.300d.txt').open() as f: implicitly reads the file as ASCII-encoded, which apparently doesn't play nicely with how my environment is set up. It can be remedied with the following code:

with open(Path('glove.840B.300d.txt'), 'rb') as f:
    for line_idx, line in enumerate(f):
        line = line.decode('utf-8')
        ...

Happy to submit a PR for this or else maybe you can just shove it in at your leisure. Thanks again for all your hard work
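
A simpler alternative (a sketch, not the repo's exact fix): pass the encoding to pathlib's open() so the file is read as UTF-8 text and no manual decode is needed.

from pathlib import Path

# Read the GloVe file as UTF-8 text instead of relying on the platform default
with Path('glove.840B.300d.txt').open(encoding='utf-8') as f:
    for line_idx, line in enumerate(f):
        ...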

TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed.

I would be grateful if you could help me with this problem.
I tried to run model training with the example data and received this error:

Traceback (most recent call last):
File "/opt/project/models/lstm_crf/main.py", line 171, in
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 610, in run
return self.run_local()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 711, in run_local
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 356, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/opt/project/models/lstm_crf/main.py", line 117, in model_fn
'precision': tf.metrics.precision(tags, pred_ids, num_tags, indices, weights),
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/metrics_impl.py", line 2024, in precision
if updates_collections:
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 669, in bool
raise TypeError("Using a tf.Tensor as a Python bool is not allowed. "
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

I'm using Docker with tensorflow:latest-gpu-py3.
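
For context (a hedged note, not part of the original report): the traceback shows tf.metrics.precision being called with five positional arguments, so num_tags, indices and weights land in the weights, metrics_collections and updates_collections slots, and the if updates_collections: check then tests a tensor as a Python bool. A minimal sketch of the intended call, assuming the separate tf_metrics package (github.com/guillaumegenthial/tf_metrics) is what main.py means to use rather than tf.metrics:

import tensorflow as tf
from tf_metrics import precision, recall, f1  # pip install git+https://github.com/guillaumegenthial/tf_metrics.git

# tags, pred_ids, num_tags, indices, weights come from the surrounding model_fn.
# tf_metrics takes (labels, predictions, num_classes, pos_indices, weights),
# unlike tf.metrics.precision(labels, predictions, weights, ...).
metrics = {
    'acc': tf.metrics.accuracy(tags, pred_ids, weights),
    'precision': precision(tags, pred_ids, num_tags, indices, weights),
    'recall': recall(tags, pred_ids, num_tags, indices, weights),
    'f1': f1(tags, pred_ids, num_tags, indices, weights),
}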

is there an efficient way to predict?

I have over 60,000,000 sequences to analyze. When I predict this way it takes more than 1 second per sequence. I want to speed it up; is that possible?
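
One possible speed-up (a sketch, not a confirmed answer; it assumes a SavedModel exported as in the repo's export script and TF 1.x): load the model once with tf.contrib.predictor and feed padded batches, instead of paying the Estimator start-up cost for every sequence.

from tensorflow.contrib import predictor

# Load the exported model once; 'saved_model/1234567890' is a hypothetical path
predict_fn = predictor.from_saved_model('saved_model/1234567890')

def predict_batch(sentences):
    """sentences: list of token lists; pads them into one rectangular batch."""
    nwords = [len(words) for words in sentences]
    max_len = max(nwords)
    words = [s + ['<pad>'] * (max_len - len(s)) for s in sentences]
    return predict_fn({'words': words, 'nwords': nwords})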

Conceptual issue in character embeddings

I have a conceptual doubt about the part where we obtain word-level representations from characters using the final output of the BiLSTM network. We initialize the character embeddings with xavier_initialization, which just ensures that the cells do not saturate. So how do these random embeddings capture any meaningful information? And how is this network trained, or is it unsupervised?
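
For what it's worth, a minimal sketch (variable names are illustrative, not the repo's exact code) of why the character embeddings become meaningful: they are ordinary trainable variables, so the supervised tagging loss back-propagates through the char BiLSTM and the embedding lookup and updates the matrix away from its random Xavier initialization. There is no separate or unsupervised training step.

import tensorflow as tf

# num_chars, dim_chars, char_ids come from the vocabulary / input pipeline.
# Trainable character embedding matrix (random at first, learned during training)
char_embedding_matrix = tf.get_variable(
    'chars_embeddings', shape=[num_chars, dim_chars], dtype=tf.float32)

# [batch, max_words, max_chars, dim_chars]
char_embeddings = tf.nn.embedding_lookup(char_embedding_matrix, char_ids)

# A char-level BiLSTM (or CNN) then reduces each word's characters to a single
# vector; since every op above is differentiable, minimizing the CRF loss also
# updates char_embedding_matrix along with the rest of the network.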

Is the evaluation metric the same as the ones in the papers?

Hi,

Thanks for making a new version with tf.data. I was wondering whether the reported performance is evaluated with entity-level (i.e. span-level) P/R/F1. It looks like you are using token-level F1, which can differ from the mainstream span-level metric used in the papers.

Custom Entity Recognition

How can I train custom entity labels other than PER, LOC, ORG and MISC?

I need entities like "total amount" from a document.
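
A small illustration of what that could look like in this repo's data format (the label names below, such as B-AMOUNT, are made up): one tokenized sentence per line in {name}.words.txt, the aligned tags on the same line of {name}.tags.txt, and every label listed in vocab.tags.txt before rebuilding the vocab and retraining.

# train.words.txt (one tokenized sentence per line)
The total amount due is 120 dollars

# train.tags.txt (tags aligned token-for-token with the line above)
O B-AMOUNT I-AMOUNT O O B-MONEY I-MONEY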

chunking task

Hi, thanks for the great work! I have a question: does the code also work for chunking, or does it need extra effort to adapt it to a chunking task?
Thanks

Why do I get an estimator error? I did not change any code.

runfile('D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py', wdir='D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf')
Using config: {'_model_dir': 'results/model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001698906BD30>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Traceback (most recent call last):

File "", line 1, in
runfile('D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py', wdir='D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf')

File "D:\ComputerSoftwares\Anaconda\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)

File "D:\ComputerSoftwares\Anaconda\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "D:/coding/jupyter workplace/tf_ner-master/models/lstm_crf/main.py", line 173, in
estimator, 'f1', 500, min_steps=8000, run_every_secs=120)

File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 266, in stop_if_no_increase_hook
run_every_steps=run_every_steps)

File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 422, in _stop_if_no_metric_improvement_hook
run_every_steps=run_every_steps)

File "D:\ComputerSoftwares\Anaconda\lib\site-packages\tensorflow_estimator\contrib\estimator\python\estimator\early_stopping.py", line 88, in make_early_stopping_hook
'Got: {}'.format(type(estimator)))

TypeError: estimator must have type tf.estimator.Estimator. Got: <class 'tensorflow.python.estimator.estimator.Estimator'>

Prediction shows Inside before Begin or Without Begin for some predictions

@guillaumegenthial
The training dataset was in BIO (Begin-Inside-Out) format,
but for some of the predictions it shows I-LOC without B-LOC.
If I understand correctly, Inside should always come after Begin. Any thoughts on where it is going wrong?
Is there any way to control these transitions?

Same problem with the sequence_tagging implementation and this one.
Any suggestions on where to control these transitions?
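
As far as I can tell there is no hard constraint on transitions in this implementation; one common workaround (a sketch, not part of the repo) is to mask the learned CRF transition matrix before decoding so that I-X can only follow B-X or I-X.

import numpy as np

def constrain_transitions(tag_list, transition_params, penalty=-1e4):
    """tag_list: tag strings in id order; transition_params: [num_tags, num_tags]."""
    num_tags = len(tag_list)
    mask = np.zeros((num_tags, num_tags), dtype=np.float32)
    for i, prev_tag in enumerate(tag_list):
        for j, curr_tag in enumerate(tag_list):
            if curr_tag.startswith('I-'):
                entity = curr_tag[2:]
                if prev_tag not in ('B-' + entity, 'I-' + entity):
                    mask[i, j] = penalty  # effectively forbid prev_tag -> curr_tag
    return transition_params + mask

# The masked matrix can then replace the learned transition parameters in
# tf.contrib.crf.crf_decode (or viterbi_decode) at prediction time.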

Handling empty annotations neural NER

Consider the example,

Travel to dummylocation, whose BIO annotation would be O O B_LOC.

How can I make the neural network understand that the dummylocation content is a dummy placeholder?

Is it possible ?

Word Embedding

Is it possible to use a word embedding other than GloVe (for other languages)?
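
For reference, a rough sketch (loosely mirroring what build_glove.py does; custom_vectors.vec is a hypothetical file in word2vec/fastText text format) of producing the .npz file that the models load through params['glove']:

import numpy as np
from pathlib import Path

# Vocabulary built by build_vocab.py: one word per line, line number = word id
with Path('vocab.words.txt').open(encoding='utf-8') as f:
    word_to_idx = {line.strip(): idx for idx, line in enumerate(f)}

dim = 300  # must match the 'dim' entry in params
embeddings = np.zeros((len(word_to_idx), dim))

# 'custom_vectors.vec' holds lines of the form "word v1 v2 ... v300"
with Path('custom_vectors.vec').open(encoding='utf-8') as f:
    for line in f:
        parts = line.strip().split()
        word, vector = parts[0], parts[1:]
        if word in word_to_idx and len(vector) == dim:
            embeddings[word_to_idx[word]] = [float(v) for v in vector]

# The models read this file via np.load(...)['embeddings']
np.savez_compressed('glove.npz', embeddings=embeddings)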

couldn't get F1 91.21

Thanks for your clear code and instructions!
I ran the code on CoNLL-2003 without modifying any hyper-parameters or the structure; the F1 score on testb is about 90.6, which does not reach your reported 91.2 (for chars CNN + LSTM + CRF).
Also, LSTM + CRF gives 89.x to 90.0x, so I got slightly lower scores than your reported results.
I was wondering whether the default hyper-parameters need to be changed.
Could you provide some insight into this result?

What should be the `serving_input_receiver_fn` if we want to use `estimator.export_savedmodel`?

If we really want to use export_savedmodel(serving_input_receiver_fn=?, export_dir_base=?) with a custom estimator (ignoring the checkpoints) for production, what should it be? Here's an example of my script:

UPDATE:
#2

def serving_input_fn(hyperparameters=None):
    feature_spec = {
        'foo': tf.placeholder(dtype=tf.string, shape=[None, None]),
        'bar': tf.placeholder(tf.int32, [None]),
    }
    return tf.estimator.export.ServingInputReceiver(receiver_tensors=feature_spec,
                                                    features=feature_spec)

And I added a line in model_fn to convert the features from a dict to tensors.
This solution saves the model using estimator.export_savedmodel(export_dir_base='./Result/', serving_input_receiver_fn=serving_input_fn) (passing the function itself, not its return value).
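
For this repo specifically, a sketch of a receiver whose feature names match what model_fn expects once it converts the dict (words and nwords), assuming TF 1.x placeholders:

import tensorflow as tf

def serving_input_receiver_fn():
    # Placeholders for a padded batch of tokens and the true sentence lengths
    words = tf.placeholder(dtype=tf.string, shape=[None, None], name='words')
    nwords = tf.placeholder(dtype=tf.int32, shape=[None], name='nwords')
    receiver_tensors = {'words': words, 'nwords': nwords}
    features = {'words': words, 'nwords': nwords}
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

# estimator.export_savedmodel('saved_model', serving_input_receiver_fn)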

Why must "nwords" be used? It makes the models harder to serve for inputs from a database

Currently, the learned models require the nwords parameter, along with the main words input, when served with the TensorFlow server.

It would be very helpful if this nwords requirement could be dropped, so that only the words array has to be streamed directly from database sources to the TensorFlow server running the models.
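
One possible workaround (a sketch, not something the repo provides): compute nwords inside the serving receiver from the padded words tensor, so clients only send words. This assumes '<pad>' is the padding token.

import tensorflow as tf

def serving_input_receiver_fn():
    words = tf.placeholder(dtype=tf.string, shape=[None, None], name='words')
    # Count non-padding tokens per row so callers don't have to supply nwords
    nwords = tf.reduce_sum(tf.cast(tf.not_equal(words, '<pad>'), tf.int32), axis=-1)
    receiver_tensors = {'words': words}            # what clients send
    features = {'words': words, 'nwords': nwords}  # what model_fn receives
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)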

What does it mean? - Results score

(screenshot: seleccion_024)

Well, as you can see, for "Francisco" we have I-LOC B-LOC. What does that mean? What is the meaning of I-LOC and B-LOC here? I don't know whether I-LOC is the correct tag and B-LOC the predicted tag. Another example is Moscow B-LOC B-LOC.

Thank you!

The pred_ids of `<pad>` is always zero

When estimator.predict() is used, the pred_ids of <pad> is always zero.
However, if serve.py is used to predict one sample, the pred_ids of <pad> has the correct value.
Can you or someone explain this? Thanks.

Load in Android Studio

Hi, how can I load this model in Android Studio? It always shows this error: "buffer with XX elements is not compatible with a Tensor with shape [0, 0, 0]".

Huge error on Prediction (NER)

I created my own vocabulary and tags, and I ran the code with these parameters:

{
"batch_size": 2,
"buffer": 15000,
"chars": "vocab.chars.txt",
"dim": 300,
"dim_chars": 100,
"dropout": 0.3,
"epochs": 25,
"filters": 50,
"glove": "glove.npz",
"kernel_size": 3,
"lstm_size": 100,
"num_oov_buckets": 1,
"tags": "vocab.tags.txt",
"words": "vocab.words.txt"
}

After 8 hours of training I got these results:
Saving dict for global step 9534: acc = 0.96229786, f1 = 0.8532131, global_step = 9534, loss = 36.943344, precision = 0.8494137$
Saving 'checkpoint_path' summary for global step 9534: results/model/model.ckpt-9534

But when I try to predict on the same test dataset, the model does not predict the entities. For example, in the test set we have Bora B-NAME B-NAME, but when I pass the same sentence, the algorithm does not predict that entity. The problem is that the algorithm reports an accuracy of 0.962297 and an F1 of 0.8532131, but in reality it detects only 38% of the entities, so the real score is not 0.85 but only 0.38.

What could be the problem?
Do you have an idea?

Thank you!

Could you provide the training data?

As you said, "Note that the example dataset is here for debugging purposes only and won't be of much use to train an actual model". However, I really need your training data to use the NER model.

Although I can get them from datasets like CoNLL, I think it would be helpful (or at least time-saving), not just for me, if you shared datasets that can be used directly with your code.

InvalidArgumentError

I am trying to use the model on custom data.

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [106,300] rhs shape= [23,300]
[[node save/Assign (defined at /home/anujc/.conda/envs/py37/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:1403) ]]

Can you please guide me?

Correct export.py line 41

Hi Guillaume,

This is an amazing repo - great work!

I found a minor error:
estimator.export_saved_model('saved_model', serving_input_receiver_fn) -> estimator.export_savedmodel

Keep up the great work! I'm a fan

RuntimeError: There was no new checkpoint after the training. Eval status: missing checkpoint

While running this model on the CoNLL-2003 data (TF 1.11), I am getting this error:

global_step/sec: 3.00248
loss = 5.6152534, step = 101 (33.300 sec)
global_step/sec: 2.88663
loss = 4.2953496, step = 201 (34.641 sec)
global_step/sec: 3.08292
loss = 3.5392659, step = 301 (32.438 sec)
Saving checkpoints for 344 into results\model\model.ckpt.
Estimator is not trained yet. Will start an evaluation when a checkpoint is ready.

RuntimeError: There was no new checkpoint after the training. Eval status: missing checkpoint

Could you please paste a picture of your model weights directory?

thanks

AttributeError: 'Estimator' object has no attribute 'eval_dir'

/Users/me/tf_ner/venv/bin/python /Users/bxr8516/tf_ner/models/lstm_crf/main.py
Using config: {'_model_dir': 'results/model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1183a3400>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Traceback (most recent call last):
File "/Users/me/tf_ner/models/lstm_crf/main.py", line 166, in
Path(estimator.eval_dir()).mkdir(parents=True, exist_ok=True)
AttributeError: 'Estimator' object has no attribute 'eval_dir'

Process finished with exit code 1

I am getting the above error; I have installed tensorflow==1.6.

Error while executing "tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)"

Running training and evaluation locally (non-distributed).
Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 120.
Calling model_fn.

TypeError                                 Traceback (most recent call last)
----> 1 tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.


Did anyone get this error?
PS: Using TF 1.11

Label probabilities for the CRF layer

Hi,
Thanks for sharing this great implementation.
I know it is possible to get the label probabilities using the forward-backward algorithm in CRFs. I am having some difficulty implementing/modifying the default CRF implementation in TensorFlow: for the calculation of the partition function, it only uses the forward (message-passing) pass. Do you have any experience or ideas about how the forward-backward algorithm could be implemented in TF?
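
One possible shortcut (a sketch, not a reviewed implementation): the forward-backward marginals are equal to the gradient of the log partition function with respect to the unary scores, so tf.gradients applied to tf.contrib.crf.crf_log_norm yields p(y_t = k | x) without hand-writing the backward recursion.

import tensorflow as tf

def crf_marginals(unary_scores, sequence_lengths, transition_params):
    """unary_scores: [batch, max_len, num_tags]; returns marginals of the same shape."""
    log_norm = tf.contrib.crf.crf_log_norm(unary_scores, sequence_lengths,
                                           transition_params)
    # d log Z / d unary[b, t, k] = p(y_t = k | x_b)
    return tf.gradients(tf.reduce_sum(log_norm), unary_scores)[0]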

The predicted entity does not end with an 'E' tag

I got a prediction like [["1", "B_t_h"], ["9", "M_t_h"], ["4", "M_t_h"], ["0", "M_t_h"]]; the tag of the number 0 should be E_t_h. Is there any explanation for this problem, and how can I fix the prediction?

I also have the problem that almost all predictions come out as 't_h'; is there a good way to adjust this?

Looking forward to your reply, thanks!

I got a terrible result when I used the lstm_crf code for NER on top of a BERT model; did you?

I got a terrible result when I used the lstm_crf code to do the NER task on top of a BERT model: the loss got stuck at 20 with batch_size=32, max_sequence_len=50 and num_tags=5. However, when I used another implementation based on the same approach, it worked well; the main difference is that that code uses a basic LSTMCell. I cannot understand why. Could you help me? By the way, I have trained my dataset with your lstm_crf code and it worked well.

UPDATE: I have solved this problem, thank you.

OutOfRangeError (see above for traceback): End of sequence

I am trying to run the code without making any changes and am hitting the error below (from chars_conv_lstm_crf/main.py):

OutOfRangeError (see above for traceback): End of sequence
[[{{node IteratorGetNext}} = IteratorGetNext_class=["loc:@cond/sub/Switch"], output_shapes=[[?,?], [?], [?,?,?], [?,?], [?,?]], output_types=[DT_STRING, DT_INT32, DT_STRING, DT_INT32, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

TF version : 1.11.0rc2

Do you know why this error appears?

thanks

Pre-trained models

Thanks for this amazing work!
Recently FLAIR (PyTorch) has released state-of-the-art models for NER in several languages. It would be great if you provided pre-trained models here as well!
Thank you

InvalidArgument `labels` contains negative values

What could the problem be?
Has anyone seen something like this?

Saving checkpoints for 0 into results/model/model.ckpt.

Traceback (most recent call last): . . . .

InvalidArgumentError (see above for traceback): assertion failed: [`labels` contains negative values] [Condition x >= 0 did not hold element-wise:] [x (Reshape_5:0) = ] [8 12 -1...]
	 [[{{node confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert}} = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_0, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_1, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_2, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch_1)]]
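
One frequent cause (a hedged guess, not a confirmed diagnosis): a tag appears in the data but is missing from vocab.tags.txt, so the lookup table maps it to -1 and the confusion-matrix assertion fires. A quick check, assuming the repo's {train,testa,testb}.tags.txt naming:

from pathlib import Path

vocab_tags = set(Path('vocab.tags.txt').read_text(encoding='utf-8').split())
data_tags = set()
for name in ['train', 'testa', 'testb']:
    for line in Path('{}.tags.txt'.format(name)).read_text(encoding='utf-8').splitlines():
        data_tags.update(line.split())

print('Tags missing from vocab.tags.txt:', data_tags - vocab_tags)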
