kamalkraj / BERT-NER-TF
Named Entity Recognition with BERT using TensorFlow 2.0
License: Apache License 2.0
Hi Kamal,
Is it possible to load the ALBERT TF 1.0 model found here - https://github.com/google-research/google-research/tree/master/albert - for NER training?
Hello Kamal,
Thank you for contributing this code. I am wondering if it is possible to train or predict with sequence length = 512?
In my opinion, this package shouldn't contain an HTTP API, because it uses a very opinionated framework to expose the model. Let users decide which HTTP framework they want to use.
This HTTP API isn't configurable: for example, the Ner("out_base") parameter, the exposed route, CORS, the HTTP port, the response payload, and the request object cannot be changed without modifying the package.
For example, I don't want to use Flask as my framework or install any Flask-related dependencies, so please trim the Python requirements and get rid of api.py.
Hello,
I am using Tensorflow 2.9.1
Please help me to resolve this issue
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 513, in
main()
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 449, in main
ner.load_weights(os.path.join(args.output_dir,"model.h5"))
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\saving\hdf5_format.py", line 728, in load_weights_from_hdf5_group
raise ValueError(
ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.
Why did the author choose n+1 labels instead of n labels?
In computer vision, I understand that a model may be trained on n+1 classes (the extra class being background) while there are only n real labels. Is the same idea followed here, or is there another reason?
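For illustration only, here is a sketch of the convention many BERT-NER implementations follow (I am assuming, not confirming, that this repository does the same): label index 0 is reserved for padded positions, so the model has n+1 output classes for n real labels.

```python
# Hypothetical label map, assuming the common BERT-NER convention of
# reserving index 0 for padding so that real labels start at 1.
labels = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

# n real labels, but the classifier is built with n+1 outputs:
label_map = {label: i for i, label in enumerate(labels, start=1)}
num_classes = len(labels) + 1  # index 0 acts as the padding "label"

print(label_map["O"])   # 1, not 0
print(num_classes)      # 10
```

Under this convention the loss can simply mask out positions whose label id is 0, which is why the extra class exists even though it never corresponds to a real tag.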
Thank you so much for your effort,
How can I concatenate the response values?
For example:
[
  {"confidence": 0.9981840252876282, "tag": "B-PER", "word": "Steve"},
  {"confidence": 0.9998939037322998, "tag": "B-PER", "word": "jobs"},
  {"confidence": 0.9981840252876282, "tag": "B-PER", "word": "Steve"},
  {"confidence": 0.9998939037322998, "tag": "B-PER", "word": "Wozniak"},
  {"confidence": 0.9998939037322998, "tag": "O", "word": "from"},
  {"confidence": 0.999891996383667, "tag": "B-LOC", "word": "United"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "States"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "Of"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "America"}
]
to this:
{"B-PER": ["Steve Jobs", "Steve Wozniak"], "B-LOC": "United States Of America"}
Thanks
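One way to post-process such a response is a sketch like the one below, assuming standard BIO tags (B- starts an entity, I- continues it). Note that the response above tags every token with B-; with only B- tags, two adjacent entities of the same type cannot be told apart, so continuation tokens would ideally carry I-PER/I-LOC. The `entries` list here is illustrative, not the API's actual output.

```python
def merge_entities(entries):
    """Merge BIO-tagged tokens into {entity_type: [entity_strings]}."""
    merged = {}
    cur_type, cur_words = None, []

    def flush():
        if cur_type is not None:
            merged.setdefault(cur_type, []).append(" ".join(cur_words))

    for e in entries:
        tag, word = e["tag"], e["word"]
        if tag.startswith("B-"):
            flush()                       # a B- tag always starts a new entity
            cur_type, cur_words = tag[2:], [word]
        elif tag.startswith("I-") and tag[2:] == cur_type:
            cur_words.append(word)        # continuation of the current entity
        else:                             # "O" or an inconsistent I- tag
            flush()
            cur_type, cur_words = None, []
    flush()
    return merged

entries = [
    {"tag": "B-PER", "word": "Steve"}, {"tag": "I-PER", "word": "Jobs"},
    {"tag": "B-PER", "word": "Steve"}, {"tag": "I-PER", "word": "Wozniak"},
    {"tag": "O", "word": "from"},
    {"tag": "B-LOC", "word": "United"}, {"tag": "I-LOC", "word": "States"},
    {"tag": "I-LOC", "word": "Of"}, {"tag": "I-LOC", "word": "America"},
]
print(merge_entities(entries))
# {'PER': ['Steve Jobs', 'Steve Wozniak'], 'LOC': ['United States Of America']}
```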
While restoring the pretrained model in tensorflow==2.2.0,
checkpoint.restore(init_checkpoint).assert_existing_objects_matched() throws an error during NER training.
Hi @kamalkraj, nice work. But while using api.py and curl to get a prediction, I got this error:
float32 is not JSON serializable
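That error usually means the response dict still contains NumPy scalars (model confidences are typically np.float32, which the stdlib json module cannot encode). A minimal sketch of the usual fix; the payload shown is hypothetical, not the repository's actual response object:

```python
import json
import numpy as np

# A NumPy float32, as a model's softmax confidence typically is
confidence = np.float32(0.99818)

# json.dumps(confidence) would raise:
#   TypeError: Object of type float32 is not JSON serializable

# Fix 1: cast explicitly before building the response
payload = {"confidence": float(confidence), "tag": "B-PER", "word": "Steve"}
print(json.dumps(payload))

# Fix 2: a default= hook that converts any NumPy scalar on the fly
print(json.dumps({"confidence": confidence}, default=lambda o: o.item()))
```

(np.float64 happens to subclass Python's float and serializes fine, which is why the problem only shows up with float32.)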
In order to reproduce the CoNLL score reported in the BERT paper (92.4 for BERT-base and 92.8 for BERT-large), one trick is to apply a truecaser to article titles (all-uppercase sentences) as a preprocessing step for the CoNLL train/dev/test sets. This can be done with the following method.

# https://github.com/daltonfury42/truecase
# pip install truecase
import re
import truecase

def truecase_sentence(tokens):
    # e.g. ['FULL', 'FEES', '1.875', 'REOFFER', '99.32', 'SPREAD', '+20', 'BP']
    # ->   ['Full', 'fees', '1.875', 'Reoffer', '99.32', 'spread', '+20', 'BP']
    word_lst = [(w, idx) for idx, w in enumerate(tokens) if all(c.isalpha() for c in w)]
    lst = [w for w, _ in word_lst if re.match(r'\b[A-Z\.\-]+\b', w)]
    if len(lst) and len(lst) == len(word_lst):
        parts = truecase.get_true_case(' '.join(lst)).split()
        # the truecaser has its own tokenization;
        # skip if the number of words doesn't match
        if len(parts) != len(word_lst):
            return tokens
        for (w, idx), nw in zip(word_lst, parts):
            tokens[idx] = nw
    return tokens

Also, I found it useful to use a very small learning rate (5e-6), a large batch size (128), and a high epoch count (>40).
With these configurations and preprocessing, I was able to reach 92.8 with bert-large.
Hey, great work!
Question: Is it possible to run this project with an AMD GPU, or at least on a CPU? I would be glad for any help or hints.
Your code uses models with tf2 pre-trained checkpoints. However, tf2 checkpoints are only available for a few BERT models: https://github.com/tensorflow/models/tree/master/official/nlp/bert
What about languages that are only available as tf1 BERT checkpoints? Do you have any advice?
If I use the tf1 checkpoints, I receive errors when loading the model:
init_checkpoint = 'the pretrained model checkpoint path.'
model = tf.keras.Model()  # BERT pre-trained model as feature extractor.
checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(init_checkpoint)
I have tried to train a model with the Multilingual Cased TensorFlow model, but there is an AssertionError when the checkpoints are loaded. I was wondering if this code supports the multi-cased model.
Does this code fine-tune BERT, or does it only use BERT for embeddings?
I tested the model on different dates (Saturday, November 12, etc.) and they were not classified as dates. I ran both large and base BERT. Any ideas why?
Dear kamalkraj,
First and foremost, thank you for your appreciated efforts in making this possible. I have a question that reads as follows:
I have fine-tuned a BERT NER model after converting a previously pre-trained model using tf1_to_keras_checkpoint_converter.py followed by tf2_checkpoint_converter.py. [ https://github.com/tensorflow/models/releases/tag/v2.0 ]
The Question:
When I try to perform predictions on new data (test) using run_ner.py and omit the --do_train option, the program only performs predictions on the first 128 tokens. Is there a way to perform predictions on a file that has more than 128 tokens?
Thank you so much in advance.
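The repository itself doesn't appear to handle this, but a common workaround is to split long token sequences into overlapping windows of at most max_seq_length - 2 tokens (leaving room for [CLS] and [SEP]), predict each window separately, and stitch the results back together. A minimal sketch, with illustrative window and stride values:

```python
def split_into_windows(tokens, max_len=126, stride=64):
    """Split a long token list into overlapping windows of at most max_len tokens.

    max_len=126 leaves room for [CLS]/[SEP] when max_seq_length is 128; each
    window starts `stride` tokens after the previous one, so an entity cut at
    one window boundary appears whole in the next window.
    """
    if len(tokens) <= max_len:
        return [tokens]
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the sequence
        start += stride
    return windows

tokens = [f"tok{i}" for i in range(300)]
wins = split_into_windows(tokens)
print(len(wins), [len(w) for w in wins])
# 4 [126, 126, 126, 108]
```

When merging predictions back, one simple policy is to keep each token's label from the window where it sits furthest from a boundary.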
Following the readme, I ran:
python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev
and I get this error. This is blocking.
Traceback (most recent call last):
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 512, in
main()
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 380, in main
np.asarray([f.label_id for f in eval_features],dtype=np.int32))
UnboundLocalError: local variable 'eval_features' referenced before assignment
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StridedSliceGrad]
When sending a slightly longer text to the API, it responds with:
Expected size[0] in [0, 512], but got 1144 [Op:Slice]
This means you need to split texts so that they stay under roughly 500 tokens, which is not optimal.
Hi Kamal, I very much like your work and your model, and that you made it public.
I just have a request:
Did I make any mistake in the Colab setup, or do I need to change the run_ner.py file?
Regards,
Niranjan