kamalkraj / bert-ner-tf

Named Entity Recognition with BERT using TensorFlow 2.0

License: Apache License 2.0

Topics: bert, named-entity-recognition, tensorflow, tensorflow-2, tensorflow-2-example, bert-ner, inference, curl, postman, pretrained-models

bert-ner-tf's Introduction

BERT NER

Use Google BERT to do CoNLL-2003 NER!

Train the model using Python and TensorFlow 2.0.

Related projects:

  • ALBERT-TF2.0
  • BERT-SQuAD
  • BERT-NER-Pytorch

Requirements

  • python3
  • pip3 install -r requirements.txt

Download Pretrained Models from TensorFlow official models

The code for the pre-trained BERT comes from tensorflow-official-models.

Run

Single GPU

python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev

Multi GPU

python run_ner.py --data_dir=data/ --bert_model=bert-large-cased --output_dir=out_large --max_seq_length=128 --do_train --num_train_epochs 3 --multi_gpu --gpus 0,1,2,3 --do_eval --eval_on test

Result

BERT-BASE

Validation Data

             precision    recall  f1-score   support

        PER     0.9677    0.9756    0.9716      1842
        LOC     0.9671    0.9592    0.9631      1837
       MISC     0.8872    0.9132    0.9001       922
        ORG     0.9191    0.9314    0.9252      1341

avg / total     0.9440    0.9509    0.9474      5942

Test Data

             precision    recall  f1-score   support

        ORG     0.8773    0.9037    0.8903      1661
        PER     0.9646    0.9592    0.9619      1617
       MISC     0.7691    0.8305    0.7986       702
        LOC     0.9333    0.9305    0.9319      1668

avg / total     0.9053    0.9184    0.9117      5648

The pretrained model can be downloaded from here.

BERT-LARGE

Validation Data

             precision    recall  f1-score   support

        ORG     0.9290    0.9374    0.9332      1341
       MISC     0.8967    0.9230    0.9097       922
        PER     0.9713    0.9734    0.9723      1842
        LOC     0.9748    0.9701    0.9724      1837

avg / total     0.9513    0.9564    0.9538      5942

Test Data

             precision    recall  f1-score   support

        LOC     0.9256    0.9329    0.9292      1668
       MISC     0.7891    0.8419    0.8146       702
        PER     0.9647    0.9623    0.9635      1617
        ORG     0.8903    0.9133    0.9016      1661

avg / total     0.9094    0.9242    0.9167      5648

The pretrained model can be downloaded from here.

Inference

from bert import Ner

model = Ner("out_base/")

output = model.predict("Steve went to Paris")

print(output)
'''
    [
        {
            "confidence": 0.9981840252876282,
            "tag": "B-PER",
            "word": "Steve"
        },
        {
            "confidence": 0.9998939037322998,
            "tag": "O",
            "word": "went"
        },
        {
            "confidence": 0.999891996383667,
            "tag": "O",
            "word": "to"
        },
        {
            "confidence": 0.9991968274116516,
            "tag": "B-LOC",
            "word": "Paris"
        }
    ]
'''

Deploy REST-API

The BERT NER model deployed as a REST API.

python api.py

The API will be live at 0.0.0.0:8000 with the endpoint /predict.

cURL request

curl -X POST http://0.0.0.0:8000/predict -H 'Content-Type: application/json' -d '{ "text": "Steve went to Paris" }'

Output

{
    "result": [
        {
            "confidence": 0.9981840252876282,
            "tag": "B-PER",
            "word": "Steve"
        },
        {
            "confidence": 0.9998939037322998,
            "tag": "O",
            "word": "went"
        },
        {
            "confidence": 0.999891996383667,
            "tag": "O",
            "word": "to"
        },
        {
            "confidence": 0.9991968274116516,
            "tag": "B-LOC",
            "word": "Paris"
        }
    ]
}

cURL

(curl output screenshot)

Postman

(postman output screenshot)

PyTorch version

bert-ner-tf's People

Contributors

kamalkraj


bert-ner-tf's Issues

Reproducing the CoNLL results

In order to reproduce the CoNLL score reported in the BERT paper (92.4 for BERT-base and 92.8 for BERT-large), one trick is to apply a truecaser on article titles (all-uppercase sentences) as a preprocessing step for the CoNLL train/dev/test sets. This can be done simply with the following method.

# https://github.com/daltonfury42/truecase
# pip install truecase
import truecase
import re

# original tokens
# ['FULL', 'FEES', '1.875', 'REOFFER', '99.32', 'SPREAD', '+20', 'BP']

def truecase_sentence(tokens):
    # alphabetic words together with their positions in the sentence
    word_lst = [(w, idx) for idx, w in enumerate(tokens) if all(c.isalpha() for c in w)]
    lst = [w for w, _ in word_lst if re.match(r'\b[A-Z\.\-]+\b', w)]

    # only truecase sentences that are entirely upper-case
    if len(lst) and len(lst) == len(word_lst):
        parts = truecase.get_true_case(' '.join(lst)).split()

        # the truecaser has its own tokenization ...
        # skip if the number of words doesn't match
        if len(parts) != len(word_lst):
            return tokens

        for (w, idx), nw in zip(word_lst, parts):
            tokens[idx] = nw

    return tokens

# truecased tokens
# ['Full', 'fees', '1.875', 'Reoffer', '99.32', 'spread', '+20', 'BP']

Also, I found it useful to use a very small learning rate (5e-6), a large batch size (128), and a high number of epochs (>40).

With these configurations and preprocessing, I was able to reach 92.8 with BERT-large.
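
For reference, these settings as command-line flags (a sketch; --learning_rate and --train_batch_size are assumptions based on common BERT fine-tuning scripts, since only --num_train_epochs appears in this README's Run examples):

python run_ner.py --data_dir=data/ --bert_model=bert-large-cased --output_dir=out_large --max_seq_length=128 --do_train --num_train_epochs 40 --learning_rate 5e-6 --train_batch_size 128 --do_eval --eval_on test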

How to run predictions after fine-tuning

Dear kamalkraj,
This is first and foremost to thank you for your appreciated efforts in making this possible. I have a question:
I have fine-tuned a BERT NER model after converting a previously pre-trained model using tf1_to_keras_checkpoint_converter.py followed by tf2_checkpoint_converter.py. [ https://github.com/tensorflow/models/releases/tag/v2.0 ]
The question:
When I try to perform predictions on new (test) data using run_ner.py and omit the --do_train option, the program only performs predictions on the first 128 tokens. Is there a way I can perform predictions on a file that has more than 128 tokens?

Thank you so much in advance.
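
One workaround is to window the input yourself. A minimal sketch, assuming the Ner class shown in the Inference section above (the window size and helper name are hypothetical, and whitespace words only approximate WordPiece token counts):

from bert import Ner

model = Ner("out_base/")

def predict_long(words, window=120):
    # run the model over fixed-size windows and concatenate the results;
    # keep each window safely under max_seq_length after WordPiece splitting
    results = []
    for i in range(0, len(words), window):
        results.extend(model.predict(" ".join(words[i:i + window])))
    return results

output = predict_long(open("test.txt").read().split())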

The API does not accept very long texts

When sending a somewhat longer text to the API, it responds with:

Expected size[0] in [0, 512], but got 1144 [Op:Slice]

This means you need to split texts so that they are under 500 tokens long, which is not optimal; see the sketch below.
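
A client-side workaround is to chunk the text before calling the API. A minimal sketch, assuming the /predict endpoint shown above (the chunk size and function name are hypothetical, and whitespace words only approximate the model's subword token count):

import requests

def predict_in_chunks(text, max_words=300):
    # send the text in pieces small enough to stay under the 512-token limit
    words = text.split()
    results = []
    for i in range(0, len(words), max_words):
        chunk = " ".join(words[i:i + max_words])
        r = requests.post("http://0.0.0.0:8000/predict", json={"text": chunk})
        results.extend(r.json()["result"])
    return results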

Why n+1 labels?

Why did the author choose n+1 labels instead of n labels?

In computer vision, I understand that the model is trained on n+1 classes (the extra class being background) while the number of labels is only n. Is the same criterion followed here, or is there another reason?
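
One common reason, and possibly the one here (an illustration only; this is an assumption, not confirmed from the source): many BERT NER implementations reserve label id 0 for padded positions, so the classifier head needs the n real tags plus one padding class:

# hypothetical illustration: ids 1..n are real tags, id 0 is reserved for padding
labels = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]
label_map = {label: i for i, label in enumerate(labels, 1)}  # ids start at 1
num_labels = len(labels) + 1  # n real tags + 1 padding class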

Advice on models without TF2 pretrained checkpoints

Your code uses models with TF2 pre-trained checkpoints. However, TF2 checkpoints are only available for a few BERT models: https://github.com/tensorflow/models/tree/master/official/nlp/bert

What about languages that are only available as TF1 BERT checkpoints? Do you have any advice?

If I use the TF1 checkpoints, I receive errors when loading the model.

import tensorflow as tf

init_checkpoint = 'the pretrained model checkpoint path.'
model = tf.keras.Model()  # BERT pre-trained model as feature extractor.
checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(init_checkpoint)

TensorFlow Serving request

Hi Kamal, I very much like your work and model, and appreciate you making them public.
I just have a request:

  1. Could you please add TensorFlow Serving to this model for deployment?
  2. I ran it against Google Colab and got the error below:
    RuntimeError: Error copying tensor to device: /job:localhost/replica:0/task:0/device:GPU:0. /job:localhost/replica:0/task:0/device:GPU:0 unknown device.

Did I make any mistake in the Colab settings, or do I need to change the run_ner.py file?

Regards,
Niranjan

How to concatenate the response values

Thank you so much for your effort.
How can I concatenate the response values?
For example, turn this:
[
    {
        "confidence": 0.9981840252876282,
        "tag": "B-PER",
        "word": "Steve"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "B-PER",
        "word": "jobs"
    },
    {
        "confidence": 0.9981840252876282,
        "tag": "B-PER",
        "word": "Steve"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "B-PER",
        "word": "Wozniak"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "O",
        "word": "from"
    },
    {
        "confidence": 0.999891996383667,
        "tag": "B-LOC",
        "word": "United"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "States"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "Of"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "America"
    }
]

into this:

{"B-PER": ["Steve Jobs", "Steve Wozniak"], "B-LOC": ["United States Of America"]}

Thanks
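
One way to do this is to merge consecutive words that carry the same non-O tag. A minimal sketch (the grouping rule is an assumption; note that since every entity word above is tagged B-PER rather than BIO-style B-/I-, adjacent entities of the same type, like the two names here, cannot be separated from the tags alone):

def group_entities(predictions):
    # predictions: list of {"word", "tag", "confidence"} dicts as returned by the API
    grouped = {}
    run_tag, run_words = None, []
    for p in predictions:
        if p["tag"] == run_tag and run_tag != "O":
            # same tag as the previous word: extend the current entity
            run_words.append(p["word"])
            continue
        if run_tag and run_tag != "O":
            # tag changed: flush the finished entity
            grouped.setdefault(run_tag, []).append(" ".join(run_words))
        run_tag, run_words = p["tag"], [p["word"]]
    if run_tag and run_tag != "O":
        grouped.setdefault(run_tag, []).append(" ".join(run_words))
    return grouped

# e.g. {"B-PER": ["Steve jobs Steve Wozniak"], "B-LOC": ["United States Of America"]}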

ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

Hello,

I am using TensorFlow 2.9.1.

Please help me to resolve this issue:

File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 513, in <module>
    main()
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 449, in main
    ner.load_weights(os.path.join(args.output_dir, "model.h5"))
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\saving\hdf5_format.py", line 728, in load_weights_from_hdf5_group
    raise ValueError(
ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

AMD GPU or CPU support?

Hey, great work!
Question: is it possible to run this project with an AMD GPU, or at least with a CPU? I would be glad for any help or hints.

Dates not recognized

I tested the model on different dates (Saturday, November 12, etc.) and they were not classified as dates. I ran both the large and base BERT models. Any ideas why?

UnboundLocalError: local variable 'eval_features' referenced before assignment

Following the README, I ran:
python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev

I get this error, which is blocking:

Traceback (most recent call last):
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 512, in <module>
    main()
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 380, in main
    np.asarray([f.label_id for f in eval_features], dtype=np.int32))
UnboundLocalError: local variable 'eval_features' referenced before assignment

Opinionated HTTP server

In my point of view, this package shouldn't contain an HTTP API, because you are using a very opinionated framework to expose your model via an API. Let the users decide what they want to use as an HTTP framework.

This HTTP API isn't configurable: for example, the Ner("out_base") parameter, the exposed route, CORS, the HTTP port, the response payload, and the request object cannot be changed without package changes.

For example, I don't want to use Flask as a framework or install any Flask-related dependencies, so please modify the Python requirements and get rid of api.py.

Support for Multi Cased Model

I have tried to train a model with the Multi Cased TensorFlow model, but there is an AssertionError when the checkpoints are loaded. I was wondering whether this code supports the Multi Cased model.

ResourceExhaustedError

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StridedSliceGrad]
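
A common mitigation is to shrink the per-step memory footprint. A sketch (only the flags shown in the Run section are confirmed by this README; --train_batch_size is an assumption based on similar BERT fine-tuning scripts):

python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=64 --train_batch_size 8 --do_train --num_train_epochs 3 --do_eval --eval_on dev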
