kamalkraj / bert-ner-tf

Named Entity Recognition with BERT using TensorFlow 2.0

License: Apache License 2.0

Topics: bert, named-entity-recognition, tensorflow, tensorflow-2, tensorflow-2-example, bert-ner, inference, curl, postman, pretrained-models

bert-ner-tf's Introduction

BERT NER

Use Google BERT to do CoNLL-2003 NER!

Train the model using Python and TensorFlow 2.0.

Related projects:

  • ALBERT-TF2.0
  • BERT-SQuAD
  • BERT-NER-Pytorch

Requirements

  • python3
  • pip3 install -r requirements.txt

Download Pretrained Models from TensorFlow official models

The code for the pre-trained BERT comes from tensorflow-official-models.

Run

Single GPU

python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev

Multi GPU

python run_ner.py --data_dir=data/ --bert_model=bert-large-cased --output_dir=out_large --max_seq_length=128 --do_train --num_train_epochs 3 --multi_gpu --gpus 0,1,2,3 --do_eval --eval_on test

Result

BERT-BASE

Validation Data

             precision    recall  f1-score   support

        PER     0.9677    0.9756    0.9716      1842
        LOC     0.9671    0.9592    0.9631      1837
       MISC     0.8872    0.9132    0.9001       922
        ORG     0.9191    0.9314    0.9252      1341

avg / total     0.9440    0.9509    0.9474      5942

Test Data

             precision    recall  f1-score   support

        ORG     0.8773    0.9037    0.8903      1661
        PER     0.9646    0.9592    0.9619      1617
       MISC     0.7691    0.8305    0.7986       702
        LOC     0.9333    0.9305    0.9319      1668

avg / total     0.9053    0.9184    0.9117      5648

The pretrained model can be downloaded from here.

BERT-LARGE

Validation Data

             precision    recall  f1-score   support

        ORG     0.9290    0.9374    0.9332      1341
       MISC     0.8967    0.9230    0.9097       922
        PER     0.9713    0.9734    0.9723      1842
        LOC     0.9748    0.9701    0.9724      1837

avg / total     0.9513    0.9564    0.9538      5942

Test Data

             precision    recall  f1-score   support

        LOC     0.9256    0.9329    0.9292      1668
       MISC     0.7891    0.8419    0.8146       702
        PER     0.9647    0.9623    0.9635      1617
        ORG     0.8903    0.9133    0.9016      1661

avg / total     0.9094    0.9242    0.9167      5648

The pretrained model can be downloaded from here.

Inference

from bert import Ner

model = Ner("out_base/")

output = model.predict("Steve went to Paris")

print(output)
'''
    [
        {
            "confidence": 0.9981840252876282,
            "tag": "B-PER",
            "word": "Steve"
        },
        {
            "confidence": 0.9998939037322998,
            "tag": "O",
            "word": "went"
        },
        {
            "confidence": 0.999891996383667,
            "tag": "O",
            "word": "to"
        },
        {
            "confidence": 0.9991968274116516,
            "tag": "B-LOC",
            "word": "Paris"
        }
    ]
'''

Deploy REST-API

The BERT NER model deployed as a REST API.

python api.py

The API will be live at 0.0.0.0:8000 with the endpoint /predict.

cURL request

curl -X POST http://0.0.0.0:8000/predict -H 'Content-Type: application/json' -d '{ "text": "Steve went to Paris" }'

Output

{
    "result": [
        {
            "confidence": 0.9981840252876282,
            "tag": "B-PER",
            "word": "Steve"
        },
        {
            "confidence": 0.9998939037322998,
            "tag": "O",
            "word": "went"
        },
        {
            "confidence": 0.999891996383667,
            "tag": "O",
            "word": "to"
        },
        {
            "confidence": 0.9991968274116516,
            "tag": "B-LOC",
            "word": "Paris"
        }
    ]
}

cURL

(curl output screenshot)

Postman

(postman output screenshot)

PyTorch version

bert-ner-tf's People

Contributors

kamalkraj


bert-ner-tf's Issues

Reproducing the CoNLL results

In order to reproduce the CoNLL score reported in the BERT paper (92.4 for BERT-base and 92.8 for BERT-large), one trick is to apply a truecaser on article titles (all-uppercase sentences) as a preprocessing step for the CoNLL train/dev/test sets. This can be done simply with the following method.

# https://github.com/daltonfury42/truecase
# pip install truecase
import truecase
import re

# original tokens
# ['FULL', 'FEES', '1.875', 'REOFFER', '99.32', 'SPREAD', '+20', 'BP']

def truecase_sentence(tokens):
    # alphabetic words together with their positions in the sentence
    word_lst = [(w, idx) for idx, w in enumerate(tokens) if all(c.isalpha() for c in w)]
    lst = [w for w, _ in word_lst if re.match(r'\b[A-Z\.\-]+\b', w)]

    # only truecase sentences that are entirely upper-case
    if len(lst) and len(lst) == len(word_lst):
        parts = truecase.get_true_case(' '.join(lst)).split()

        # the truecaser has its own tokenization ...
        # skip if the number of words doesn't match
        if len(parts) != len(word_lst):
            return tokens

        for (w, idx), nw in zip(word_lst, parts):
            tokens[idx] = nw

    return tokens

# truecased tokens
# ['Full', 'fees', '1.875', 'Reoffer', '99.32', 'spread', '+20', 'BP']

Also, I found it useful to use a very small learning rate (5e-6), a large batch size (128), and a high number of epochs (>40).

With these configurations and preprocessing, I was able to reach 92.8 with BERT-large.
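
For reference, these settings as command-line flags (a sketch; --learning_rate and --train_batch_size are assumptions based on common BERT fine-tuning scripts, since only --num_train_epochs appears in this README's Run examples):

python run_ner.py --data_dir=data/ --bert_model=bert-large-cased --output_dir=out_large --max_seq_length=128 --do_train --num_train_epochs 40 --learning_rate 5e-6 --train_batch_size 128 --do_eval --eval_on test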

How to run predictions after fine-tuning

Dear kamalkraj,
This is first and foremost to thank you for your appreciated efforts in making this possible. I have a question:
I have fine-tuned a BERT NER model after converting a previously pre-trained model using tf1_to_keras_checkpoint_converter.py followed by tf2_checkpoint_converter.py. [ https://github.com/tensorflow/models/releases/tag/v2.0 ]
The question:
When I try to perform predictions on new (test) data using run_ner.py and omit the --do_train option, the program only performs predictions on the first 128 tokens. Is there a way I can perform predictions on a file that has more than 128 tokens?

Thank you so much in advance.
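
One workaround is to window the input yourself. A minimal sketch, assuming the Ner class shown in the Inference section above (the window size and helper name are hypothetical, and whitespace words only approximate WordPiece token counts):

from bert import Ner

model = Ner("out_base/")

def predict_long(words, window=120):
    # run the model over fixed-size windows and concatenate the results;
    # keep each window safely under max_seq_length after WordPiece splitting
    results = []
    for i in range(0, len(words), window):
        results.extend(model.predict(" ".join(words[i:i + window])))
    return results

output = predict_long(open("test.txt").read().split())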

The API does not accept very long texts

When sending a somewhat longer text to the API, it responds with:

Expected size[0] in [0, 512], but got 1144 [Op:Slice]

This means you need to split texts so that they are under 500 tokens long, which is not optimal; see the sketch below.
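
A client-side workaround is to chunk the text before calling the API. A minimal sketch, assuming the /predict endpoint shown above (the chunk size and function name are hypothetical, and whitespace words only approximate the model's subword token count):

import requests

def predict_in_chunks(text, max_words=300):
    # send the text in pieces small enough to stay under the 512-token limit
    words = text.split()
    results = []
    for i in range(0, len(words), max_words):
        chunk = " ".join(words[i:i + max_words])
        r = requests.post("http://0.0.0.0:8000/predict", json={"text": chunk})
        results.extend(r.json()["result"])
    return results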

Why n+1 labels?

Why did the author choose n+1 labels instead of n labels?

In computer vision, I understand that the model is trained on n+1 classes (the extra class being background) while the number of labels is only n. Is the same criterion followed here, or is there another reason?
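
One common reason, and possibly the one here (an illustration only; this is an assumption, not confirmed from the source): many BERT NER implementations reserve label id 0 for padded positions, so the classifier head needs the n real tags plus one padding class:

# hypothetical illustration: ids 1..n are real tags, id 0 is reserved for padding
labels = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]
label_map = {label: i for i, label in enumerate(labels, 1)}  # ids start at 1
num_labels = len(labels) + 1  # n real tags + 1 padding class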

Advice on models without TF2 pretrained checkpoints

Your code uses models with TF2 pre-trained checkpoints. However, TF2 checkpoints are only available for a few BERT models: https://github.com/tensorflow/models/tree/master/official/nlp/bert

What about languages that are only available as TF1 BERT checkpoints? Do you have any advice?

If I use the TF1 checkpoints, I receive errors when loading the model.

import tensorflow as tf

init_checkpoint = 'the pretrained model checkpoint path.'
model = tf.keras.Model()  # BERT pre-trained model as feature extractor.
checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(init_checkpoint)

TensorFlow Serving request

Hi Kamal, I very much like your work and model, and appreciate you making them public.
I just have a request:

  1. Could you please add TensorFlow Serving to this model for deployment?
  2. I ran it against Google Colab and got the error below:
    RuntimeError: Error copying tensor to device: /job:localhost/replica:0/task:0/device:GPU:0. /job:localhost/replica:0/task:0/device:GPU:0 unknown device.

Did I make any mistake in the Colab settings, or do I need to change the run_ner.py file?

Regards,
Niranjan

How to concatenate the response values

Thank you so much for your effort.
How can I concatenate the response values?
For example, turn this:
[
    {
        "confidence": 0.9981840252876282,
        "tag": "B-PER",
        "word": "Steve"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "B-PER",
        "word": "jobs"
    },
    {
        "confidence": 0.9981840252876282,
        "tag": "B-PER",
        "word": "Steve"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "B-PER",
        "word": "Wozniak"
    },
    {
        "confidence": 0.9998939037322998,
        "tag": "O",
        "word": "from"
    },
    {
        "confidence": 0.999891996383667,
        "tag": "B-LOC",
        "word": "United"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "States"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "Of"
    },
    {
        "confidence": 0.9991968274116516,
        "tag": "B-LOC",
        "word": "America"
    }
]

into this:

{"B-PER": ["Steve Jobs", "Steve Wozniak"], "B-LOC": ["United States Of America"]}

Thanks
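
One way to do this is to merge consecutive words that carry the same non-O tag. A minimal sketch (the grouping rule is an assumption; note that since every entity word above is tagged B-PER rather than BIO-style B-/I-, adjacent entities of the same type, like the two names here, cannot be separated from the tags alone):

def group_entities(predictions):
    # predictions: list of {"word", "tag", "confidence"} dicts as returned by the API
    grouped = {}
    run_tag, run_words = None, []
    for p in predictions:
        if p["tag"] == run_tag and run_tag != "O":
            # same tag as the previous word: extend the current entity
            run_words.append(p["word"])
            continue
        if run_tag and run_tag != "O":
            # tag changed: flush the finished entity
            grouped.setdefault(run_tag, []).append(" ".join(run_words))
        run_tag, run_words = p["tag"], [p["word"]]
    if run_tag and run_tag != "O":
        grouped.setdefault(run_tag, []).append(" ".join(run_words))
    return grouped

# e.g. {"B-PER": ["Steve jobs Steve Wozniak"], "B-LOC": ["United States Of America"]}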

ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

Hello,

I am using TensorFlow 2.9.1.

Please help me to resolve this issue:

File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 513, in <module>
    main()
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 449, in main
    ner.load_weights(os.path.join(args.output_dir, "model.h5"))
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\saving\hdf5_format.py", line 728, in load_weights_from_hdf5_group
    raise ValueError(
ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

AMD GPU or CPU support?

Hey, great work!
Question: is it possible to run this project with an AMD GPU, or at least with a CPU? I would be glad for any help or hints.

Dates not recognized

I tested the model on different dates (Saturday, November 12, etc.) and they were not classified as dates. I ran both the large and base BERT models. Any ideas why?

UnboundLocalError: local variable 'eval_features' referenced before assignment

Following the README, I ran:
python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev

I get this error, which is blocking:

Traceback (most recent call last):
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 512, in <module>
    main()
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 380, in main
    np.asarray([f.label_id for f in eval_features], dtype=np.int32))
UnboundLocalError: local variable 'eval_features' referenced before assignment

Opinionated HTTP server

In my point of view, this package shouldn't contain an HTTP API, because you are using a very opinionated framework to expose your model via an API. Let the users decide what they want to use as an HTTP framework.

This HTTP API isn't configurable: for example, the Ner("out_base") parameter, the exposed route, CORS, the HTTP port, the response payload, and the request object cannot be changed without package changes.

For example, I don't want to use Flask as a framework or install any Flask-related dependencies, so please modify the Python requirements and get rid of api.py.

Support for Multi Cased Model

I have tried to train a model with the Multi Cased TensorFlow model, but there is an AssertionError when the checkpoints are loaded. I was wondering whether this code supports the Multi Cased model.

ResourceExhaustedError

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StridedSliceGrad]
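
A common mitigation is to shrink the per-step memory footprint. A sketch (only the flags shown in the Run section are confirmed by this README; --train_batch_size is an assumption based on similar BERT fine-tuning scripts):

python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=64 --train_batch_size 8 --do_train --num_train_epochs 3 --do_eval --eval_on dev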
