
kamalkraj / bert-ner-tf


Named Entity Recognition with BERT using TensorFlow 2.0

License: Apache License 2.0

Python 100.00%
bert named-entity-recognition tensorflow tensorflow-2 tensorflow-2-example bert-ner inference curl postman pretrained-models

bert-ner-tf's Issues

Opinionated HTTP server

In my view, this package shouldn't include an HTTP API: it exposes the model through a very opinionated framework. Let users decide which HTTP framework they want to use.

The HTTP API also isn't configurable. For example, the Ner("out_base") parameter, the exposed route, CORS, the HTTP port, the response payload, and the request object can't be changed without changing the package.

I don't want to use Flask as a framework or install any Flask-related dependencies, so please update the Python requirements and remove api.py.
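One way to decouple inference from the web layer can be sketched as follows. This is only an illustration under assumptions: `handle_ner_request` and `fake_predict` are hypothetical names that do not exist in the package, and the request shape (`{"text": ...}`) is made up for the example. In practice the `predict` callable would presumably be a method on the `Ner("out_base")` object mentioned above.

```python
import json
from typing import Callable, Dict, List, Tuple

Prediction = List[Dict[str, object]]


def handle_ner_request(predict: Callable[[str], Prediction], body: bytes) -> Tuple[int, bytes]:
    """Framework-agnostic handler: raw request body in, (status, JSON bytes) out."""
    try:
        payload = json.loads(body.decode("utf-8"))
        text = payload["text"]
    except (UnicodeDecodeError, json.JSONDecodeError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'text' field"}
        return 400, json.dumps(error).encode("utf-8")
    if not isinstance(text, str):
        error = {"error": "'text' must be a string"}
        return 400, json.dumps(error).encode("utf-8")
    return 200, json.dumps({"result": predict(text)}).encode("utf-8")


def fake_predict(text: str) -> Prediction:
    """Stand-in for a real model (hypothetical): tags every word as 'O'."""
    return [{"word": w, "tag": "O", "confidence": 1.0} for w in text.split()]


status, response = handle_ner_request(fake_predict, b'{"text": "Steve went to Paris"}')
```

Because the function only deals in bytes and status codes, it can be called from a Flask view, any other framework, or a plain script, and the route, port, and CORS policy stay in the caller's hands.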

ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

Hello,

I am using TensorFlow 2.9.1.

Please help me resolve this issue.

  File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 513, in <module>
    main()
  File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 449, in main
    ner.load_weights(os.path.join(args.output_dir,"model.h5"))
  File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\saving\hdf5_format.py", line 728, in load_weights_from_hdf5_group
    raise ValueError(
ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.

Why n+1 labels?

Why did the author choose n+1 labels instead of n labels?

In computer vision, I understand that a model is trained on n+1 classes (the extra class is treated as background) while the number of labels remains n. Is the same convention followed here, or is there another reason?
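A common reason for an extra class in sequence labelling, offered here as an assumption rather than as the author's confirmed rationale, is to reserve index 0 for padding positions so that real labels start at 1. The sketch below illustrates that convention; the label list and the `encode_labels` helper are illustrative only.

```python
from typing import Dict, List

# Illustrative label set (hypothetical; not taken from the repository).
LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]

# Real labels start at 1 so that id 0 can mean "padding".
LABEL_TO_ID: Dict[str, int] = {label: i for i, label in enumerate(LABELS, start=1)}
NUM_CLASSES = len(LABELS) + 1  # n real labels + 1 padding class


def encode_labels(tags: List[str], max_seq_length: int) -> List[int]:
    """Map tags to ids, truncate to max_seq_length, and pad the rest with 0."""
    ids = [LABEL_TO_ID[t] for t in tags][:max_seq_length]
    return ids + [0] * (max_seq_length - len(ids))


padded = encode_labels(["B-PER", "O"], max_seq_length=4)
```

Under this convention the classifier has n+1 outputs even though only n of them correspond to real tags.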

How to concatenate the response values

Thank you so much for your effort.
How can I concatenate the response values?
For example:
[
{
"confidence": 0.9981840252876282,
"tag": "B-PER",
"word": "Steve"
},
{
"confidence": 0.9998939037322998,
"tag": "B-PER",
"word": "jobs"
},
{
"confidence": 0.9981840252876282,
"tag": "B-PER",
"word": "Steve"
},
{
"confidence": 0.9998939037322998,
"tag": "B-PER",
"word": "Wozniak"
},
{
"confidence": 0.9998939037322998,
"tag": "O",
"word": "from"
},
{
"confidence": 0.999891996383667,
"tag": "B-LOC",
"word": "United"
},
{
"confidence": 0.9991968274116516,
"tag": "B-LOC",
"word": "States"
},
{
"confidence": 0.9991968274116516,
"tag": "B-LOC",
"word": "Of"
},
{
"confidence": 0.9991968274116516,
"tag": "B-LOC",
"word": "America"
}
]

into this:

{"B-PER": ["Steve Jobs", "Steve Wozniak"], "B-LOC": ["United States Of America"]}

Thanks
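A minimal sketch of one way to merge token-level predictions into entities, assuming the tags follow the usual BIO scheme (B- opens an entity, I- continues it). Note that the sample in the question tags every token with B-, and with that tagging adjacent entities such as "Steve Jobs" and "Steve Wozniak" cannot be told apart, so the example input below is hypothetical and uses I- tags for continuation tokens. `group_entities` is an illustrative name, not part of the package, and its keys are entity types ("PER") rather than full tags ("B-PER").

```python
from collections import defaultdict
from typing import Dict, List


def group_entities(predictions: List[dict]) -> Dict[str, List[str]]:
    """Merge BIO-tagged tokens into entity strings keyed by entity type."""
    grouped = defaultdict(list)
    current_type = None
    current_words: List[str] = []

    def flush() -> None:
        # Store the entity collected so far (if any) and reset the buffer.
        nonlocal current_type, current_words
        if current_type is not None:
            grouped[current_type].append(" ".join(current_words))
        current_type, current_words = None, []

    for item in predictions:
        prefix, _, entity_type = item["tag"].partition("-")
        if prefix == "I" and entity_type == current_type:
            current_words.append(item["word"])  # continue the open entity
        elif prefix in ("B", "I"):
            flush()  # B- (or a stray I-) starts a new entity
            current_type, current_words = entity_type, [item["word"]]
        else:
            flush()  # "O" closes any open entity
    flush()
    return dict(grouped)


# Hypothetical input with proper I- continuation tags.
sample = [
    {"word": "Steve", "tag": "B-PER"}, {"word": "Jobs", "tag": "I-PER"},
    {"word": "Steve", "tag": "B-PER"}, {"word": "Wozniak", "tag": "I-PER"},
    {"word": "from", "tag": "O"},
    {"word": "United", "tag": "B-LOC"}, {"word": "States", "tag": "I-LOC"},
    {"word": "Of", "tag": "I-LOC"}, {"word": "America", "tag": "I-LOC"},
]
entities = group_entities(sample)
# entities == {"PER": ["Steve Jobs", "Steve Wozniak"], "LOC": ["United States Of America"]}
```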

Reproduce with CoNLL results

To reproduce the CoNLL scores reported in the BERT paper (92.4 for bert-base and 92.8 for bert-large), one trick is to apply a truecaser to article titles (sentences written entirely in upper case) as a preprocessing step on the CoNLL train/dev/test sets. This can be done with the following function.

# https://github.com/daltonfury42/truecase
# pip install truecase
import re

import truecase


# original tokens
# ['FULL', 'FEES', '1.875', 'REOFFER', '99.32', 'SPREAD', '+20', 'BP']

def truecase_sentence(tokens):
    # Purely alphabetic tokens, paired with their positions in the sentence.
    word_lst = [(w, idx) for idx, w in enumerate(tokens) if all(c.isalpha() for c in w)]
    # Of those, the tokens written entirely in upper case.
    lst = [w for w, _ in word_lst if re.match(r'\b[A-Z\.\-]+\b', w)]

    # Only truecase when every alphabetic token is upper case (e.g. a headline).
    if len(lst) and len(lst) == len(word_lst):
        parts = truecase.get_true_case(' '.join(lst)).split()

        # The truecaser has its own tokenization;
        # skip if the number of words doesn't match.
        if len(parts) != len(word_lst):
            return tokens

        for (w, idx), nw in zip(word_lst, parts):
            tokens[idx] = nw

    # Return the token list (modified in place when truecasing applied).
    return tokens

# truecased tokens
# ['Full', 'fees', '1.875', 'Reoffer', '99.32', 'spread', '+20', 'BP']

I also found it useful to combine a very small learning rate (5e-6), a large batch size (128), and a high number of epochs (>40).

With these configurations and preprocessing, I was able to reach 92.8 with bert-large.

AMD GPU or CPU Support?

Hey, great work!
Question: is it possible to run this project with an AMD GPU, or at least on a CPU? I would be glad for any help or hints.

advice on models without tf2 pretrained checkpoints

Your code uses models with TF2 pre-trained checkpoints. However, TF2 checkpoints are only available for a few BERT models: https://github.com/tensorflow/models/tree/master/official/nlp/bert

What about languages that are only available as TF1 BERT checkpoints? Do you have any advice?

If I use the TF1 checkpoints, I receive errors when loading the model:

init_checkpoint='the pretrained model checkpoint path.'     
model=tf.keras.Model() # Bert pre-trained model as feature extractor.     
checkpoint = tf.train.Checkpoint(model=model)      
checkpoint.restore(init_checkpoint)

Support for Multi Cased Model

I tried to train a model with the Multi Cased TensorFlow model, but an AssertionError is raised when the checkpoints are loaded. Does this code support the Multi Cased model?

Dates not recognized

I tested the model on different dates (Saturday, November 12, etc.) and they were not classified as dates. I ran this on both the large and the base BERT models. Any ideas why?

How to run Predictions after Fine-tuning.

Dear kamalkraj,
First and foremost, thank you for your efforts in making this possible. I have the following question.
I have fine-tuned a BERT NER model after converting a pre-trained model using tf1_to_keras_checkpoint_converter.py followed by tf2_checkpoint_converter.py. [ https://github.com/tensorflow/models/releases/tag/v2.0 ]
The question:
When I run predictions on new (test) data using run_ner.py and omit the --do_train option, the program only predicts on the first 128 tokens. Is there a way to run predictions on a file that has more than 128 tokens?

Thank you so much in advance.

UnboundLocalError: local variable 'eval_features' referenced before assignment

I followed the README and ran:
python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev

I get the error below, which is blocking:

Traceback (most recent call last):
  File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 512, in <module>
    main()
  File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 380, in main
    np.asarray([f.label_id for f in eval_features],dtype=np.int32))
UnboundLocalError: local variable 'eval_features' referenced before assignment

ResourceExhaustedError

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StridedSliceGrad]

The API does not take very long texts.

When slightly longer text is sent to the API, it responds with:

Expected size[0] in [0, 512], but got 1144 [Op:Slice]

This means texts have to be split so that they stay under 500 tokens, which is not optimal.
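Until longer inputs are supported, one client-side workaround is to split the text into overlapping windows and send each window separately. The sketch below is a rough illustration under assumptions: `split_into_windows` is a hypothetical helper, it counts whitespace-separated words rather than the model's wordpiece tokens, and the default sizes are arbitrary, so `max_words` should be chosen conservatively below the model limit. Merging the predictions from overlapping regions is left out.

```python
from typing import List


def split_into_windows(words: List[str], max_words: int = 100, overlap: int = 20) -> List[List[str]]:
    """Split a word list into windows of at most max_words, sharing `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    step = max_words - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(words[start:start + max_words])
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return windows


# Example: 250 dummy words become three windows of 100, 100, and 90 words.
words = [str(i) for i in range(250)]
windows = split_into_windows(words, max_words=100, overlap=20)
```

The overlap gives each entity near a window boundary a chance to appear whole in at least one window.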

Tensorflow serving Request

Hi Kamal, I very much like your work and the model, and I appreciate you making it public.
I just have a request:

  1. Could you please add TensorFlow Serving support for deploying this model?
  2. I ran it on Google Colab and got the error below:
    RuntimeError: Error copying tensor to device: /job:localhost/replica:0/task:0/device:GPU:0. /job:localhost/replica:0/task:0/device:GPU:0 unknown device.

Did I make a mistake in the Colab settings, or do I need to change the run_ner.py file?

Regards,
Niranjan
