kamalkraj / BERT-NER-TF
Named Entity Recognition with BERT using TensorFlow 2.0
License: Apache License 2.0
Hi Kamal,
Is it possible to load the ALBERT TF 1.0 model found here - https://github.com/google-research/google-research/tree/master/albert - for NER training?
Hello Kamal,
Thank you for contributing this code. I am wondering if it is possible to train or predict with sequence length = 512?
In my opinion, this package shouldn't contain an HTTP API, because it uses a very opinionated framework to expose the model. Let users decide which HTTP framework they want to use.
This HTTP API isn't configurable: for example, the Ner("out_base") parameter, the exposed route, CORS, the HTTP port, the response payload, and the request object cannot be changed without modifying the package.
For example, I don't want to use Flask as my framework or install any Flask-related dependencies, so please trim the Python requirements and get rid of api.py.
Hello,
I am using Tensorflow 2.9.1
Please help me to resolve this issue
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 513, in
main()
File "C:\BERT-NER-TF\BERT-NER-TF\run_ner.py", line 449, in main
ner.load_weights(os.path.join(args.output_dir,"model.h5"))
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\AI\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\saving\hdf5_format.py", line 728, in load_weights_from_hdf5_group
raise ValueError(
ValueError: Layer count mismatch when loading weights from file. Model expected 2 layers, found 1 saved layers.
Why did the author choose n+1 labels instead of n labels?
In computer vision, I understand that a model may be trained on n+1 classes (the extra class being background) while there are only n real labels. Is the same idea followed here, or is there another reason?
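For illustration only, here is a sketch of the convention many BERT-NER implementations follow (I am assuming, not confirming, that this repository does the same): label index 0 is reserved for padded positions, so the model has n+1 output classes for n real labels.

```python
# Hypothetical label map, assuming the common BERT-NER convention of
# reserving index 0 for padding so that real labels start at 1.
labels = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

# n real labels, but the classifier is built with n+1 outputs:
label_map = {label: i for i, label in enumerate(labels, start=1)}
num_classes = len(labels) + 1  # index 0 acts as the padding "label"

print(label_map["O"])   # 1, not 0
print(num_classes)      # 10
```

Under this convention the loss can simply mask out positions whose label id is 0, which is why the extra class exists even though it never corresponds to a real tag.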
Thank you so much for your effort,
How can I concatenate the response values?
For example:
[
  {"confidence": 0.9981840252876282, "tag": "B-PER", "word": "Steve"},
  {"confidence": 0.9998939037322998, "tag": "B-PER", "word": "jobs"},
  {"confidence": 0.9981840252876282, "tag": "B-PER", "word": "Steve"},
  {"confidence": 0.9998939037322998, "tag": "B-PER", "word": "Wozniak"},
  {"confidence": 0.9998939037322998, "tag": "O", "word": "from"},
  {"confidence": 0.999891996383667, "tag": "B-LOC", "word": "United"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "States"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "Of"},
  {"confidence": 0.9991968274116516, "tag": "B-LOC", "word": "America"}
]
to this:
{"B-PER": ["Steve Jobs", "Steve Wozniak"], "B-LOC": "United States Of America"}
Thanks
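One way to post-process such a response is a sketch like the one below, assuming standard BIO tags (B- starts an entity, I- continues it). Note that the response above tags every token with B-; with only B- tags, two adjacent entities of the same type cannot be told apart, so continuation tokens would ideally carry I-PER/I-LOC. The `entries` list here is illustrative, not the API's actual output.

```python
def merge_entities(entries):
    """Merge BIO-tagged tokens into {entity_type: [entity_strings]}."""
    merged = {}
    cur_type, cur_words = None, []

    def flush():
        if cur_type is not None:
            merged.setdefault(cur_type, []).append(" ".join(cur_words))

    for e in entries:
        tag, word = e["tag"], e["word"]
        if tag.startswith("B-"):
            flush()                       # a B- tag always starts a new entity
            cur_type, cur_words = tag[2:], [word]
        elif tag.startswith("I-") and tag[2:] == cur_type:
            cur_words.append(word)        # continuation of the current entity
        else:                             # "O" or an inconsistent I- tag
            flush()
            cur_type, cur_words = None, []
    flush()
    return merged

entries = [
    {"tag": "B-PER", "word": "Steve"}, {"tag": "I-PER", "word": "Jobs"},
    {"tag": "B-PER", "word": "Steve"}, {"tag": "I-PER", "word": "Wozniak"},
    {"tag": "O", "word": "from"},
    {"tag": "B-LOC", "word": "United"}, {"tag": "I-LOC", "word": "States"},
    {"tag": "I-LOC", "word": "Of"}, {"tag": "I-LOC", "word": "America"},
]
print(merge_entities(entries))
# {'PER': ['Steve Jobs', 'Steve Wozniak'], 'LOC': ['United States Of America']}
```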
While restoring the pretrained model in tensorflow==2.2.0,
checkpoint.restore(init_checkpoint).assert_existing_objects_matched() throws an error during NER training.
Hi @kamalkraj, nice work. But while using api.py and curl to get a prediction, I got this error:
float32 is not JSON serializable
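That error usually means the response dict still contains NumPy scalars (model confidences are typically np.float32, which the stdlib json module cannot encode). A minimal sketch of the usual fix; the payload shown is hypothetical, not the repository's actual response object:

```python
import json
import numpy as np

# A NumPy float32, as a model's softmax confidence typically is
confidence = np.float32(0.99818)

# json.dumps(confidence) would raise:
#   TypeError: Object of type float32 is not JSON serializable

# Fix 1: cast explicitly before building the response
payload = {"confidence": float(confidence), "tag": "B-PER", "word": "Steve"}
print(json.dumps(payload))

# Fix 2: a default= hook that converts any NumPy scalar on the fly
print(json.dumps({"confidence": confidence}, default=lambda o: o.item()))
```

(np.float64 happens to subclass Python's float and serializes fine, which is why the problem only shows up with float32.)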
In order to reproduce the CoNLL score reported in the BERT paper (92.4 for BERT-base and 92.8 for BERT-large), one trick is to apply a truecaser to article titles (all-uppercase sentences) as a preprocessing step for the CoNLL train/dev/test sets. This can be done with the following method.

# https://github.com/daltonfury42/truecase
# pip install truecase
import re
import truecase

def truecase_sentence(tokens):
    # e.g. ['FULL', 'FEES', '1.875', 'REOFFER', '99.32', 'SPREAD', '+20', 'BP']
    # ->   ['Full', 'fees', '1.875', 'Reoffer', '99.32', 'spread', '+20', 'BP']
    word_lst = [(w, idx) for idx, w in enumerate(tokens) if all(c.isalpha() for c in w)]
    lst = [w for w, _ in word_lst if re.match(r'\b[A-Z\.\-]+\b', w)]
    if len(lst) and len(lst) == len(word_lst):
        parts = truecase.get_true_case(' '.join(lst)).split()
        # the truecaser has its own tokenization;
        # skip if the number of words doesn't match
        if len(parts) != len(word_lst):
            return tokens
        for (w, idx), nw in zip(word_lst, parts):
            tokens[idx] = nw
    return tokens

Also, I found it useful to use a very small learning rate (5e-6), a large batch size (128), and a high epoch count (>40).
With these configurations and preprocessing, I was able to reach 92.8 with bert-large.
Hey, great work!
Question: Is it possible to run this project with an AMD GPU, or at least on a CPU? I would be glad for any help or hints.
Your code uses models with tf2 pre-trained checkpoints. However, tf2 checkpoints are only available for a few BERT models: https://github.com/tensorflow/models/tree/master/official/nlp/bert
What about languages that are only available as tf1 BERT checkpoints? Do you have any advice?
If I use the tf1 checkpoints, I receive errors when loading the model:
init_checkpoint = 'the pretrained model checkpoint path.'
model = tf.keras.Model()  # BERT pre-trained model as feature extractor.
checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(init_checkpoint)
I have tried to train a model with the Multilingual Cased TensorFlow model, but there is an AssertionError when the checkpoints are loaded. I was wondering if this code supports the multi-cased model.
Does this code fine-tune BERT, or does it only use BERT for embeddings?
I tested the model on different dates (Saturday, November 12, etc.) and they were not classified as dates. I ran both large and base BERT. Any ideas why?
Dear kamalkraj,
First and foremost, thank you for your appreciated efforts in making this possible. I have a question that reads as follows:
I have fine-tuned a BERT NER model after converting a previously pre-trained model using tf1_to_keras_checkpoint_converter.py followed by tf2_checkpoint_converter.py. [ https://github.com/tensorflow/models/releases/tag/v2.0 ]
The Question:
When I try to perform predictions on new data (test) using run_ner.py and omit the --do_train option, the program only performs predictions on the first 128 tokens. Is there a way to perform predictions on a file that has more than 128 tokens?
Thank you so much in advance.
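The repository itself doesn't appear to handle this, but a common workaround is to split long token sequences into overlapping windows of at most max_seq_length - 2 tokens (leaving room for [CLS] and [SEP]), predict each window separately, and stitch the results back together. A minimal sketch, with illustrative window and stride values:

```python
def split_into_windows(tokens, max_len=126, stride=64):
    """Split a long token list into overlapping windows of at most max_len tokens.

    max_len=126 leaves room for [CLS]/[SEP] when max_seq_length is 128; each
    window starts `stride` tokens after the previous one, so an entity cut at
    one window boundary appears whole in the next window.
    """
    if len(tokens) <= max_len:
        return [tokens]
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the sequence
        start += stride
    return windows

tokens = [f"tok{i}" for i in range(300)]
wins = split_into_windows(tokens)
print(len(wins), [len(w) for w in wins])
# 4 [126, 126, 126, 108]
```

When merging predictions back, one simple policy is to keep each token's label from the window where it sits furthest from a boundary.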
Following the readme, I ran:
python run_ner.py --data_dir=data/ --bert_model=bert-base-cased --output_dir=out_base --max_seq_length=128 --do_train --num_train_epochs 3 --do_eval --eval_on dev
and I get this error. This is blocking.
Traceback (most recent call last):
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 512, in
main()
File "C:\work\code\ai\BERT-NER-TF\run_ner.py", line 380, in main
np.asarray([f.label_id for f in eval_features],dtype=np.int32))
UnboundLocalError: local variable 'eval_features' referenced before assignment
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StridedSliceGrad]
When sending a slightly longer text to the API, it responds with:
Expected size[0] in [0, 512], but got 1144 [Op:Slice]
This means you need to split texts so that they stay under roughly 500 tokens, which is not optimal.
Hi Kamal, I very much like your work and your model, and that you made it public.
I just have a request:
Did I make any mistake in the Colab setup, or do I need to change the run_ner.py file?
Regards,
Niranjan