kartikgill / easter2

Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION

License: Apache License 2.0

Jupyter Notebook 68.85% Python 31.15%

handwriting-ocr handwriting-recognition handwritten-text-recognition htr iam-dataset ocr ocr-python optical-character-recognition python3 easter2

easter2's Introduction

👋 Hello, My name is Kartik Chaudhary

🇮🇳 I am a Machine Learning professional from India
🔷 I am currently working at Google as a Senior AI/ML engineer
🔭 I work on ML/AI based applications, research, MLOps, GenAI, and other AI tools.
🐳 I design AI/ML based large scale products/solutions
☎️ If you would like to connect with me 1:1, buzz me on Topmate
✏️ Check out my blog on AI: Drops of AI
📕 Check out my book on Vertex AI: The Definitive Guide to Google Vertex AI
📕 Check out my book on GANs: The GAN Book

📫 Connect!

🤝LinkedIn <---> 🐦Twitter <---> 📸Instagram

easter2's Issues

Altering EASTER2.0 to work on Word Level Cropping

Hi Kartik,
Thanks for the EASTER2.0 model. I have tried it with lots of data and it has given great results. I have a question: I am planning to adapt it to predict cropped handwritten words. Do you reckon it's a good idea? If so, what changes are needed on the architecture side, and how should the input and output shapes be changed? I have a good amount of word-level handwritten training data (~200k) that is well preprocessed and cropped. I am hoping I can use this model to get good results.

Thanks,
Nithin

IAM statistics in the paper

Hello Authors! Great work.

I found the statistics in the arXiv preprint a little confusing. In the IAM dataset stats, it looked as if there were 64 lakh (6.4 million) training images. Only when I checked the statistics on the IAM website did I realize that some commas are thousands separators and others separate two different numbers. Please rephrase this sentence:

IAM A has about 6,482 lines for training , 976 for validation and 2,915 for testing. IAM-B has about 6,161 for training, 940 for validation and 1,861 for testing.

On first read, I computed 10% of IAM as 6 lakh (600,000) and wondered how that could be called few-shot.

Any plans to release pre-trained weights?

Hi,
Great work, first of all. I had been looking for something like this last year. I've updated the leaderboard on Papers with Code. Please change it if you feel I made a mistake.

Also, any plans to release the pre-trained weights?

Test on individual images

Hi, can you please provide a notebook or code for testing on individual images?

Error Rate (Validation + Test)

Hi. I used your checkpoint hdf5, changed the path in config.py, and made the changes ('tensorflow':tf, 'K':K), but I am not able to reach the accuracy that you got. Can you please take a look?

CER on Validation Set

CER on Test Set

Can't find the link to the model "saved_checkpoint.hdf5"

Hey @kartikgill
Can you please provide the link to the model "saved_checkpoint.hdf5"?

Demo and pretrained checkpoint

Could you please show a demo of inference on one image and give us the link to the pretrained checkpoint?

How to select the hyperparameters for data augmentation, especially T_Max

Your work is of great practical value. Vertical tiles can improve performance. But how should the hyperparameters for data augmentation be selected, especially T_Max? I think this should be related to the height of the text line. Can you give me some suggestions?

Weights for model without TACo augmentations

Can you also share the weights of the model trained without any augmentation? It would be nice if you could share the weights of the other models with different augmentations, as listed in Table 4 of your paper.

Changing Input Image size

I am trying to reduce the image size from (2000, 80) [default] to (1000, 80) and have set LONG_LINES augmentation to False.
I am getting the following error.

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: sequence_length(0) <= 250
[[node functional_3/ctc/CTCLoss (defined at easter_model.py:33) ]]
(1) Invalid argument: sequence_length(0) <= 250
[[node functional_3/ctc/CTCLoss (defined at easter_model.py:33) ]]
[[functional_3/ctc/CTCLoss/_154]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_11028]

Do I need to change something in the model to make it work? Please help. TIA
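
A possible reading of this error, offered as a sketch rather than a confirmed diagnosis: CTC requires every input_length value to be no larger than the number of time steps the network emits. The message `sequence_length(0) <= 250` suggests that a 1000-pixel-wide input yields 250 time steps, so the downsampling factor of 4 used below is inferred from that log and is an assumption. If the value fed to CTC as input_length is still tied to the old width (for example 500 for a width of 2000, also an assumption), the check fails. The helper names here are hypothetical and are not part of the repository.

```python
def ctc_time_steps(input_width, downsample_factor=4):
    """Number of time steps left after the network's width downsampling.

    downsample_factor=4 is an assumption inferred from the error log
    (width 1000 -> 250 steps), not a value taken from the model code.
    """
    return input_width // downsample_factor


def check_ctc_lengths(input_width, input_length, downsample_factor=4):
    """Raise early if input_length exceeds the available time steps."""
    steps = ctc_time_steps(input_width, downsample_factor)
    if input_length > steps:
        raise ValueError(
            f"input_length={input_length} exceeds the {steps} time steps "
            f"produced for input width {input_width}"
        )
    return steps


if __name__ == "__main__":
    # Default width with a matching input_length passes the check.
    print(check_ctc_lengths(2000, 500))
    # Halved width with the old input_length reproduces the mismatch.
    try:
        check_ctc_lengths(1000, 500)
    except ValueError as err:
        print("mismatch:", err)
```

Under these assumptions, halving the input width means the constant that feeds input_length has to be halved as well.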

Can't load checkpoint: bad marshal data

Context:
I followed the instructions in the README file (downloaded the latest checkpoint and placed it in the correct folder).
I tried to test the model on the validation set.

Error:
"Unable to Load Checkpoint."
If I print the exception in that except statement, I get the following error: bad marshal data

Cause:
I think this problem is caused by the different python / tensorflow versions used to generate the model and in my runtime.
My versions are:

python 3.10.4 / python 3.8
tensorflow 2.9.2
keras 2.9.0

Solution:
I solved the problem by running tensorflow 2.9.0 with python 3.7 in a Docker container and saving the model in the TensorFlow SavedModel format instead of the Keras H5 format. It now works with the newer version of Python.

Suggestion:
Add a tensorflow saved model format checkpoint to support more python/tensorflow versions.

PS: There was another problem while loading the model: a missing 'K': K entry in custom_objects={...}
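
Some background that may explain the version sensitivity: when a Keras model containing a Lambda layer is saved to H5, the wrapped Python function is stored as bytecode serialized with Python's `marshal` module, and the marshal format is not guaranteed to be stable across Python versions. The sketch below uses only the standard library to show where this family of messages comes from. `ctc_stub` and `describe_load` are hypothetical names for illustration; nothing here touches the real checkpoint.

```python
import marshal


def ctc_stub(args):
    """Hypothetical stand-in for a function wrapped in a Lambda layer."""
    return args


# Serialize the function's bytecode, similar to what happens when a
# Lambda layer is written into an H5 model config.
payload = marshal.dumps(ctc_stub.__code__)

# The same interpreter that wrote the bytes can read them back.
restored = marshal.loads(payload)


def describe_load(data):
    """Return 'ok' or the exception that marshal raises for `data`."""
    try:
        marshal.loads(data)
        return "ok"
    except (EOFError, ValueError, TypeError) as exc:
        return f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    print(describe_load(payload))  # bytes written by this interpreter
    print(describe_load(b""))      # empty payload makes marshal fail
```

Besides the SavedModel route described above, a commonly used alternative is to rebuild the architecture from source code and load only the weights, which avoids deserializing bytecode at all. Whether that works with this particular checkpoint is not confirmed in the thread.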

The question about weighted_CTCloss

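import tensorflow
from tensorflow.keras import backend as K

def ctc_custom(args):
    """
    custom CTC loss
    """
    y_pred, labels, input_length, label_length = args
    # Per-sample CTC loss: the negative log-likelihood of the label sequence
    ctc_loss = K.ctc_batch_cost(
        labels,
        y_pred,
        input_length,
        label_length
    )
    # exp(-loss) recovers the probability assigned to the correct labels
    p = tensorflow.exp(-ctc_loss)
    gamma = 0.5
    alpha = 0.25
    # Scale each sample's loss by alpha * (1 - p)^gamma
    return alpha * (K.pow((1 - p), gamma)) * ctc_loss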

How should I understand this code?
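
One way to read the function, offered as an interpretation rather than the authors' explanation: the CTC loss is a negative log-likelihood, so exp(-loss) is the probability the model assigns to the correct label sequence. Multiplying the loss by alpha * (1 - p)^gamma has the same form as a focal loss, so samples the model already gets right contribute less. The scalar sketch below reproduces that arithmetic in plain Python; `focal_weight` and `focal_weighted_ctc` are hypothetical names, and the default values are the ones in the snippet.

```python
import math


def focal_weight(ctc_loss, alpha=0.25, gamma=0.5):
    """Weight applied to one sample's CTC loss."""
    p = math.exp(-ctc_loss)  # probability of the correct label sequence
    return alpha * (1.0 - p) ** gamma


def focal_weighted_ctc(ctc_loss, alpha=0.25, gamma=0.5):
    """Scalar version of the weighted loss for a single sample."""
    return focal_weight(ctc_loss, alpha, gamma) * ctc_loss


if __name__ == "__main__":
    # Easy samples (small loss) get a small weight; hard samples
    # (large loss) get a weight close to alpha.
    for loss in (0.01, 1.0, 10.0):
        print(loss, focal_weight(loss), focal_weighted_ctc(loss))
```

With gamma = 0.5 the weight grows with the square root of (1 - p), so it rises quickly for moderately hard samples and then levels off near alpha.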

SystemError: Exception encountered when calling layer "ctc" (type Lambda). unknown opcode

Hello,

I am trying to train the model with only 10 epochs. To test, I am using the last checkpoint saved in the weights folder during the training process. When I run test_on_iam(checkpoint="../weights/EASTER2--10--6.52.hdf5"), it says I have a SystemError from an unknown opcode.

Upon investigation, many reports of this error trace back to a mismatch between the Python versions used to train and test the model. However, I am sure that I trained the model with the same version I am using to test it. I have also read that it could be caused by the use of the Lambda layer in the model. However, there aren't any apparent issues with the Lambda layer.

Additionally, when I try to use your saved_checkpoint.hdf5, I also get an error when trying to load the model: ValueError: bad marshal data (unknown type code)

Have you encountered this?

saved_checkpoint.hdf5 throwing KeyError

Hi Kartik,

When I try to load the saved_checkpoint.hdf5 file, I get the error below; it looks like the file is corrupted.

Line of code: test_on_iam(show=True, partition="validation", checkpoint=checkpoint_path, uncased=True)

Error (from the load_easter_model(checkpoint_path) function):
KeyError: 'Unable to open object (bad heap free list)'
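
A cheap first check when a checkpoint looks corrupted, sketched here with the standard library only: an HDF5 file normally begins with a fixed 8-byte signature, so a download that saved an HTML page or an empty file can be spotted before any model loading is attempted. The function names are hypothetical. The check looks at offset 0 only and cannot detect a file that was truncated later on, so passing it does not prove the file is intact.

```python
import os

# Fixed 8-byte signature at the start of an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(header):
    """True if the given leading bytes start with the HDF5 signature."""
    return header[:8] == HDF5_SIGNATURE


def inspect_checkpoint(path):
    """Return (size_in_bytes, signature_ok) for a checkpoint file."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        header = handle.read(8)
    return size, has_hdf5_signature(header)


if __name__ == "__main__":
    import sys

    # Usage: python check_checkpoint.py path/to/checkpoint.hdf5
    if len(sys.argv) > 1:
        print(inspect_checkpoint(sys.argv[1]))
```

If the signature is missing or the size is far smaller than the published file, downloading the checkpoint again is a reasonable next step.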

EOF read where object expected

I tried running predict_line.py from the command prompt:

python predict_line.py --path C:/Handwritten_OCR/HTRPipeline-master/HTRPipeline-master/data1/eng_AF_004.jpg

but I am getting this error:

EOF read where object expected
Unable to Load Checkpoint.
Traceback (most recent call last):
  File "C:\Handwritten_OCR\Easter2-main\Easter2-main\src\predict_line.py", line 72, in <module>
    print(infer_obj.predict(args.path)) ## change the image path with the file path you want
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Handwritten_OCR\Easter2-main\Easter2-main\src\predict_line.py", line 57, in predict
    output = self.model.predict(img)
             ^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'predict'

I would appreciate it if anyone can help.
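
Reading the log, the AttributeError appears to be a follow-on symptom: the checkpoint failed to load ("EOF read where object expected"), the script printed "Unable to Load Checkpoint." and carried on with no model, and the later predict call then failed on None. A minimal sketch of a fail-fast wrapper is shown below. `load_or_raise`, `CheckpointLoadError`, and `broken_loader` are hypothetical names, and the loader argument stands in for whatever function the script uses to load the model.

```python
class CheckpointLoadError(RuntimeError):
    """Raised when a checkpoint cannot be turned into a usable model."""


def load_or_raise(loader, checkpoint_path):
    """Call loader(checkpoint_path) and refuse to continue on failure."""
    try:
        model = loader(checkpoint_path)
    except Exception as exc:
        raise CheckpointLoadError(
            f"could not load {checkpoint_path!r}: {exc}"
        ) from exc
    if model is None:
        raise CheckpointLoadError(
            f"loader returned None for {checkpoint_path!r}"
        )
    return model


def broken_loader(path):
    """Hypothetical loader that fails the way the log above does."""
    raise EOFError("EOF read where object expected")


if __name__ == "__main__":
    try:
        load_or_raise(broken_loader, "saved_checkpoint.hdf5")
    except CheckpointLoadError as err:
        print(err)
```

Stopping at the load step keeps the original cause visible instead of burying it under a later error.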

Text extraction using OCR

If I give a cropped image in which the handwritten text touches a ruled line, recognition is poor and I get wrong predictions.
If I give the same text without it touching the line, the text is extracted without any mistakes.
Input:

Output:

Please let me know how this could be resolved.

Unable to run inference.

I am trying to run prediction on the IAM dataset. I get an error while loading the saved checkpoint from the release section. The error is "Unable to Load Checkpoint."