Coder Social home page Coder Social logo

zhoukaii / crnn_ctc_ocr_tf Goto Github PK

View Code? Open in Web Editor NEW

This project forked from bai-shang/crnn_ctc_ocr_tf

0.0 0.0 0.0 459 KB

Extremely simple implement for CRNN by Tensorflow

Home Page: https://zhuanlan.zhihu.com/p/43534801

Python 98.58% Shell 1.42%

crnn_ctc_ocr_tf's Introduction

crnn_ctc_ocr_tf

This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR.

https://arxiv.org/abs/1507.05717

More details for CRNN and CTC loss (in chinese): https://zhuanlan.zhihu.com/p/43534801

The crnn+seq2seq+attention ocr code can be found here bai-shang/crnn_seq2seq_ocr_pytorch

Dependencies

All dependencies should be installed are as follow:

  • Python3
  • tensorflow==1.15.0
  • opencv-python
  • numpy

Required packages can be installed with

pip3 install -r requirements.txt

Note: This code cannot run on the tensorflow2.0 since it's modified the 'tf.nn.ctc_loss' API.

Run demo

Asume your current work directory is "crnn_ctc_ocr_tf"๏ผš

cd path/to/your/crnn_ctc_ocr_tf/

Dowload pretrained model and extract it to your disc: GoogleDrive .

Export current work directory path into PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:./

Run inference demo:

python3 tools/inference_crnn_ctc.py \
  --image_dir ./test_data/images/ --image_list ./test_data/image_list.txt \
  --model_dir /path/to/your/bs_synth90k_model/ 2>/dev/null

Result is:

Predict 1_AFTERSHAVE_1509.jpg image as: aftershave

1_AFTERSHAVE_1509.jpg

Predict 2_LARIAT_43420.jpg image as: lariat

2_LARIAT_43420

Train a new model

Data Preparation

  • Firstly you need download Synth90k datasets and extract it into a folder.

  • Secondly supply a txt file to specify the relative path to the image data dir and it's corresponding text label.

For example: image_list.txt

90kDICT32px/1/2/373_coley_14845.jpg coley
90kDICT32px/17/5/176_Nevadans_51437.jpg nevadans
  • Then you suppose to convert your dataset to tfrecord format can be done by
python3 tools/create_crnn_ctc_tfrecord.py \
  --image_dir path/to/90kDICT32px/ --anno_file path/to/image_list.txt --data_dir ./tfrecords/ \
  --validation_split_fraction 0.1

Note: make sure that images can be read from the path you specificed. For example:

path/to/90kDICT32px/1/2/373_coley_14845.jpg
path/to/90kDICT32px/17/5/176_Nevadans_51437.jpg
.......

All training images will be scaled into height 32pix and write to tfrecord file.
The dataset will be divided into train and validation set and you can change the parameter to control the ratio of them.

Otherwise you can use the dowload_synth90k_and_create_tfrecord.sh script automatically create tfrecord:

cd ./data
sh dowload_synth90k_and_create_tfrecord.sh

Train model

python3 tools/train_crnn_ctc.py --data_dir ./tfrecords/ --model_dir ./model/ --batch_size 32

After several times of iteration you can check the output in terminal as follow:

During my experiment the loss drops as follow:

Evaluate model

python3 tools/eval_crnn_ctc.py --data_dir ./tfrecords/ --model_dir ./model/ 2>/dev/null

crnn_ctc_ocr_tf's People

Contributors

bai-shang avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.