pannous / tensorflow-ocr

🖺 OCR using tensorflow with attention

Python 100.00%

tensorflow ocr

tensorflow-ocr's Introduction

tensorflow-ocr

🖺 OCR using tensorflow with attention, batteries included

Installation

git clone --recursive http://github.com/pannous/tensorflow-ocr
# sudo apt install python3-pip
cd tensorflow-ocr
pip install -r requirements.txt

Evaluation

You can detect the text under your mouse pointer with mouse_prediction.py

It takes 10 seconds to load the network and start up; after that it should return several results per second.

text_recognizer.py

To apply our approach to real-world images, we forked EAST for bounding-box detection.

Customized training

To get started with a minimal example similar to the famous MNIST, try ./train_letters.py. It automatically generates letters in all the fonts installed on your computer, in many different shapes, and trains on them.

For the full model used in the demo, run ./train.py

tensorflow-ocr's Issues

ImportError: No module named layer

layer.py is missing.

localhost:tensorflow-ocr didi$ ./train_ocr_layer.py
Traceback (most recent call last):
File "./train_ocr_layer.py", line 3, in <module>
import layer
ImportError: No module named layer
localhost:tensorflow-ocr didi$ pip install layer
Collecting layer
Could not find a version that satisfies the requirement layer (from versions: )
No matching distribution found for layer

test accuracy so low

Hi,

After running all 500000 steps, the test accuracy is only 0.4375. Is that expected?

beginning to do research on OCR using tensorflow

Dear sir,

I have just started learning about neural networks, and I find TensorFlow easy to use.

I am learning TensorFlow by running the examples from this tutorial, https://github.com/nlintz/TensorFlow-Tutorials, and by reading the book at deeplearningbook.org.

My goal is to use a neural network to recognize individual letters and digits (a-z, A-Z, 0-9), which means 62 classes.
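For reference, the 62 classes mentioned here can be enumerated and one-hot encoded as in the sketch below. The names CLASSES, char_to_index and one_hot are made up for illustration and are not part of tensorflow-ocr.

```python
import string

# Assumed label set: a-z, A-Z, 0-9 (26 + 26 + 10 = 62 classes).
CLASSES = string.ascii_lowercase + string.ascii_uppercase + string.digits

# Hypothetical helper: map each character to its class index.
char_to_index = {ch: i for i, ch in enumerate(CLASSES)}


def one_hot(ch):
    """Return a list with one entry per class and a 1.0 at the position of `ch`."""
    vec = [0.0] * len(CLASSES)
    vec[char_to_index[ch]] = 1.0
    return vec


if __name__ == "__main__":
    print(len(CLASSES))        # number of output units the classifier needs
    print(char_to_index["A"])  # class index of an uppercase letter
```

In principle, an MNIST-style classifier can be reused for this task by giving its output layer 62 units instead of 10 and training it on letter images.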

I can segment each letter from the picture with QT.

Now I am confused about how to do OCR with TensorFlow.
There are many projects on "Handwriting recognition" but only a few on "OCR for printed documents". I am a newbie, so I really don't know how or where to begin.

Would you mind helping me?

1/ Can I use the same MNIST classifiers (http://yann.lecun.com/exdb/mnist/) to recognize letters of the English alphabet? MNIST classifiers recognize digits (0 - 9), so I don't know whether they can handle 62 classes.

2/ Do you have any documents or papers about "OCR for printed documents"? I really lack knowledge in this area.

3/ I found your project tensorflow-ocr. Would you mind if I learn from your code?

4/ I get errors when I run ./train_ocr_layer.py:

(tensorflow)khoa@khoa:~/tensorflow/tensorflow-ocr$ ./train_ocr_layer.py 
ls: cannot access /tmp/tensorboard_logs/: No such file or directory
Traceback (most recent call last):
  File "./train_ocr_layer.py", line 4, in <module>
    import layer
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/__init__.py", line 1, in <module>
    from net import *
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/net.py", line 8, in <module>
    set_tensorboard_run(auto_increment=True)
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 18, in set_tensorboard_run
    run_nr = get_last_tensorboard_run_nr()
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 8, in get_last_tensorboard_run_nr
    logs=subprocess.check_output(["ls", tensorboard_logs]).split("\n")
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['ls', '/tmp/tensorboard_logs/']' returned non-zero exit status 2

Thank you and regards,
Khoa
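The traceback in item 4 shows `ls` exiting with status 2 because /tmp/tensorboard_logs/ does not exist yet, so creating that directory once (for example with `mkdir -p /tmp/tensorboard_logs/`) should get past the error. The sketch below shows one way the lookup could avoid the subprocess call altogether. The function name list_tensorboard_runs is hypothetical and this is not the project's code.

```python
import os
import tempfile


def list_tensorboard_runs(log_dir):
    """Hypothetical replacement for the `ls` call: create the log
    directory if it is missing, then return its entries, sorted."""
    os.makedirs(log_dir, exist_ok=True)  # no error if it already exists
    return sorted(os.listdir(log_dir))


if __name__ == "__main__":
    # Demo in a throwaway directory rather than the real /tmp/tensorboard_logs/.
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = os.path.join(tmp, "tensorboard_logs")
        print(list_tensorboard_runs(log_dir))
```

Note that `exist_ok=True` needs Python 3, while the traceback above comes from Python 2.7; there, an `os.path.isdir` check before `os.makedirs` would do the same job.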

where is current_model.h5

Can you please share your model architecture? I'm trying text_recognizer, but it needs current_model.h5, which is not available.

Thanks.

OSError: Unable to open file (file signature not found)

When I try to run mouse_prediction.py, I get the following error message:

\AppData\Local\Programs\Python\Python37\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
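h5py reports "file signature not found" when the file it is asked to open does not start with the HDF5 signature, which usually means the file is empty, truncated, or not a model file at all. A minimal sketch for checking this by hand follows. It only looks at offset 0 (HDF5 also allows the signature at later offsets after a user block), and the file name in the demo is an assumption taken from the "where is current_model.h5" issue.

```python
import os

# The 8-byte signature that starts a standard HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if `path` exists and begins with the HDF5 signature."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


if __name__ == "__main__":
    # Assumed file name; whether mouse_prediction.py loads this file is a guess.
    path = "current_model.h5"
    print(path, "looks like HDF5:", looks_like_hdf5(path))
```

If the check fails, the model file probably has to be downloaded again or regenerated by training.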

where is the data

When I run train_letters.py, it fails with "No such file or directory: 'fonts.list'"
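One possible workaround is to generate fonts.list by scanning the system font folders. The sketch below assumes that fonts.list is a plain text file with one font path per line, which the source does not confirm; the directory list and the function names find_fonts and write_font_list are also made up for illustration.

```python
import os

# Assumed font locations; adjust for your system.
FONT_DIRS = ["/usr/share/fonts", "/Library/Fonts", "C:\\Windows\\Fonts"]


def find_fonts(dirs, extensions=(".ttf", ".otf")):
    """Walk `dirs` and return a sorted list of font file paths."""
    found = []
    for root_dir in dirs:
        # os.walk silently yields nothing for directories that do not exist.
        for root, _subdirs, files in os.walk(root_dir):
            for name in files:
                if name.lower().endswith(extensions):
                    found.append(os.path.join(root, name))
    return sorted(found)


def write_font_list(paths, out_file="fonts.list"):
    """Write one font path per line (this file format is an assumption)."""
    with open(out_file, "w") as f:
        for path in paths:
            f.write(path + "\n")


if __name__ == "__main__":
    fonts = find_fonts(FONT_DIRS)
    write_font_list(fonts)
    print("wrote", len(fonts), "font paths to fonts.list")
```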

Does this project have any documentation to refer to?

I can't find any documentation or tutorial, and I have no clue how to get started. Do you have any material or a wiki page that explains this project?

Thanks very much!

text.py

Where is the file?
Line 13 of text.py calls "open (word_file).read().splitlines()",
but there is no file '/usr/share/dict/words'.
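On Debian and Ubuntu, /usr/share/dict/words is provided by a word-list package such as wamerican (`sudo apt install wamerican`). Where installing a package is not an option, a fallback along the following lines could be used. This is only a sketch: load_words and FALLBACK_WORDS are hypothetical names, not part of text.py.

```python
import os

WORD_FILE = "/usr/share/dict/words"  # path reported in the issue

# Tiny stand-in vocabulary, used only when no word file is found.
FALLBACK_WORDS = ["tensorflow", "attention", "letter", "font", "image"]


def load_words(word_file=WORD_FILE, fallback=FALLBACK_WORDS):
    """Return the lines of `word_file`, or a copy of `fallback` if it is missing."""
    if os.path.isfile(word_file):
        with open(word_file) as f:
            return f.read().splitlines()
    return list(fallback)


if __name__ == "__main__":
    words = load_words()
    print("loaded", len(words), "words")
```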

License

Hi,

What's the license of this project?

train_letters.py not running

Traceback (most recent call last):
File "train_letters.py", line 61, in <module>
net.train(data=data, dropout=.6, display_step=10, test_step=1000) # run resume
File "/usr/local/lib/python3.5/dist-packages/layer/net.py", line 438, in train
tf.add_check_numerics_ops()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/numerics.py", line 90, in add_check_numerics_ops
raise ValueError("tf.add_check_numerics_ops() is not compatible "
ValueError: tf.add_check_numerics_ops() is not compatible with TensorFlow control flow operations such as tf.cond() or tf.while_loop().

Why are accuracy and test accuracy so low?

I installed tensorflow 0.11 and PIL 1.17 under Ubuntu 16.04, but when I execute the command:
python train_ocr_layer.py
the accuracy and test accuracy come out as shown below:

I don't know the reason. Could you help me, @pannous?

Error converting denseConv ckpt to pb file

What I did:
python train_ocr_layer.py
After I got the ckpt file, I tried to convert it into a pb file with this code:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def freeze_graph(input_checkpoint, output_graph):
    '''
    :param input_checkpoint:
    :param output_graph: path where the frozen PB model is saved
    :return:
    '''
    model_folder = './checkpoints/'
    checkpoint = tf.train.get_checkpoint_state(model_folder)  # check that a usable ckpt state exists in the folder
    input_checkpoint = checkpoint.model_checkpoint_path  # get the ckpt file path

    # Name of the output node(s); each name must exist in the original model
    output_node_names = "group_deps"
    saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True)
    graph = tf.get_default_graph()  # get the default graph
    input_graph_def = graph.as_graph_def()  # serialized GraphDef of the current graph

    with tf.Session() as sess:
        saver.restore(sess, input_checkpoint)  # restore the graph and load the variable values
        output_graph_def = graph_util.convert_variables_to_constants(  # freeze the model: turn variables into constants
            sess=sess,
            input_graph_def=input_graph_def,  # same as sess.graph_def
            output_node_names=output_node_names.split(","))  # separate multiple output nodes with commas

        with tf.gfile.GFile(output_graph, "wb") as f:  # save the model
            f.write(output_graph_def.SerializeToString())  # serialize and write
        print("%d ops in the final graph." % len(output_graph_def.node))  # number of op nodes in the final graph

        for op in graph.get_operations():
            print(op.name, op.values())

if __name__ == '__main__':

    input_checkpoint = './checkpoints/denseConv0.ckpt'
    output_graph = './checkpoints/frozen_graph.pb'
    freeze_graph(input_checkpoint, output_graph)

After I got the pb file, I couldn't open it with TensorBoard or convert it to ONNX.
I got this error message:

Traceback (most recent call last):
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cenhong/.local/bin/tfpb_tensorboard", line 11, in <module>
    load_entry_point('doml', 'console_scripts', 'tfpb_tensorboard')()
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 33, in main
    tfpb_tensorboard(args.input_path, args.log_path, 6006 if args.port is None else args.port)
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 18, in tfpb_tensorboard
    g_in = tf.import_graph_def(graph_def)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

Does anyone know how to solve this? Thanks.
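A workaround often suggested for this "incompatible with expected float_ref" error is to rewrite the reference-typed batch-norm update ops in the GraphDef before freezing it. The sketch below has not been verified against this model. strip_ref_ops is a hypothetical helper; it only touches the `op`, `input` and `attr` fields of each node and would be called on `input_graph_def.node` before `convert_variables_to_constants`. The demo uses plain stand-in objects instead of real NodeDef messages so that it runs without TensorFlow.

```python
from types import SimpleNamespace


def strip_ref_ops(nodes):
    """Rewrite reference-typed ops in place so that the frozen graph no
    longer expects float_ref inputs. `nodes` is any iterable of NodeDef-like
    objects with `op`, `input` and `attr` fields (e.g. graph_def.node)."""
    for node in nodes:
        if node.op == "RefSwitch":
            node.op = "Switch"
            for i in range(len(node.input)):
                if "moving_" in node.input[i]:
                    # Read the moving average through its /read node.
                    node.input[i] = node.input[i] + "/read"
        elif node.op in ("AssignSub", "AssignAdd"):
            node.op = "Sub" if node.op == "AssignSub" else "Add"
            if "use_locking" in node.attr:
                del node.attr["use_locking"]  # Sub/Add have no such attribute


if __name__ == "__main__":
    # Stand-in node that mimics the AssignMovingAvg op from the traceback.
    demo = SimpleNamespace(
        op="AssignSub",
        input=["model/batchnorm/moving_mean", "delta"],
        attr={"use_locking": False},
    )
    strip_ref_ops([demo])
    print(demo.op, demo.attr)
```

It may also be worth checking the output node: "group_deps" is the default name of a tf.group op, so it is likely a training-time op rather than the prediction tensor, and freezing from the actual inference output could keep AssignMovingAvg out of the frozen graph.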

where is training data?

@pannous, when I train tensorflow-ocr, I can't find the training data that the system uses as input. And when I run ./train_ocr_layer, it throws an error: Exception: BAD FONT: /usr/share/texlive/texmf-dist/fonts/truetype/public/dejavu/DejaVuSansMono-Oblique.ttf.
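As the README says, the training letters are generated from the fonts installed on the computer, so a font that cannot be loaded can stop the run. One way around it is to filter the font list up front and skip the files that fail. The sketch below is an illustration only: usable_fonts is a hypothetical helper, and `load_font` stands for whatever loader the training code uses. With Pillow, for example, one could pass `lambda p: ImageFont.truetype(p, 24)`, assuming the project loads its fonts that way.

```python
def usable_fonts(paths, load_font):
    """Return (good, bad): the paths that `load_font` accepts and the
    paths for which it raises. `load_font` is any callable taking a path."""
    good, bad = [], []
    for path in paths:
        try:
            load_font(path)
            good.append(path)
        except Exception:  # broad on purpose: the loader's error type is unknown
            bad.append(path)
    return good, bad


if __name__ == "__main__":
    # Fake loader for the demo: rejects any path containing "Oblique".
    def fake_loader(path):
        if "Oblique" in path:
            raise IOError("BAD FONT: " + path)

    good, bad = usable_fonts(
        ["DejaVuSans.ttf", "DejaVuSansMono-Oblique.ttf"], fake_loader
    )
    print("usable:", good)
    print("skipped:", bad)
```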