
tensorflow-ocr's Introduction

tensorflow-ocr

🖺 OCR using tensorflow with attention, batteries included

Installation

git clone --recursive http://github.com/pannous/tensorflow-ocr
# sudo apt install python3-pip
cd tensorflow-ocr
pip install -r requirements.txt

Evaluation

You can detect the text under your mouse pointer with mouse_prediction.py.

It takes about 10 seconds to load the network and start up; after that it should return multiple results per second.

text_recognizer.py

To combine our approach with real-world images, we forked the EAST bounding-box detector.

Customized training

To get started with a minimal example similar to the famous MNIST, try ./train_letters.py. It automatically generates letters in all the fonts installed on your computer, in many different shapes, and trains on them.

For the full model used in the demo, run ./train.py.

tensorflow-ocr's People

Contributors

imgbotapp, loongliu, pannous

tensorflow-ocr's Issues

ImportError: No module named layer

layer.py is missing.


localhost:tensorflow-ocr didi$ ./train_ocr_layer.py
Traceback (most recent call last):
  File "./train_ocr_layer.py", line 3, in <module>
    import layer
ImportError: No module named layer
localhost:tensorflow-ocr didi$ pip install layer
Collecting layer
Could not find a version that satisfies the requirement layer (from versions: )
No matching distribution found for layer

test accuracy so low

Hi,

After running all 500000 steps, the test accuracy is only 0.4375. Is that right?

[image attachment]

beginning to do research on OCR using tensorflow

Dear sir,

I have just started learning about neural networks, and I found TensorFlow easy to use.

I am learning TensorFlow by running the examples from this tutorial, https://github.com/nlintz/TensorFlow-Tutorials, and by reading the book at deeplearningbook.org.

My goal is to use a neural network to recognize individual letters and digits (a...z, A...Z, 0...9), which means 62 classes.
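As a rough illustration of the 62-class setup described above, here is a minimal sketch of a label encoding. The names CLASSES, one_hot and decode are hypothetical and do not come from the tensorflow-ocr code; the only thing a classifier would need to change relative to a 10-class MNIST model is the size of its output layer.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 classes (assumed ordering).
CLASSES = string.ascii_lowercase + string.ascii_uppercase + string.digits
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CLASSES)}


def one_hot(ch):
    """Return a 62-element one-hot list for a single character."""
    vec = [0.0] * len(CLASSES)
    vec[CHAR_TO_INDEX[ch]] = 1.0
    return vec


def decode(scores):
    """Map a list of 62 class scores back to the highest-scoring character."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best]
```

For example, decode(one_hot("Q")) gives back "Q", and a network's 62 output scores can be passed to decode in the same way.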

I can segment each letter from the picture using Qt.

Now I am confused about how to do OCR with TensorFlow.
There are a lot of projects about "Handwriting recognition" but only a few about "OCR for printed documents", and as a newbie I really don't know how or where to begin.

Would you mind helping me?

1/ Can I use the same MNIST classifiers (http://yann.lecun.com/exdb/mnist/) to recognize letters of the English alphabet? MNIST classifiers recognize digits (0 - 9), so I don't know whether they can handle 62 classes.

2/ Do you have any documents or papers about "OCR for printed documents"? I really lack knowledge here.

3/ I found your project tensorflow-OCR. Would you mind if I learn from your code?

4/ I get an error when I run ./train_ocr_layer.py:

(tensorflow)khoa@khoa:~/tensorflow/tensorflow-ocr$ ./train_ocr_layer.py 
ls: cannot access /tmp/tensorboard_logs/: No such file or directory
Traceback (most recent call last):
  File "./train_ocr_layer.py", line 4, in <module>
    import layer
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/__init__.py", line 1, in <module>
    from net import *
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/net.py", line 8, in <module>
    set_tensorboard_run(auto_increment=True)
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 18, in set_tensorboard_run
    run_nr = get_last_tensorboard_run_nr()
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 8, in get_last_tensorboard_run_nr
    logs=subprocess.check_output(["ls", tensorboard_logs]).split("\n")
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['ls', '/tmp/tensorboard_logs/']' returned non-zero exit status 2

Thank you and regards,
Khoa
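The traceback above comes from running ls on a log directory that does not exist yet. One possible workaround, sketched here under assumptions, is to create the directory first and list it with the standard library instead of a subprocess. The function name get_last_run_nr and the idea that runs are stored as numerically named sub-directories are assumptions for illustration, not the actual layout used by layer/tensorboard_util.py. Note that exist_ok requires Python 3, while the traceback shows Python 2.7.

```python
import os


def get_last_run_nr(log_dir):
    """Return the highest numerically named entry in log_dir, or 0 if none exists.

    Assumption: each run is stored under a name such as "1", "2", ...
    """
    os.makedirs(log_dir, exist_ok=True)  # create the directory instead of failing
    runs = [int(name) for name in os.listdir(log_dir) if name.isdigit()]
    return max(runs, default=0)
```

A simpler manual workaround that follows directly from the error message is to create /tmp/tensorboard_logs/ by hand before starting the script.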

where is current_model.h5

Can you please share your model architecture? I'm trying text_recognizer, but it needs current_model.h5, which is not available.

Thanks.

OSError: Unable to open file (file signature not found)

When I try to run mouse_prediction.py, I get the following error message:

\AppData\Local\Programs\Python\Python37\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
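"File signature not found" means h5py did not find the HDF5 signature in the file it was asked to open. A quick way to check a model file before loading it is to compare its first 8 bytes with that signature. This is only a diagnostic sketch: the function name looks_like_hdf5 is hypothetical, and it checks offset 0 only, which is the usual location but not the only one the HDF5 format allows.

```python
# The 8-byte signature that starts a regular HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at path begins with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

If this returns False for the .h5 file the script tries to load, the file may be empty, truncated, or a text placeholder rather than a real model, and would need to be obtained again.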

where is the data

I am running train_letters.py and it returns "No such file or directory: 'fonts.list'".
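The README says the training script generates letters from the fonts installed on the computer, so one plausible workaround is to build the missing list yourself. The sketch below scans some directories for font files and writes their paths to fonts.list. The file format (one path per line), the function names and the directories to scan are all assumptions; the format the project actually expects is not documented here.

```python
import os

FONT_EXTENSIONS = (".ttf", ".otf")


def find_fonts(font_dirs):
    """Recursively collect font file paths from the given directories."""
    found = []
    for font_dir in font_dirs:
        # os.walk silently yields nothing for directories that do not exist.
        for root, _dirs, files in os.walk(font_dir):
            for name in files:
                if name.lower().endswith(FONT_EXTENSIONS):
                    found.append(os.path.join(root, name))
    return sorted(found)


def write_font_list(font_dirs, out_path="fonts.list"):
    """Write one font path per line to out_path (assumed format) and return the paths."""
    fonts = find_fonts(font_dirs)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(fonts))
    return fonts
```

On a typical Linux system this might be called as write_font_list(["/usr/share/fonts"]), but the right directories depend on the platform.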

text.py

Where is the file?
Line 13 of text.py calls "open (word_file).read().splitlines()",
but there is no file '/usr/share/dict/words'.
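The word list at /usr/share/dict/words is not present on every system; on Debian and Ubuntu it is typically provided by a word-list package such as wamerican. As an alternative, the loading code could fall back to a small built-in list when the file is missing. This is a sketch only: load_words and FALLBACK_WORDS are hypothetical names, and the fallback words are placeholders rather than anything the project ships.

```python
import os

# Placeholder vocabulary used only when no system word list is available.
FALLBACK_WORDS = ["tensorflow", "ocr", "letter", "font", "image"]


def load_words(word_file="/usr/share/dict/words"):
    """Return the non-empty lines of word_file, or a small fallback list if it is missing."""
    if os.path.isfile(word_file):
        with open(word_file, encoding="utf-8", errors="ignore") as f:
            return [w for w in f.read().splitlines() if w]
    return list(FALLBACK_WORDS)
```

A tiny fallback list is enough to get the script running, but training on realistic text would still need a proper word list.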

License

Hi,

What's the license of this project?

train_letters.py not running

Traceback (most recent call last):
  File "train_letters.py", line 61, in <module>
    net.train(data=data, dropout=.6, display_step=10, test_step=1000) # run resume
  File "/usr/local/lib/python3.5/dist-packages/layer/net.py", line 438, in train
    tf.add_check_numerics_ops()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/numerics.py", line 90, in add_check_numerics_ops
    raise ValueError("tf.add_check_numerics_ops() is not compatible "
ValueError: tf.add_check_numerics_ops() is not compatible with TensorFlow control flow operations such as tf.cond() or tf.while_loop().

why are accuracy and test accuracy so low?

I installed tensorflow 0.11 and PIL 1.17 under Ubuntu 16.04, but when I execute the command:
python train_ocr_layer.py
the resulting accuracy and test accuracy are as shown below:
[image: qq 20161129150742]
I don't know the reason. Could you help me? @pannous

error converting denseConv ckpt to pb file

What I did:
python train_ocr_layer.py
After I got the ckpt file, I tried to convert it into a pb file with this code:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def freeze_graph(input_checkpoint, output_graph):
    '''
    :param input_checkpoint: checkpoint path (overridden below by the latest checkpoint in model_folder)
    :param output_graph: path where the PB model is saved
    :return:
    '''
    model_folder = './checkpoints/'
    checkpoint = tf.train.get_checkpoint_state(model_folder)  # check whether the ckpt files in the directory are usable
    input_checkpoint = checkpoint.model_checkpoint_path  # get the ckpt file path

    # Name of the output node(s); each name must exist in the original model
    output_node_names = "group_deps"
    saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True)
    graph = tf.get_default_graph()  # get the default graph
    input_graph_def = graph.as_graph_def()  # serialized representation of the current graph

    with tf.Session() as sess:
        saver.restore(sess, input_checkpoint)  # restore the graph and load the variable values
        output_graph_def = graph_util.convert_variables_to_constants(  # freeze the model: turn variables into constants
            sess=sess,
            input_graph_def=input_graph_def,  # same as sess.graph_def
            output_node_names=output_node_names.split(","))  # separate multiple output nodes with commas

        with tf.gfile.GFile(output_graph, "wb") as f:  # save the model
            f.write(output_graph_def.SerializeToString())  # write the serialized graph
        print("%d ops in the final graph." % len(output_graph_def.node))  # number of op nodes in the frozen graph

        for op in graph.get_operations():
            print(op.name, op.values())

if __name__ == '__main__':

    input_checkpoint = './checkpoints/denseConv0.ckpt'
    output_graph = './checkpoints/frozen_graph.pb'
    freeze_graph(input_checkpoint, output_graph)

After I got the pb file, I couldn't open it with TensorBoard or convert it to ONNX.
I got this error message:

Traceback (most recent call last):
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cenhong/.local/bin/tfpb_tensorboard", line 11, in <module>
    load_entry_point('doml', 'console_scripts', 'tfpb_tensorboard')()
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 33, in main
    tfpb_tensorboard(args.input_path, args.log_path, 6006 if args.port is None else args.port)
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 18, in tfpb_tensorboard
    g_in = tf.import_graph_def(graph_def)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

Does anyone know how to solve this? Thanks.

where is training data?

@pannous, when I train tensorflow-ocr, I can't find the training data that the system takes as input. Also, when I run ./train_ocr_layer, it throws an error: Exception: BAD FONT: /usr/share/texlive/texmf-dist/fonts/truetype/public/dejavu/DejaVuSansMono-Oblique.ttf.
