banti_telugu_ocr's People

Contributors

chillaranand, chprasad, rakeshvar

banti_telugu_ocr's Issues

How to use your series of projects to build an OCR for Chinese

I have a lot of XPS/PDF files that can be converted to JPEG files.
1. Do I need to generate millions of Chinese characters, like your "datagen_initio"?
2. What about fonts and encoding for Chinese characters ("Mallicodes")?
3. Do I need to prepare box files generated by antanci_segmenter / OCR Segmenter?

RecursionError occurs while processing

  File "/home/chillaranand/projects/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/ocr/banti_telugu_ocr/banti/linegraph.py", line 43, in process_node
    logd("Processing in {}".format(idx))
  File "/usr/lib/python3.5/logging/__init__.py", line 1266, in debug
    if self.isEnabledFor(DEBUG):
  File "/usr/lib/python3.5/logging/__init__.py", line 1519, in isEnabledFor
    if self.manager.disable >= level:
RecursionError: maximum recursion depth exceeded in comparison
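
A common way around a RecursionError like this one is to replace the recursive walk with an explicit stack. The sketch below is not banti's `process_node`. The `children` mapping and the `visit` callback are hypothetical stand-ins, and the walk is pre-order only, so any work the real method does after returning from a child would need extra bookkeeping. The `seen` set is there because a cycle in the graph would also produce unbounded recursion.

```python
import sys


def walk_iteratively(children, root, visit):
    """Depth-first, pre-order walk using an explicit stack instead of recursion.

    `children` maps a node id to a list of child ids (hypothetical structure).
    """
    stack = [root]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:  # a cycle would otherwise keep the loop running forever
            continue
        seen.add(node)
        visit(node)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(children.get(node, [])))


# A chain far deeper than the interpreter's recursion limit.
depth = sys.getrecursionlimit() * 5
chain = {i: [i + 1] for i in range(depth)}
visited = []
walk_iteratively(chain, 0, visited.append)
print("visited", len(visited), "nodes without recursion")
```

Raising the limit with `sys.setrecursionlimit` is the quicker workaround, but it only helps when the graph is genuinely deep rather than cyclic.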

Facing issues

Hello, I'm seeing the following error. Could you please help me.

Screen Shot 2022-01-19 at 4 45 49 PM

sample

Getting an error when trying to run

Dear Team,
I installed your system and tried to run it using the command
$ python3 recognize.py sample_images/praasa.tif
I got the following error:

Command line Arguments
calibration 1
input_file_or_dir sample_images/praasa.tif
labels_fname labellings/alphacodes.lbl
log_level 20
ngram_fname library/mega.123.pkl
nnet_fname library/nn.pkl
scaler_fname scalings/relative48.scl

Traceback (most recent call last):
  File "recognize.py", line 121, in <module>
    from banti.ocr import OCR
  File "/home/insight/Downloads/banti_telugu_ocr-master/banti/ocr.py", line 6, in <module>
    from .classifier import Classifier
  File "/home/insight/Downloads/banti_telugu_ocr-master/banti/classifier.py", line 5, in <module>
    from theanet.neuralnet import NeuralNet
ImportError: No module named 'theanet'

Please advise what I should do to avoid this error and run the system smoothly.

Awaiting your reply.

Thanks
Anes
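
The ImportError says that Python could not find a top-level module named `theanet` on its import path, which suggests it has to be installed separately from banti. A minimal sketch for checking the dependencies before running `recognize.py` follows. The list of names is an assumption drawn only from the tracebacks and logs in these issues, not from an official requirements file.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module `names` not found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names seen in the tracebacks and logs above (assumed, not exhaustive).
required = ["theanet", "theano", "numpy", "PIL"]
missing = missing_modules(required)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Running this with the same interpreter used for `recognize.py` (here `python3`) matters, since each interpreter or virtualenv has its own set of installed packages.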

Errors: When combining two glyphs

banti was working well with Pillow==3.1.1. On another system, that Pillow version could not be installed, so I installed the latest version, Pillow==3.4.2. Now it throws this error with the given sample file.

chillaranand@pavilion:~/projects/python/ocr/banti_telugu_ocr on git:master o |
→ python recognize.py sample_images/praasa.tif
Command line Arguments
        calibration         1
        input_file_or_dir   sample_images/praasa.tif
        labels_fname        labellings/alphacodes.lbl
        log_level           20
        ngram_fname         library/mega.123.pkl
        nnet_fname          library/nn.pkl
        scaler_fname        scalings/relative48.scl

/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
Initializing the OCR
Compiling full test function...
         OCR initialized.
************************************************************
PROCESSING sample_images/praasa.tif
Classifing glyphs...
/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/page.py:341: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  horz_buffer = np.zeros((self.ht, brick_wd))
/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/numpy/core/numeric.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  a = empty(shape, dtype, order)
/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/page.py:352: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  self.word_closed_arr = self.word_closed_arr[:, brick_wd:-brick_wd]
Finding most likely sentences...
Line  0
Traceback (most recent call last):
  File "recognize.py", line 154, in <module>
    ocr_pattern(args.input_file_or_dir)
  File "recognize.py", line 149, in ocr_pattern
    recognizer.ocr_file(inpt)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/ocr.py", line 63, in ocr_file
    gramgraph.process_tree()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 83, in process_tree
    self.process_node(0)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  [Previous line repeated 26 more times]
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 58, in process_node
    do_combine, new_wt = chld_wt.combine(gc_wt)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/proglyph.py", line 124, in combine
    combined = self + other
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 128, in __add__
    summ = self.__class__()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/proglyph.py", line 100, in __init__
    super().__init__(info)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 29, in __init__
    self.init_from_box_6pack_list(['', 0, 0, 0, 0, 0, 0, 0, 0, None])
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 64, in init_from_box_6pack_list
    self.pix_from_sixpack()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 100, in pix_from_sixpack
    self.set_pix(pix)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 104, in set_pix
    self.img = im.fromarray(255 * (1 - self.pix))
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2187, in fromarray
    return frombuffer(mode, size, obj, "raw", rawmode, 0, 1)
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2114, in frombuffer
    _check_size(size)
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2001, in _check_size
    raise ValueError("Width and Height must be > 0")
ValueError: Width and Height must be > 0
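
The traceback suggests that the blank glyph created in `__add__` (a six-pack of all zeros) ends up as a zero-sized array, which this Pillow version refuses to turn into an image. The warnings earlier in the log point to a separate problem: a non-integer `brick_wd` being used as an array dimension. The sketch below shows two small guards for those cases. Both helper names are hypothetical and nothing here is banti's actual code; it works on plain shape tuples so it runs without NumPy or Pillow.

```python
def is_drawable(shape):
    """True if a 2-D array shape has strictly positive height and width."""
    return len(shape) == 2 and all(int(side) > 0 for side in shape)


def to_int_size(value):
    """Round a possibly fractional pixel count so it can be used as a dimension."""
    return int(round(value))


# A blank glyph has no pixels, so image creation should be skipped for it.
for shape in [(0, 0), (12, 8)]:
    if is_drawable(shape):
        print(shape, "-> safe to pass to an image constructor")
    else:
        print(shape, "-> empty, skip image creation")

# A fractional width is converted once, before it is used to size an array.
brick_wd = to_int_size(3.7)
print("brick_wd =", brick_wd)
```

In a caller such as `set_pix`, a check like `is_drawable(self.pix.shape)` before calling `fromarray` would be one place to apply the first guard, assuming `self.pix` is a NumPy array as the traceback implies.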

Segmentation fault with banti segmenter

OCR works very well with the given sample data.

I tried to convert a test image, and the segmenter throws a segmentation fault.

$ ./banti_segmenter ../ocr/biddala/0-007.converted.tif 
[1]    1695 segmentation fault (core dumped)  ./banti_segmenter ../ocr/biddala/0-007.converted.tif

Full trace:

# anand at anand in ~/projects/python/banti_telugu_ocr on git:master o [16:06:48]
$ python3  recognize.py ../ocr/biddala/0-007.png 
Command line Arguments
        banti_segmenter     ./banti_segmenter
        calibration         1
        input_file_or_dir   ../ocr/biddala/0-007.png
        labels_fname        labellings/alphacodes.lbl
        log_level           20
        ngram_fname         library/mega.123.pkl
        nnet_fname          library/nn.pkl
        scaler_fname        scalings/relative48.scl

Initializing the OCR
Compiling full test function...
Done
Launched command with timeout=10
"convert -units PixelsPerInch ../ocr/biddala/0-007.png -compress Group4 -depth 1 -resample 400 ../ocr/biddala/0-007.converted.tif"
Success
STDOUT:

STDERR:

Launched command with timeout=10
"./banti_segmenter ../ocr/biddala/0-007.converted.tif"
Success
STDOUT:

STDERR:

Launched command with timeout=10
"./banti_segmenter ../ocr/biddala/0-007.converted.tif"
Success
STDOUT:

STDERR:

Traceback (most recent call last):
  File "recognize.py", line 229, in <module>
    recognizer.ocr_box_file(box_fname)
  File "/home/anand/projects/python/banti_telugu_ocr/ocr.py", line 48, in ocr_box_file
    bf = BantryFile(box_fname)
  File "/home/anand/projects/python/banti_telugu_ocr/bantry.py", line 155, in __init__
    in_file = open(name)
FileNotFoundError: [Errno 2] No such file or directory: '../ocr/biddala/0-007.converted.box'
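
In the log above, both segmenter launches are reported as "Success", yet the standalone run segfaults and the `.box` file never appears, so the failure only surfaces later as a FileNotFoundError. A minimal sketch of a stricter launcher follows. The wrapper used by `recognize.py` is not shown in this issue, so `run_and_expect` is a hypothetical helper, not the project's code.

```python
import os
import subprocess
import sys


def run_and_expect(cmd, expected_path, timeout=10):
    """Run `cmd` and confirm it exited cleanly and produced `expected_path`.

    Returns (ok, reason). On POSIX, a negative return code means the process
    was killed by that signal number (for example -11 for SIGSEGV).
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          timeout=timeout)
    if proc.returncode != 0:
        return False, "exit status {}".format(proc.returncode)
    if not os.path.exists(expected_path):
        return False, "missing output {}".format(expected_path)
    return True, "ok"


# Demo with a harmless command that exits 0 but writes nothing.
ok, reason = run_and_expect([sys.executable, "-c", "pass"], "no_such_output.box")
print(ok, reason)
```

With the segmenter, the call would look like `run_and_expect(["./banti_segmenter", tif_path], box_path)`, where `tif_path` and `box_path` are placeholders for the converted image and the box file it is expected to write.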

Image used

0-007
