pero-ocr issues from dcgm

training model

Hello again, I'm wondering where I can find the code for training a handwritten text recognition model.
In this repository I can only find code for scoring an existing image, not for training a model.

Music pull request feedback

@vlachvojta :

@vlachvojta with @ikiss-fit :

Check API and web compatibility (after adding line confidence in the OCR engine)

layout detection not good on exercise books

The layout detector identifies a lot of small regions instead of few larger ones.

Do not mix lines when exporting txt

When exporting txt (at least txt), export the lines in the order they appear on the page, or column by column. Currently, they are sometimes ordered rather strangely. See https://pero-ocr.fit.vutbr.cz/ocr/show_results/2269d1ae-d61c-4129-9c7c-3c78bc81cd8b
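A rough sketch of one possible column-by-column ordering, assuming each line can be reduced to a bounding box. `Line` and `reading_order` are hypothetical names for illustration and are not part of pero-ocr.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    text: str
    x_min: float  # left edge of the line's bounding box
    x_max: float  # right edge
    y_min: float  # top edge


def reading_order(lines: List[Line]) -> List[Line]:
    """Group lines into columns left to right, then read each column top to bottom."""
    columns: List[List[Line]] = []
    column_right = float("-inf")
    for line in sorted(lines, key=lambda l: l.x_min):
        if line.x_min >= column_right:
            # The line starts to the right of everything seen so far: new column.
            columns.append([line])
            column_right = line.x_max
        else:
            # The line overlaps the current column horizontally.
            columns[-1].append(line)
            column_right = max(column_right, line.x_max)
    ordered: List[Line] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda l: l.y_min))
    return ordered


page = [
    Line("C", 150, 250, 5),
    Line("A", 0, 100, 10),
    Line("D", 150, 250, 60),
    Line("B", 0, 100, 50),
]
ordered_text = [line.text for line in reading_order(page)]
```

A heading that spans several columns would merge them into one column under this rule, so a real implementation would probably need to treat such wide lines separately.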

Add support for mixed handwritten and printed text

... as well as combined with gothic etc.

Rename "Download Pages" to "Download transcriptions"

Failed line cropping in page_parser

Line crop fails. Job saved at /mnt/matylda1/hradis/PERO/BUGS/a9ccd42b-9b26-40ae-9c3b-6e4d26c21ee0

Processing 4/24 (16.67 %) [id: b0a89e97-5c8a-4511-94db-7fed583bcba9]
Traceback (most recent call last):
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/user_scripts/parse_folder.py", line 172, in <module>
main()
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/user_scripts/parse_folder.py", line 150, in main
page_layout = page_parser.process_page(image, page_layout)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/page_parser.py", line 256, in process_page
page_layout = self.line_cropper.process_page(image, page_layout)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/page_parser.py", line 201, in process_page
line.crop = self.crop_engine.crop(img, line.baseline, line.heights)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/crop_engine.py", line 70, in crop
line_crop = cv2.remap(img_crop, coords[:, :, 0], coords[:, :, 1], interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_TRANSPARENT)
cv2.error: OpenCV(4.0.0) /io/opencv/modules/imgproc/src/imgwarp.cpp:666: error: (-215:Assertion failed) !ssize.empty() in function 'remapBilinear'


ALTO export BUG

Export fails when text line has no points?

For example, document c1951833-8440-4851-93b5-6dfc6c3663bf, second page fe55b56c-341e-48d3-82ac-e3a971a0a124.

Error:
Aug 31 07:59:00 pero-ocr gunicorn[12175]: Traceback (most recent call last):
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
Aug 31 07:59:00 pero-ocr gunicorn[12175]: response = self.full_dispatch_request()
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
Aug 31 07:59:00 pero-ocr gunicorn[12175]: rv = self.handle_user_exception(e)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
Aug 31 07:59:00 pero-ocr gunicorn[12175]: reraise(exc_type, exc_value, tb)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
Aug 31 07:59:00 pero-ocr gunicorn[12175]: raise value
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
Aug 31 07:59:00 pero-ocr gunicorn[12175]: rv = self.dispatch_request()
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
Aug 31 07:59:00 pero-ocr gunicorn[12175]: return self.view_functions[rule.endpoint](**req.view_args)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/flask_login/utils.py", line 272, in decorated_view
Aug 31 07:59:00 pero-ocr gunicorn[12175]: return func(*args, **kwargs)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/pero/pero_ocr_web/app/document/routes.py", line 185, in get_alto_xml
Aug 31 07:59:00 pero-ocr gunicorn[12175]: return create_string_response(filename, page_layout.to_altoxml_string(), minetype='text/xml')
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/pero/pero-ocr/pero_ocr/document_ocr/layout.py", line 335, in to_altoxml_string
Aug 31 07:59:00 pero-ocr gunicorn[12175]: string.set("HEIGHT", str(int((np.max(all_y) - np.min(all_y)))))
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "<__array_function__ internals>", line 6, in amax
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2668, in amax
Aug 31 07:59:00 pero-ocr gunicorn[12175]: keepdims=keepdims, initial=initial, where=where)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: File "/home/pero/env/pero-ocr/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
Aug 31 07:59:00 pero-ocr gunicorn[12175]: return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
Aug 31 07:59:00 pero-ocr gunicorn[12175]: ValueError: zero-size array to reduction operation maximum which has no identity
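The `ValueError` is what a max/min reduction raises on an empty sequence, which fits the guess that the line has no points. A minimal sketch of a guard, assuming the exporter can simply skip such a line; `bounding_box` is a hypothetical helper, not the actual code in layout.py.

```python
from typing import Optional, Sequence, Tuple


def bounding_box(points: Sequence[Sequence[float]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) of the points, or None when there are none."""
    if len(points) == 0:
        # Nothing to measure: let the caller skip this line instead of crashing.
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    return int(x0), int(y0), int(max(xs) - x0), int(max(ys) - y0)


box = bounding_box([(10, 20), (40, 25), (40, 60)])
empty_box = bounding_box([])
```

The caller would then write the HEIGHT and WIDTH attributes only when the result is not `None`.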

Page processing fail in line detection

Processing 20/25 (80.00 %) [id: 371eaaf3-a3e7-45c9-8410-0e0f9ac872da]
Traceback (most recent call last):
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/user_scripts/parse_folder.py", line 172, in <module>
main()
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/user_scripts/parse_folder.py", line 150, in main
page_layout = page_parser.process_page(image, page_layout)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/page_parser.py", line 246, in process_page
page_layout = self.line_parser.process_page(image, page_layout)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/page_parser.py", line 129, in process_page
region = self.assign_lines_to_region(baseline_list, heights_list, textline_list, region)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/document_ocr/page_parser.py", line 115, in assign_lines_to_region
baseline_intersection, textline_intersection = linepp.mask_textline_by_region(baseline, textline, region.polygon)
File "/home/ihradis/projects/2018-01-15_PERO/pero-ocr-live/pero_ocr/line_engine/line_postprocessing.py", line 179, in mask_textline_by_region
baseline_is = region_shpl.intersection(baseline_shpl)
File "/home/ihradis/env/tf/lib/python3.6/site-packages/shapely/geometry/base.py", line 620, in intersection
return geom_factory(self.impl['intersection'](self, other))
File "/home/ihradis/env/tf/lib/python3.6/site-packages/shapely/topology.py", line 70, in __call__
self._check_topology(err, this, other)
File "/home/ihradis/env/tf/lib/python3.6/site-packages/shapely/topology.py", line 38, in _check_topology
self.fn.__name__, repr(geom)))
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.polygon.Polygon object at 0x7f57dc052be0>

Transcription

For transcribing old Latin text, which model should I select to run OCR on the image below?

Problem with the pretrained model not available

File "/usr/local/lib/python3.9/dist-packages/torch/jit/_serialization.py", line 149, in load
raise ValueError(f"The provided filename {f} does not exist") # type: ignore[str-bytes-safe]
ValueError: The provided filename /opt/pero/pero-ocr/ocr_model/checkpoint_646000.ckpt does not exist

Can't install through pip

Hi, I'm trying to use this repository in a college project, but I can't seem to run pip install pero-ocr.

I'm getting the following error

The conflict is caused by:
    pero-ocr 0.5 depends on tensorflow-gpu==1.15
    pero-ocr 0.4 depends on tensorflow-gpu==1.15
    pero-ocr 0.3 depends on tensorflow-gpu==1.15
    pero-ocr 0.2 depends on tensorflow-gpu==1.14
    pero-ocr 0.1.1 depends on tensorflow-gpu==1.14

But when trying to install that version of tensorflow-gpu, I can't seem to get a valid version.

Thank you.
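A likely cause, as far as I can tell: the TensorFlow 1.x wheels on PyPI were never built for Python 3.8 or newer, so pip on a recent interpreter finds no matching `tensorflow-gpu==1.15`. A small check, where the function name is made up for illustration:

```python
import sys
from typing import Tuple


def python_too_new_for_tf1(version: Tuple[int, int] = tuple(sys.version_info[:2])) -> bool:
    """True when the interpreter is newer than the last Python with TensorFlow 1.x wheels (3.7)."""
    return tuple(version[:2]) >= (3, 8)


if python_too_new_for_tf1():
    print("TensorFlow 1.x has no wheels for this Python; try a Python 3.7 environment.")
```

If this prints the warning, creating a separate Python 3.7 environment is probably the simplest way to get the pinned dependency to resolve.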

Website typo Layout Analysis

I suppose website related issues can also be mentioned here.

I noticed a typo for selecting the layout analysis.
Shouldn't "Select baseline detector" be "Select layout detector"?

Add support for vertical text

example: see "tabulka - text svisle" job

Add support for upside-down text

example: https://pero-ocr.fit.vutbr.cz/ocr/show_results/b7cd8304-aed4-4857-89a1-2410826478f3

FIX: Switch WIDTH and HEIGHT in ALTO export

add support for gif format

Where can we find the pretrained models?

Are downloadable model checkpoints available somewhere?

Page color does not switch to DONE if some lines were deleted.

Add region categories

Internal export: (pseudo PageXML)

All regions are RegionLayout with category attribute (saved to XML as TextRegion element with category in custom attribute)
Set OCR/OMR Engines to work only with some types of lines
Set Layout Engines to work only with some types of regions
Merging overlapping regions. (A text layout engine which detects a region/line inside another region adds its lines to the given region, using geometry and coords to determine whether a region/line lies inside a region.) - not a useful feature

ALTO - word blocks are shifted

Line crop fails, probably due to an empty mapping

Error log:
line_coords = self.get_crop_inputs(baseline, height, self.line_height)
Traceback (most recent call last):
File "/home/pero/PERO/pero-ocr/user_scripts/parse_folder.py", line 176, in main
page_layout = page_parser.process_page(image, page_layout)
File "/home/pero/PERO/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 408, in process_page
page_layout = self.line_cropper.process_page(image, page_layout)
File "/home/pero/PERO/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 348, in process_page
line.crop = self.crop_engine.crop(img, line.baseline, line.heights)
File "/home/pero/PERO/pero-ocr/pero_ocr/document_ocr/crop_engine.py", line 78, in crop
interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
cv2.error: OpenCV(4.2.0) /io/opencv/modules/imgproc/src/imgwarp.cpp:1703: error: (-215:Assertion failed) !_map1.empty() in function 'remap'

XML headers

As mentioned in issue #49, Pero generates ALTO files without proper XML headers (<?xml version='1.0' encoding='utf-8'?>). Was that intended, or could that be fixed?
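For reference, a sketch of how the declaration can be emitted with the standard library's ElementTree. Whether pero-ocr builds its ALTO tree this way is an assumption, and `to_xml_bytes` is a hypothetical helper.

```python
import io
import xml.etree.ElementTree as ET


def to_xml_bytes(root: ET.Element) -> bytes:
    """Serialize the tree as UTF-8 bytes, including the XML declaration."""
    buffer = io.BytesIO()
    # xml_declaration=True forces the <?xml ...?> header, which is omitted by default for UTF-8.
    ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
    return buffer.getvalue()


root = ET.Element("alto")
ET.SubElement(root, "Layout")
xml_bytes = to_xml_bytes(root)
```

The resulting bytes start with the header quoted above and still parse back to the same tree.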

problem of numpy version

Hello, when running the example from "Integration of the pero-ocr python module", I ran into a problem with the numpy version. The error was:

AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

If I downgrade numpy, then scipy, numba, etc. also have to be downgraded for compatibility, but many of the older versions cannot be installed on my computer. What do you suggest? Thanks in advance!
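The aliases were removed in NumPy 1.24, so pinning `numpy<1.24` is one option. Another possible workaround that avoids downgrading is to put the removed aliases back before importing the library. This is only a sketch: `restore_removed_aliases` is a hypothetical helper, and a stand-in module is used here so that the snippet runs without NumPy installed.

```python
import types

# Aliases that NumPy removed; each one was just another name for the Python builtin.
REMOVED_ALIASES = {"float": float, "int": int, "bool": bool, "object": object, "complex": complex, "str": str}


def restore_removed_aliases(module) -> list:
    """Re-attach missing aliases to the given module and return the names that were added."""
    restored = []
    for name, builtin_type in REMOVED_ALIASES.items():
        if not hasattr(module, name):
            setattr(module, name, builtin_type)
            restored.append(name)
    return restored


# Demonstration on a stand-in module that already has `int` but lacks `float`.
fake_numpy = types.ModuleType("fake_numpy")
fake_numpy.int = int
restored = restore_removed_aliases(fake_numpy)

# Intended real use (assumption: run this before importing pero_ocr):
#   import numpy as np
#   restore_removed_aliases(np)
```

This patches the module globally for the running process, so it is a stopgap rather than a fix; the cleaner solution is replacing `np.float` with `float` in the library code itself.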

PERO generates 0kB ALTO files

checking for zero ALTO file size would probably help
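Such a check could look roughly like this; `find_empty_files` is a hypothetical helper and the `*.xml` pattern is an assumption about how the exported files are named.

```python
import tempfile
from pathlib import Path
from typing import List


def find_empty_files(folder, pattern: str = "*.xml") -> List[Path]:
    """Return all files under `folder` matching `pattern` whose size is 0 bytes."""
    return sorted(
        path for path in Path(folder).rglob(pattern)
        if path.is_file() and path.stat().st_size == 0
    )


# Small demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ok.xml").write_text("<alto/>")
    (Path(tmp) / "broken.xml").touch()
    empty_names = [path.name for path in find_empty_files(tmp)]
```

Running this after each export and reporting (or re-exporting) the listed files would at least make the failure visible.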

Getting KeyError

I tried pero-ocr on a PNG image containing a table and text, but got the error below. How do I resolve this?

.

Clustering layout probably fails on pages/regions with no lines?

Data in BUGS/a69eb9c4-ae17-4429-aa70-c636ee0051b0
log:
ERROR: Failed to process file 9d24471a-280b-4e2b-a175-d65910c7c548.
need at least one array to concatenate
Traceback (most recent call last):
File "/home/pero/PERO/pero-ocr/user_scripts/parse_folder.py", line 176, in main
page_layout = page_parser.process_page(image, page_layout)
File "/home/pero/PERO/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 404, in process_page
page_layout = self.layout_parser.process_page(image, page_layout)
File "/home/pero/PERO/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 141, in process_page
polygons_list, baselines_list, heights_list, textlines_list = self.region_engine.detect(img)
File "/home/pero/PERO/pero-ocr/pero_ocr/region_engine/region_engine_splic.py", line 65, in detect
region_poly_points = np.concatenate(region_textlines, axis=0)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate

Where should the model for the region detector be placed?

I ran a script with layout detection.
The class EngineRegionDetector fails at line 75 with this error:
Cannot interpret feed_dict key as Tensor: The name 'inference_input:0' refers to a Tensor which does not exist. The operation, 'inference_input', does not exist in the graph.

Out of memory in layout parsing

~/PERO/ocr_client_data/97a7fb57-93ed-4b02-a7d4-de313daca5bf/images/a0126ddf-ed27-4ab5-90d1-4257b1fa6d23.jpeg

OMR transformers produce nonsense transcriptions

Could be due to different input size

1. Test if OCR Transformers work.
2. Train OCR Transformer with different input size and test it.
3. Re-check network input.
4. If 2 works and 3 is not conclusive, re-train OMR models.

Layout analysis crashes

Crashed on two files in my new collection. Problem in live system.

Job ID: fb48773658124afab23ac9854ea5e56d
Document ID: 1e4d33dc189c4a2bb93eaebf722432e4
Image: 9823218f-12c1-4ede-ba68-897e055e5580
Errors:
Processing 9823218f-12c1-4ede-ba68-897e055e5580
ERROR: Failed to process file 9823218f-12c1-4ede-ba68-897e055e5580.
The operation 'GEOSUnion_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.polygon.Polygon object at 0x7f249c0cd050>

Traceback (most recent call last):
File "/home/pero/pero/pero-ocr/user_scripts/parse_folder.py", line 205, in main
page_layout = page_parser.process_page(image, page_layout)
File "/home/pero/pero/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 372, in process_page
page_layout = layout_parser.process_page(image, page_layout)
File "/home/pero/pero/pero-ocr/pero_ocr/document_ocr/page_parser.py", line 169, in process_page
p_list, b_list, h_list, t_list = self.engine.detect(img, rot=rot)
File "/home/pero/pero/pero-ocr/pero_ocr/layout_engines/cnn_layout_engine.py", line 127, in detect
region_poly = helpers.region_from_textlines(region_textlines)
File "/home/pero/pero/pero-ocr/pero_ocr/layout_engines/layout_helpers.py", line 100, in region_from_textlines
region_poly = region_poly.union(textline_poly)
File "/home/pero/python_environment/pero_ocr_web_clients/lib/python3.7/site-packages/shapely/geometry/base.py", line 658, in union
return geom_factory(self.impl['union'](self, other))
File "/home/pero/python_environment/pero_ocr_web_clients/lib/python3.7/site-packages/shapely/topology.py", line 70, in __call__
self._check_topology(err, this, other)
File "/home/pero/python_environment/pero_ocr_web_clients/lib/python3.7/site-packages/shapely/topology.py", line 38, in _check_topology
self.fn.__name__, repr(geom)))
shapely.errors.TopologicalError: The operation 'GEOSUnion_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.polygon.Polygon object at 0x7f249c0cd050>
TopologyException: Input geom 1 is invalid: Self-intersection at or near point 2347.0777238895662 -44.069123013668701 at 2347.0777238895662 -44.069123013668701
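The last line points at a self-intersecting polygon. In Shapely, `polygon.is_valid` reports this and `polygon.buffer(0)` is a common way to repair it before calling `union`. To show what the check means without depending on Shapely, here is a plain-Python sketch that tests every pair of non-adjacent edges; all names are hypothetical and only proper crossings are detected, not touching or overlapping edges.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orientation(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a); the sign tells on which side of ab the point c lies."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True if segment p1-p2 and segment p3-p4 cross at a single interior point."""
    d1 = _orientation(p3, p4, p1)
    d2 = _orientation(p3, p4, p2)
    d3 = _orientation(p1, p2, p3)
    d4 = _orientation(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_self_intersecting(ring: List[Point]) -> bool:
    """Check a polygon outline given as vertices, with the first point not repeated at the end."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Neighbouring edges share a vertex and cannot properly cross.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
bow_tie = [(0, 0), (2, 2), (2, 0), (0, 2)]
square_crosses = is_self_intersecting(square)
bow_tie_crosses = is_self_intersecting(bow_tie)
```

The pairwise test is quadratic in the number of vertices, which should be acceptable for text line polygons but not for very large outlines.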
