Comments (7)
@madisonmay You can also use kraken's segmentation, which handily produces JSON of bounding boxes (https://github.com/mittagessen/kraken/blob/master/docs/advanced.rst#page-segmentation-and-script-detection). Ocropy has two PRs (ocropus-archive/DUP-ocropy#281 and ocropus-archive/DUP-ocropy#283) which also provide such coordinates as JSON.
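As a rough sketch of how such a bounding-box JSON could be consumed: the snippet below assumes the file has a top-level `boxes` key holding `[x0, y0, x1, y1]` lists, one per text line. That layout, the `load_line_boxes` helper, and the `pad` parameter are assumptions for illustration only, so check them against the output of the segmenter version you actually run.

```python
import json


def load_line_boxes(segmentation_json, pad=0):
    """Parse a segmentation JSON string into (left, top, right, bottom) tuples.

    Assumes a top-level "boxes" key with [x0, y0, x1, y1] entries
    (hypothetical layout, verify against your segmenter's output).
    Boxes are returned in reading order, top to bottom, then left to right.
    """
    data = json.loads(segmentation_json)
    boxes = []
    for x0, y0, x1, y1 in data["boxes"]:
        # Normalize corner order so left <= right and top <= bottom.
        left, right = sorted((x0, x1))
        top, bottom = sorted((y0, y1))
        # Pad each box, clamping the upper-left corner at the image origin.
        boxes.append((max(left - pad, 0), max(top - pad, 0),
                      right + pad, bottom + pad))
    return sorted(boxes, key=lambda b: (b[1], b[0]))


# Made-up example data, not real segmenter output.
example = ('{"text_direction": "horizontal-lr", '
           '"boxes": [[10, 120, 400, 160], [10, 40, 380, 80]]}')
line_boxes = load_line_boxes(example, pad=2)
for box in line_boxes:
    print(box)
```

Each tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the line images could be cut out and passed on to a line-based recognizer afterwards.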
from calamari.
- Unfortunately, there are no updates on the FCN yet
- CPU will be used by default for training and prediction if no valid GPU is available
from calamari.
We are still hoping that you release a text line detection and extraction implementation in calamari in the future.
Thank you.
from calamari.
@ChWick Is there any tool available on GitHub that you would recommend for training text-line detection and extraction?
from calamari.
@mrocr Maybe the line segmenter of T. Breuel https://github.com/NVlabs/ocroseg is what you are looking for.
from calamari.
The developer of Kraken has noted to me that ocroseg doesn't converge, even when using the same uw3 dataset that T. Breuel used.
Source of conversation: mittagessen/seg#4
from calamari.
Hey there @ChWick. The README states that you intend to eventually move the ocropy line segmentation step into calamari. Is there a branch where this work is in progress that I could potentially build off of?
from calamari.