Currently I am a programmer at Supremo Tribunal Federal.
- ๐ท Working on STF
- ๐ Programming: ๐ Python/Cython / ๐ป C / ๐ฆ Rust / โก Zig
- ๐ค Machine Learning
- ๐ฅ Debian OS
A Python asyncio wrapper for Tesseract-OCR.
License: Apache License 2.0
Currently I am a programmer at Supremo Tribunal Federal.
It would be nice to have access to tessedit_char_whitelist
so that I can whitelist only specific characters.
Hello,
It would be really nice if TESSERACT_CMD could be not constant but configurable to be able to use it in Windows too. Thanks.
Now check exception very hard. Need always take all exception.
May create from one, or two. But it is better from one.
Example:
class TesseractRuntimeError(RuntimeError):
pass
class TesseractError(Exception):
"""Base exception for tesseract"""
def __init__(self, message=""):
self.message = message
def __str__(self):
return self.message
class PSMInvalidException(TesseractError):
def __init__(self, message="PSM Invalid"):
super().__init__(message)
class OEMInvalidException(TesseractError):
def __init__(self, message="OEM Invalid"):
super().__init__(message)
class NoSuchFileException(TesseractError):
def __init__(self, message="No such file"):
super().__init__(message)
class LanguageInvalidException(TesseractError):
def __init__(self, message="Language invalid"):
super().__init__(message)
It would be great if and 'TesseractRuntimeError' was also inherited from 'TesseractError'. On you decision...
Next )
Exists yet problem.
Tesseract by default not need dpi, then it will be tried to select appropriate dpi himself.
# man tesseract
--dpi N
Specify the resolution N in DPI for the input image(s). A typical value for N is 300. Without this option, the resolution is read from the metadata
included in the image. If an image does not include that information, Tesseract tries to guess it.
This is not possible in your implementation. The value for 'dpi' always has something value.
May be change it? If need I may create PR )
Linux
import aiopytesseract
params = await aiopytesseract.tesseract_parameters()
[p for p in params if p.name == 'tessedit_char_blacklist']
[Parameter(name='tessedit_char_blacklist', description='Blacklist of chars not to recogniz', value='-')]
[]
Hi,
Thanks for sharing this code!
There is any option to add this two parameters to image_to_data
function?
Best regards!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.