Comments (12)
Thanks @jbarth-ubhd for the detailed report and analysis.
Simple reason: osd.traineddata is missing. It used to get installed; checking why it no longer is.
from ocrd_all.
Got it!
Line 719 in dab852a
… must now be $(VIRTUAL_ENV)/share/ocrd-resources/ocrd-tesserocr-recognize.
So we have a mismatch between the install-time location and the runtime/resmgr location.
> must now be $(VIRTUAL_ENV)/share/ocrd-resources/ocrd-tesserocr-recognize.
No, that would not work either, because we use configure --prefix=$(VIRTUAL_ENV), so Tesseract is compiled to use share/tessdata under that prefix.
Rather, there was a superfluous environment variable override:
Line 47 in dab852a
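The mismatch discussed here is between where models get installed and where the tools look for them at runtime. A minimal Python sketch for checking which candidate directory actually contains a given model file. `find_model` is a hypothetical helper (not part of ocrd or Tesseract), and the directories in the usage comment simply mirror the two locations named in this thread:

```python
import os
from pathlib import Path


def find_model(name, candidate_dirs):
    """Return the first existing path to `name` among candidate_dirs, else None."""
    for directory in candidate_dirs:
        path = Path(directory) / name
        if path.is_file():
            return path
    return None


# Usage (illustrative; assumes VIRTUAL_ENV is set in the environment):
# venv = os.environ["VIRTUAL_ENV"]
# print(find_model("osd.traineddata", [
#     f"{venv}/share/tessdata",
#     f"{venv}/share/ocrd-resources/ocrd-tesserocr-recognize",
# ]))
```

If this prints None for both locations, the file was never installed, which is a different problem from the tools looking in the wrong place.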
Just wanted to check ocrd resmgr list-available on my workstation (Ubuntu 20.04, Docker; Docker pulled a lot of files for ocrd/all):
jb@pers16:~> alias docker_ocrd
alias docker_ocrd='sudo docker run --user $(id -u) --workdir /data --volume $PWD/data:/data --volume $PWD/models:/usr/local/share/ocrd-resources ocrd/all'
jb@pers16:~> docker_ocrd ocrd resmgr list-available
Traceback (most recent call last):
  File "/usr/local/bin/ocrd", line 33, in <module>
    sys.exit(load_entry_point('ocrd', 'console_scripts', 'ocrd')())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/build/core/ocrd/ocrd/cli/resmgr.py", line 47, in list_available
    resmgr = OcrdResourceManager()
  File "/build/core/ocrd/ocrd/resource_manager.py", line 34, in __init__
    self.user_list.parent.mkdir(parents=True)
  File "/usr/lib/python3.6/pathlib.py", line 1248, in mkdir
    self._accessor.mkdir(self, mode)
  File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
PermissionError: [Errno 13] Permission denied: '/.config/ocrd'
ah... with --volume $PWD/.config:/.config it works
jb@pers16:~> sudo docker run --user $(id -u) --workdir /data --volume $PWD/data:/data --volume $PWD/models:/usr/local/share/ocrd-resources --volume $PWD/.config:/.config ocrd/all ocrd resmgr list-available
ocrd-tesserocr-recognize
- Fraktur_GT4HistOCR.traineddata (https://ub-backup.bib.uni-mannheim.de/~stweil/ocrd-train/data/Fraktur_5000000/tessdata_fast/Fraktur_50000000.334_450937.traineddata)
Tesseract LSTM model trained on GT4HistOCR
- ONB.traineddata (https://ub-backup.bib.uni-mannheim.de/~stweil/ocrd-train/data/ONB/tessdata_best/ONB_1.195_300718_989100.traineddata)
Tesseract LSTM model based on Austrian National Library newspaper data
- equ.traineddata (https://github.com/tesseract-ocr/tessdata_fast/raw/main/equ.traineddata)
Tesseract equ model
...
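The PermissionError on '/.config/ocrd' and the --volume $PWD/.config:/.config workaround suggest that the config directory resolves relative to `/` for the unprivileged container user. A minimal Python sketch of that failure mode, under the assumption (not confirmed in this thread) that the directory is derived from XDG_CONFIG_HOME or HOME, with `/` as the fallback home. `config_dir` and `ensure_config_dir` are hypothetical names, not ocrd API:

```python
from pathlib import Path


def config_dir(env):
    """Hypothetical helper: derive <config home>/ocrd from an environment mapping."""
    # Assumption: XDG_CONFIG_HOME wins, else $HOME/.config, with "/" as fallback home.
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        return Path(xdg) / "ocrd"
    return Path(env.get("HOME", "/")) / ".config" / "ocrd"


def ensure_config_dir(path):
    """Create the directory; return a message instead of raising when not permitted."""
    try:
        path.mkdir(parents=True, exist_ok=True)
    except PermissionError as exc:
        return f"cannot create {path}: {exc.strerror}"
    return f"ok: {path}"
```

Under these assumptions, a user whose home is `/` gets `/.config/ocrd`, which only root can create unless a writable directory is mounted there.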
... almost
jb@pers16:~> docker_ocrd ocrd resmgr download ocrd-tesserocr-recognize configs
12:30:17.190 INFO ocrd.cli.resmgr - Downloading resource {'url': 'https://github.com/tesseract-ocr/tesseract/archive/main.tar.gz', 'name': 'configs', 'description': 'Tesseract configs (parameter sets) for use with the standalone tesseract CLI', 'size': 1915529, 'type': 'tarball', 'path_in_archive': 'tesseract-main/tessdata/configs', 'parameter_usage': 'as-is', 'version_range': '>= 0.0.1'}
12:30:17.193 INFO ocrd.resource_manager._download_impl - Downloading https://github.com/tesseract-ocr/tesseract/archive/main.tar.gz to download.tar.xx
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 175, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
...
Is this caused by my Ubuntu 20.04 setup with dnsmasq in NetworkManager.conf?
root@pers16:/home/jb# cat /etc/NetworkManager/NetworkManager.conf
[main]
plugins=ifupdown,keyfile,ofono
dns=dnsmasq
no-auto-default=00:01:02:12:40:C5,00:21:9B:5E:BE:17,90:1B:0E:42:7D:AE,
[ifupdown]
managed=false
sudo docker run --dns A.B.C.D ... helped.
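To tell a DNS problem inside the container apart from a download problem, name resolution can be tested on its own. A minimal Python sketch using the same `socket.getaddrinfo` call that fails in the traceback; `can_resolve` is a hypothetical helper name and the host in the usage comment is only an example:

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves, False on a resolver error."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    return True


# Usage: run inside the container, with and without --dns, e.g.
# print(can_resolve("github.com"))
```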
BTW no osd.traineddata in ~/models/ocrd-tesserocr-recognize/
> ah... with --volume $PWD/.config:/.config it works
Yes, sorry, we forgot to document this on https://ocr-d.de/en/models#models-and-docker.
Now tracked in OCR-D/ocrd-website#318.
> BTW no osd.traineddata in ~/models/ocrd-tesserocr-recognize/
Like I said above (see PR with fix), TESSDATA_PREFIX must not be set at install time (make all or make install-tesseract).
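One way to make sure the variable is absent for the install step, regardless of what the calling shell exports, is to run make with a filtered environment. A minimal Python sketch; `env_without` is a hypothetical helper, and the subprocess call is left commented out because it assumes you are in the ocrd_all checkout:

```python
import os


def env_without(name, env=None):
    """Return a copy of the environment mapping with `name` removed."""
    source = os.environ if env is None else env
    return {key: value for key, value in source.items() if key != name}


# Usage (not executed here): run the install step with the variable unset.
# import subprocess
# subprocess.run(["make", "install-tesseract"],
#                env=env_without("TESSDATA_PREFIX"), check=True)
```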
> sudo docker run --dns A.B.C.D ... helped.
I remember seeing this problem before. It also happens at build time (docker build). You can also try --network=host or --network=bridge.
sniff