Comments (7)
While re-testing the Calamari .yml installation files, I found some conflicts with the latest Calamari master.
Don't worry, I am working on pushing updated .yml files.
The updated environment files will:
1- Remove the version pin from most packages, so that Conda can resolve the versions itself.
2- Pin the version of only those packages that determine the rest.
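To illustrate the two points above, here is a minimal sketch of how pinned specs could be reduced to bare package names while a few chosen packages keep their pin. The helper names strip_pin and unpin are hypothetical, and keeping python pinned is only an assumption for the example; the actual updated .yml files may differ.

```python
import re

# Matches the leading package name of a conda-style spec,
# e.g. "scipy" in "scipy==1.1.0=py36hfa4b5c9_1".
_NAME = re.compile(r"^\s*([A-Za-z0-9_.\-]+)")


def strip_pin(spec: str) -> str:
    """Return only the package name of a pinned spec (hypothetical helper)."""
    match = _NAME.match(spec)
    if match is None:
        raise ValueError(f"not a package spec: {spec!r}")
    return match.group(1)


def unpin(specs, keep=()):
    """Drop version pins, except for packages whose name is listed in keep."""
    return [s if strip_pin(s) in keep else strip_pin(s) for s in specs]


# The first two specs come from the error message; "python=3.6" is an
# illustrative assumption for a package that keeps its pin.
pinned = ["mkl==2019.0=118", "scipy==1.1.0=py36hfa4b5c9_1", "python=3.6"]
unpinned = unpin(pinned, keep={"python"})
print(unpinned)
```

With the pins on mkl and scipy gone, the solver is free to pick a mutually compatible pair instead of failing on the fixed builds.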
Error message:
home@home-lnx:~/Desktop/programs/calamari$ conda env create -f environment_master_cpu.yml
Solving environment: failed
UnsatisfiableError: The following specifications were found to be in conflict:
- mkl==2019.0=118
- scipy==1.1.0=py36hfa4b5c9_1 -> mkl[version='>=2018.0.3,<2019.0a0']
Use "conda info <package>" to see the dependencies for each package.
from calamari.
- Solved the conflicts by removing the package versions.
- Solved the TensorFlow issue by giving the conda defaults channel priority over conda-forge.
Topic: Branch 41-on-the-fly-data-loading
@ChWick I just tested the 41-on-the-fly-data-loading branch.
I created a Conda Python 3.7 environment and installed everything with conda except TensorFlow, which I installed using pip install tf-nightly. Here is the error message when trying to train:
(calamari) home@home-lnx:~/Desktop/Untitled Folder/t1$ calamari-train --files *.png
Resolving input files
Found 101 files in the dataset
Preloading dataset type DataSetMode.TRAIN with size 101
Loading Dataset: 100%|██████████████████████| 101/101 [00:00<00:00, 1408.48it/s]
Text Preprocessing: 100%|███████████████████| 101/101 [00:00<00:00, 3822.33it/s]
Data Preprocessing: 100%|████████████████████| 101/101 [00:00<00:00, 142.98it/s]
Computing codec: 100%|████████████████████| 101/101 [00:00<00:00, 247011.49it/s]
CODEC: ['', 'ء', 'ؤ', 'ئ', 'ا', 'ب', 'ة', 'ت', 'ث', 'ج', 'ح', 'خ', 'د', 'ذ', 'ر', 'ز', 'س', 'ش', 'ص', 'ض', 'ط', 'ظ', 'ع', 'غ', 'ف', 'ق', 'ك', 'ل', 'م', 'ن', 'ه', 'و', 'ى', 'ي']
2018-12-12 01:22:59.545502: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3292290000 Hz
2018-12-12 01:22:59.545806: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x3df8f70 executing computations on platform Host. Devices:
2018-12-12 01:22:59.545832: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:415: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
tf.py_function, which takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:257: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_initializable_iterator(dataset)`.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:1401: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:92: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:101: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:104: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Using CUDNN compatible LSTM backend on CPU
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/contrib/rnn/python/ops/rnn.py:233: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/ops/rnn.py:443: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:181: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Using CUDNN compatible LSTM backend on CPU
Traceback (most recent call last):
File "/home/home/anaconda3/envs/calamari/bin/calamari-train", line 11, in <module>
load_entry_point('calamari-ocr==0.2.2', 'console_scripts', 'calamari-train')()
File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/scripts/train.py", line 303, in main
run(args)
File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/scripts/train.py", line 295, in run
progress_bar=not args.no_progress_bars
File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/trainer.py", line 172, in train
self._run_train(train_net, test_net, codec, train_start_time, progress_bar)
File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/trainer.py", line 194, in _run_train
validation_txts = validation_dataset.preloaded_texts
AttributeError: 'NoneType' object has no attribute 'preloaded_texts'
@mrocr Should be fixed. Do not forget to test the --train_data_on_the_fly parameter.
Topic: Branch 41-on-the-fly-data-loading | Error: Argument list too long
Calamari-fly is working on smaller samples of 100 images, but when testing on a sample of more than 1,000,000 images the error Argument list too long appears. This is expected: the shell expands the glob into one argument per file, and the result exceeds the operating system's limit on the size of an argument list.
@ChWick Do you recommend a way to overcome this? Perhaps loading a list generated with find?
find -type f -name '*.png' > list.txt
(calamari) home@home-lnx:~/Desktop/Untitled Folder$ calamari-train --train_data_on_the_fly --files ./training_data/*.png
bash: /home/home/anaconda3/envs/calamari/bin/calamari-train: Argument list too long
@mrocr Just escape the asterisk and let calamari resolve the files: calamari-train --train_data_on_the_fly --files ./training_data/\*.png. Otherwise your shell (e.g. zsh) resolves it.
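Why escaping helps can be sketched in a few lines: when the pattern reaches the program as one literal argument, the program can expand it itself, and the file names never pass through the command line. This is an illustrative sketch only, not Calamari's actual code; resolve_files is a hypothetical name and the demo files are created in a temporary directory.

```python
import glob
import os
import tempfile


def resolve_files(patterns):
    """Expand glob patterns inside the process instead of in the shell."""
    files = []
    for pattern in patterns:
        # Each pattern arrives as a single argument, however many files it matches.
        files.extend(sorted(glob.glob(pattern)))
    return files


# Demo on a throwaway directory with three empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        open(os.path.join(tmp, f"line_{i}.png"), "w").close()
    found = resolve_files([os.path.join(tmp, "*.png")])
    found_names = [os.path.basename(path) for path in found]

print(found_names)
```

The command line then carries a single short string, so the number of matching files no longer counts against the argument-list limit.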
@ChWick You really did it, man. Thanks for the stream-training and the codec white-listing. You're amazing!