General about calamari [CLOSED]

calamari-ocr commented on August 16, 2024
General

from calamari.

Comments (7)

commented on August 16, 2024

While re-testing the Calamari .yml installation files, I found some conflicts with the latest Calamari master.
Don't worry, I am working on pushing updated .yml files.

The updated environment files will:
1. Remove the version pin from most packages, so that Conda can resolve the versions itself.
2. Pin the version of only those packages that determine the rest.
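
The two changes listed above can be sketched in Python. This is only an illustration of the idea, not the script used to produce the updated .yml files: the helper name `relax_pins`, the `keep_pinned` parameter, and the `python=3.6` entry are assumptions made for the example, while the `mkl` and `scipy` specs are taken from the error message in this comment.

```python
import re


def relax_pins(specs, keep_pinned):
    """Drop version pins from conda-style specs, except for names in keep_pinned.

    Hypothetical helper for illustration; it is not part of Calamari or conda.
    """
    relaxed = []
    for spec in specs:
        spec = spec.strip()
        # The package name is everything before the first version operator or space.
        name = re.split(r"[=<>!~ ]", spec, maxsplit=1)[0]
        relaxed.append(spec if name in keep_pinned else name)
    return relaxed


# "python=3.6" is an assumed entry; the other two come from the conflict report.
specs = ["python=3.6", "mkl==2019.0=118", "scipy==1.1.0=py36hfa4b5c9_1"]
print(relax_pins(specs, keep_pinned={"python"}))
```

With only `python` kept pinned, the solver is free to pick an `mkl` build that satisfies the constraint declared by `scipy`.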

Error message:

home@home-lnx:~/Desktop/programs/calamari$ conda env create -f environment_master_cpu.yml
Solving environment: failed

UnsatisfiableError: The following specifications were found to be in conflict:
  - mkl==2019.0=118
  - scipy==1.1.0=py36hfa4b5c9_1 -> mkl[version='>=2018.0.3,<2019.0a0']
Use "conda info <package>" to see the dependencies for each package.

commented on August 16, 2024

#50

  • Solved the conflicts by removing the package version pins.
  • Solved the TensorFlow issue by giving the conda defaults channel priority over conda-forge.

commented on August 16, 2024

Topic: Branch 41-on-the-fly-data-loading

@ChWick I just tested the 41-on-the-fly-data-loading branch.
I created a Conda Python 3.7 environment and installed everything with conda except TensorFlow, which I installed with pip install tf-nightly. Here is the error message I get when trying to train:

(calamari) home@home-lnx:~/Desktop/Untitled Folder/t1$ calamari-train --files *.png
Resolving input files
Found 101 files in the dataset
Preloading dataset type DataSetMode.TRAIN with size 101
Loading Dataset: 100%|██████████████████████| 101/101 [00:00<00:00, 1408.48it/s]
Text Preprocessing: 100%|███████████████████| 101/101 [00:00<00:00, 3822.33it/s]
Data Preprocessing: 100%|████████████████████| 101/101 [00:00<00:00, 142.98it/s]
Computing codec: 100%|████████████████████| 101/101 [00:00<00:00, 247011.49it/s]
CODEC: ['', 'ء', 'ؤ', 'ئ', 'ا', 'ب', 'ة', 'ت', 'ث', 'ج', 'ح', 'خ', 'د', 'ذ', 'ر', 'ز', 'س', 'ش', 'ص', 'ض', 'ط', 'ظ', 'ع', 'غ', 'ف', 'ق', 'ك', 'ل', 'م', 'ن', 'ه', 'و', 'ى', 'ي']
2018-12-12 01:22:59.545502: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3292290000 Hz
2018-12-12 01:22:59.545806: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x3df8f70 executing computations on platform Host. Devices:
2018-12-12 01:22:59.545832: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:415: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
    tf.py_function, which takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:257: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_initializable_iterator(dataset)`.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:1401: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:92: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:101: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:104: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Using CUDNN compatible LSTM backend on CPU
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/contrib/rnn/python/ops/rnn.py:233: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/tensorflow/python/ops/rnn.py:443: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py:181: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Using CUDNN compatible LSTM backend on CPU
Traceback (most recent call last):
  File "/home/home/anaconda3/envs/calamari/bin/calamari-train", line 11, in <module>
    load_entry_point('calamari-ocr==0.2.2', 'console_scripts', 'calamari-train')()
  File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/scripts/train.py", line 303, in main
    run(args)
  File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/scripts/train.py", line 295, in run
    progress_bar=not args.no_progress_bars
  File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/trainer.py", line 172, in train
    self._run_train(train_net, test_net, codec, train_start_time, progress_bar)
  File "/home/home/anaconda3/envs/calamari/lib/python3.7/site-packages/calamari_ocr-0.2.2-py3.7.egg/calamari_ocr/ocr/trainer.py", line 194, in _run_train
    validation_txts = validation_dataset.preloaded_texts
AttributeError: 'NoneType' object has no attribute 'preloaded_texts'
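
The last frame of the traceback shows that `validation_dataset` was `None` when `preloaded_texts` was read, presumably because the command above passed no validation data. A guard of the following kind avoids that class of error. It is a minimal sketch under that assumption, not the fix that was applied on the branch; the `Dataset` class and the `validation_texts` helper are hypothetical stand-ins.

```python
class Dataset:
    """Hypothetical stand-in for a preloaded dataset; not Calamari's class."""

    def __init__(self, texts):
        self.preloaded_texts = texts


def validation_texts(validation_dataset):
    """Return the validation texts, or an empty list when no validation set exists."""
    # Training without a validation set is a legitimate case, so None must be handled.
    if validation_dataset is None:
        return []
    return validation_dataset.preloaded_texts


print(validation_texts(None))
print(validation_texts(Dataset(["ا", "ب"])))
```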

ChWick commented on August 16, 2024

@mrocr Should be fixed. Do not forget to test the --train_data_on_the_fly parameter.

commented on August 16, 2024

Topic: Branch 41-on-the-fly-data-loading | Error: Argument list too long

Calamari-fly works on small samples of 100 images, but on a sample of more than 1,000,000 images the shell reports "Argument list too long", which is expected given the operating system's limit on the length of a command's argument list.

@ChWick Do you recommend a way to overcome this? Perhaps loading a file list generated with find?
find -type f -name '*.png' > list.txt

(calamari) home@home-lnx:~/Desktop/Untitled Folder$ calamari-train --train_data_on_the_fly --files ./training_data/*.png
bash: /home/home/anaconda3/envs/calamari/bin/calamari-train: Argument list too long

ChWick commented on August 16, 2024

@mrocr Just escape the asterisk and let Calamari resolve the files: calamari-train --train_data_on_the_fly --files ./training_data/\*.png. Otherwise your shell (e.g. zsh) expands the pattern before Calamari ever sees it.
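
Why escaping helps: an unescaped pattern is expanded by the shell into one argument per file, and with more than a million files the resulting argument list exceeds the kernel limit, so the program is never started. An escaped pattern reaches the program as a single string, which it can expand itself. The sketch below shows that idea with Python's standard `glob` module. It is an illustration only, not Calamari's actual file-resolution code; `resolve_files` and `demo` are hypothetical names.

```python
import glob
import os
import tempfile


def resolve_files(patterns):
    """Expand each glob pattern inside the process and return the sorted paths."""
    files = []
    for pattern in patterns:
        # glob.glob expands the pattern without building a shell argument list.
        files.extend(glob.glob(pattern))
    return sorted(files)


def demo():
    """Create a few files in a temporary directory and resolve only the PNGs."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            open(os.path.join(tmp, f"{i}.png"), "w").close()
        open(os.path.join(tmp, "notes.txt"), "w").close()
        found = resolve_files([os.path.join(tmp, "*.png")])
        return [os.path.basename(p) for p in found]


print(demo())
```

Quoting the pattern ('./training_data/*.png') has the same effect as the backslash: the shell passes it through unexpanded.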

commented on August 16, 2024

@ChWick You really did it, man. Thanks for the stream training and the codec whitelisting. You're amazing!
