
icdar2019-art-recognition-alchemy's People

Contributors

jyouhou, pkudavidguan

icdar2019-art-recognition-alchemy's Issues

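Question about REC_SQUARE mode

Hi, I trained the model in REC_SQUARE mode. When I test the same image in REC_SQUARE mode, I noticed that in the TextSquare function in utils.py,
aspect_ratio_augment = random.uniform(0.7, 1.3)
is a random number, so the prediction for the same image differs on every run. How can I fix this?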
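
One way to get repeatable predictions, sketched under assumptions: apply the random aspect-ratio jitter only during training and use a fixed ratio of 1.0 at test time. The helper name `sample_aspect_ratio` and its `training` flag are hypothetical and not part of the repository's `TextSquare`; only the `random.uniform(0.7, 1.3)` call is taken from the question.

```python
import random


def sample_aspect_ratio(training, rng=random):
    """Return the aspect-ratio factor to use when squarizing an image.

    Hypothetical helper: jitter only during training, and use a fixed
    ratio (no extra distortion) at test time so results are repeatable.
    """
    if training:
        return rng.uniform(0.7, 1.3)
    return 1.0


# Test-time calls always return the same value.
print(sample_aspect_ratio(training=False))

# Alternative: keep the jitter but make it reproducible with a seeded generator.
seeded = random.Random(0)
print(sample_aspect_ratio(training=True, rng=seeded))
```

Whether a fixed ratio or a seeded jitter gives better accuracy is not something this sketch can tell; it only removes the run-to-run variation.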

Is the checkpoint right?

Hi, I ran into a problem when running the demo:
size mismatch for rec_head.decoder.tgt_embedding.weight: copying a param with shape torch.Size([91, 256]) from checkpoint, the shape in current model is torch.Size([72, 256]).
Are there any details I am missing? Thanks.
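
The first dimension of an embedding weight is the number of symbols, so 91 versus 72 suggests (an assumption, not confirmed by the source) that the checkpoint was trained with a different character set than the one the demo configures. A minimal sketch for listing every parameter whose shape differs is shown below. It works on plain dictionaries mapping parameter names to shape tuples; `find_shape_mismatches` is a hypothetical helper, not part of the repository.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing shapes."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Only report parameters present on both sides with unequal shapes.
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Shapes copied from the error message above.
checkpoint_shapes = {"rec_head.decoder.tgt_embedding.weight": (91, 256)}
model_shapes = {"rec_head.decoder.tgt_embedding.weight": (72, 256)}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

With PyTorch, the two dictionaries could be built from state dicts with `{k: tuple(v.shape) for k, v in state_dict.items()}`.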

Is 4.img.tar.gz not uploaded?

I could not find the 4.img.tar.gz file in the uploaded IMG folder.
Also, please check 3.img.tar.gz: it is smaller than the other archives.

RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

I get the error below when I train on the Rectotal dataset. Could you please advise how to fix it? Thanks very much.

[2019-09-16 03:31:53]	Evaluation: [1091/1101]	Time 0.030 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:53]	Evaluation: [1092/1101]	Time 0.027 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:53]	Evaluation: [1093/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1094/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1095/1101]	Time 0.027 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1096/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1097/1101]	Time 0.025 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1098/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1099/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1100/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
Traceback (most recent call last):
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/examples/main.py", line 271, in <module>
    main(args)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/examples/main.py", line 244, in main
    dataset=test_dataset)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/evaluators.py", line 61, in evaluate
    output_dict = self._forward(input_dict)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/evaluators.py", line 189, in _forward
    output_dict = self.model(input_dict)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/RectificationBaseline.py", line 107, in forward
    [rectified_feat, rec_targets, rec_lengths])
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/attention_recognition_head.py", line 61, in sample
    encoder_feats = self.encoder(x)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/recognition_subnet.py", line 112, in forward
    cnn_feat = cnn_feat.transpose(2, 1)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
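
The message means `cnn_feat` has only two dimensions when `transpose(2, 1)` runs. Since the failure appears right after batch 1100 of 1101, one possible cause (an assumption; the source does not confirm it) is a final batch of size 1 combined with a `squeeze()` call without a `dim` argument, which would drop the batch dimension as well. The sketch below mimics the squeeze shape rule in plain Python; the helper `squeeze_shape` and the shape values are made up for illustration.

```python
def squeeze_shape(shape, dim=None):
    """Mimic the shape rule of a tensor squeeze on a plain tuple.

    With dim=None every size-1 dimension is removed; with a dim given,
    only that dimension is removed, and only if its size is 1.
    """
    if dim is None:
        return tuple(s for s in shape if s != 1)
    dim = dim % len(shape)
    if shape[dim] != 1:
        return tuple(shape)
    return tuple(s for i, s in enumerate(shape) if i != dim)


# Hypothetical feature shapes: (batch, channels, height, width).
full_batch = (32, 512, 1, 25)
last_batch = (1, 512, 1, 25)

print(squeeze_shape(full_batch))         # (32, 512, 25): three dimensions
print(squeeze_shape(last_batch))         # (512, 25): only two dimensions
print(squeeze_shape(last_batch, dim=2))  # (1, 512, 25): batch dimension kept
```

If this is the cause, squeezing only the intended dimension, or dropping the incomplete last batch, would avoid the error.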

Environment problems

When I try to create the conda environment, a warning and an error occur.
Could you please advise how to fix them? Thanks very much.

conda env create -f environment.yml

Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.

CondaValueError: prefix already exists: /home/geart-jzw/anaconda3

About squarization and voting mechanism

Hi, thanks for sharing this great work. I wonder if you can help me with the following questions:
[Image: figure5]

  1. On the effectiveness of squarization: I think it should be very useful for case "3" in the figure above. Have you counted how many images in the dataset resemble case "3"?
  2. For cases "2" and "4", I think you could rotate them based on their aspect ratios. They might then become cases "1" and "5", which seem better suited to the original preprocessing method, i.e. resizing input images to 64x256. Have you tried this and compared the results with squarization?
  3. On the voting mechanism in ICDAR2019: what if the 4 models predict completely different words? For example, model_0 predicts "hello", model_1 predicts "yello", model_2 predicts "fello", and model_3 predicts "jello". How is the voting mechanism applied in this scenario?
    Thanks!
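
To make the third question concrete, here is a minimal sketch of word-level majority voting with one possible tie-break: when no word wins outright, take the prediction from the model that reported the highest confidence. This is not the authors' method, which the source does not describe; the function `vote` and the confidence values are assumptions for illustration.

```python
from collections import Counter


def vote(predictions, confidences):
    """Pick the most common prediction; break ties by highest confidence."""
    counts = Counter(predictions)
    top_count = max(counts.values())
    tied = {word for word, count in counts.items() if count == top_count}
    if len(tied) == 1:
        return tied.pop()
    # Tie (including the all-different case): take the tied word whose
    # model reported the highest confidence.
    best_index = max(
        (i for i, word in enumerate(predictions) if word in tied),
        key=lambda i: confidences[i],
    )
    return predictions[best_index]


words = ["hello", "yello", "fello", "jello"]
scores = [0.61, 0.92, 0.40, 0.75]  # made-up confidences

print(vote(words, scores))                                 # yello
print(vote(["hello", "hello", "yello", "jello"], scores))  # hello
```

In the all-different case the result depends entirely on the tie-break, so the choice of fallback matters more than the vote itself.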

Speed

Would you mind sharing the speed?

Single image demonstration code

Could you provide single-image demonstration code for each of the three models?
I would like to test the models on my own images to see how they perform, but I got lost in the experiment scripts.

The code

When will the code be pushed? Nice work.

Some questions about the code

What is the function of real_multiplier? I read the code, but I cannot understand it.
Could anyone explain the code? Thanks very much.

dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
if dataset_idx == 0:
    sample_idx = idx
else:
    sample_idx = idx - self.cumulative_sizes[dataset_idx - 1]
if self.datasets[dataset_idx].real_world:
    return self.datasets[dataset_idx][sample_idx % len(self.datasets[dataset_idx])]
return self.datasets[dataset_idx][sample_idx]
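
A plausible reading of the snippet, offered as an assumption and not as the authors' explanation: datasets flagged `real_world` are counted several times in `cumulative_sizes`, so an index can exceed the dataset's true length, and the modulo wraps it back. The effect would be to oversample real data relative to synthetic data. The sketch below reproduces this with plain lists; the class `MultipliedConcat`, the `(samples, real_world)` pairs, and the way `real_multiplier` enters the sizes are hypothetical.

```python
import bisect


class MultipliedConcat:
    """Concatenate datasets, virtually repeating the real-world ones."""

    def __init__(self, datasets, real_multiplier):
        # datasets: list of (samples, real_world) pairs
        self.datasets = datasets
        self.cumulative_sizes = []
        total = 0
        for samples, real_world in datasets:
            # A real-world dataset counts real_multiplier times.
            total += len(samples) * (real_multiplier if real_world else 1)
            self.cumulative_sizes.append(total)

    def __len__(self):
        return self.cumulative_sizes[-1]

    def __getitem__(self, idx):
        # Find which dataset the global index falls into.
        dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
        if dataset_idx == 0:
            sample_idx = idx
        else:
            sample_idx = idx - self.cumulative_sizes[dataset_idx - 1]
        samples, real_world = self.datasets[dataset_idx]
        if real_world:
            # Wrap around so repeated indices map back to real samples.
            return samples[sample_idx % len(samples)]
        return samples[sample_idx]


synthetic = ["s0", "s1", "s2"]
real = ["r0", "r1"]
data = MultipliedConcat([(synthetic, False), (real, True)], real_multiplier=3)

print(len(data))
print([data[i] for i in range(len(data))])
```

Here the two real samples are each visited three times per pass while every synthetic sample is visited once.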
