jyouhou / icdar2019-art-recognition-alchemy

PKU Team Zero's code for participation in ICDAR2019 ArT Recognition track (Champion)

License: MIT License

Shell 0.05% Python 1.25% Roff 98.70%

icdar2019-art-recognition-alchemy's Issues

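Question about REC_SQUARE mode

Hi, I trained a model in REC_SQUARE mode. When I test the same image in REC_SQUARE mode, I found that in the TextSquare function in utils.py,
aspect_ratio_augment = random.uniform(0.7, 1.3)
is a random number, so the same image gives a different prediction on every run. How can I fix this?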
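
One possible way to make test-time predictions repeatable is sketched below. It assumes the random factor is only wanted as a training augmentation. The helper `sample_aspect_ratio` and its `training` flag are hypothetical names for illustration, not part of the repository's utils.py.

```python
import random


def sample_aspect_ratio(training, low=0.7, high=1.3, rng=random):
    """Hypothetical helper: random factor while training, fixed 1.0 otherwise."""
    if training:
        return rng.uniform(low, high)
    return 1.0  # no augmentation at test time, so results are deterministic


# Alternative: keep the augmentation but make it repeatable with a fixed seed.
seeded = random.Random(0)
first = sample_aspect_ratio(True, rng=seeded)
seeded.seed(0)
second = sample_aspect_ratio(True, rng=seeded)
print(first == second)
```

Disabling the augmentation at test time is usually the simpler choice. Seeding only helps if the order of random calls is identical between runs.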

Which model is the Unet used for feature extraction at the very beginning?

Can the network structure recognize Chinese?

Upload CurvedSynth to Google Drive?

Could you please also upload the CurvedSynth dataset to Google Drive? It is hard to download from Baidu.

Hi, I ran into a problem when I ran the demo:
size mismatch for rec_head.decoder.tgt_embedding.weight: copying a param with shape torch.Size([91, 256]) from checkpoint, the shape in current model is torch.Size([72, 256]).
Are there any details I am missing? Thanks.
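
The first dimension of an embedding weight is normally the number of symbols, so a 91 vs. 72 mismatch suggests (as an assumption, not something confirmed here) that the checkpoint was trained with a different character set than the one the current configuration builds. A minimal sketch for listing such mismatches follows. `find_shape_mismatches` is a hypothetical helper. With PyTorch the two dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters that differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Shapes taken from the error message above.
checkpoint_shapes = {"rec_head.decoder.tgt_embedding.weight": (91, 256)}
model_shapes = {"rec_head.decoder.tgt_embedding.weight": (72, 256)}
print(find_shape_mismatches(checkpoint_shapes, model_shapes))
```

If only vocabulary-sized layers show up in the result, checking which character set option the demo script passes to the model is a reasonable next step.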

Do you plan to provide the Dockerfile?

Did you fix the image size?

Did you fix the size of the images during training, or did you pad them to the width of the widest image?

It's a nice day to update the code.

When will you release the source code? I am really looking forward to it. Thanks!

4.img.tar.gz is not uploaded?

I could not find the 4.img.tar.gz file in the uploaded IMG folder.
Also, please check the size of 3.img.tar.gz: it is smaller than the others.

RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

I get the error below when I train on the Rectotal dataset. Could you please advise how to fix it? Thanks very much.

[2019-09-16 03:31:53]	Evaluation: [1091/1101]	Time 0.030 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:53]	Evaluation: [1092/1101]	Time 0.027 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:53]	Evaluation: [1093/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1094/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1095/1101]	Time 0.027 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1096/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1097/1101]	Time 0.025 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1098/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1099/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
[2019-09-16 03:31:54]	Evaluation: [1100/1101]	Time 0.026 (0.029)	Data 0.000 (0.000)	
Traceback (most recent call last):
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/examples/main.py", line 271, in <module>
    main(args)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/examples/main.py", line 244, in main
    dataset=test_dataset)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/evaluators.py", line 61, in evaluate
    output_dict = self._forward(input_dict)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/evaluators.py", line 189, in _forward
    output_dict = self.model(input_dict)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/RectificationBaseline.py", line 107, in forward
    [rectified_feat, rec_targets, rec_lengths])
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/attention_recognition_head.py", line 61, in sample
    encoder_feats = self.encoder(x)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/ICDAR2019-ArT-Recognition-Alchemy-master/Source/models/recognition_subnet.py", line 112, in forward
    cnn_feat = cnn_feat.transpose(2, 1)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
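
The message says that `cnn_feat` has only two dimensions when `transpose(2, 1)` is called, and the log stops on the last evaluation batch. One possible cause, stated purely as an assumption since recognition_subnet.py is not shown here, is an argument-free `squeeze()` that also removes the batch axis when the final batch holds a single image. The sketch below mimics that effect on shape tuples only. `squeeze_shape` and the example feature-map shape are hypothetical.

```python
def squeeze_shape(shape, dim=None):
    """Mimic how squeeze changes a tensor shape (tuples only, no tensors)."""
    if dim is None:
        # Without a dim argument, every size-1 axis is dropped.
        return tuple(s for s in shape if s != 1)
    if shape[dim] == 1:
        return shape[:dim] + shape[dim + 1:]
    return shape


# Assumed feature-map layout: (batch, channels, height=1, width).
print(squeeze_shape((32, 512, 1, 25)))         # batch axis survives
print(squeeze_shape((1, 512, 1, 25)))          # batch axis is lost, result is 2-D
print(squeeze_shape((1, 512, 1, 25), dim=2))   # only the height axis is removed
```

If this is the cause, squeezing only the height axis or dropping the incomplete last batch would avoid the 2-D tensor.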

Environment problems

When I try to create the conda environment, a warning and an error occur.
Could you please advise how to fix them? Thanks very much.

conda env create -f environment.yml

Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.

CondaValueError: prefix already exists: /home/geart-jzw/anaconda3

About squarization and voting mechanism

Hi, thanks for sharing this great work. I wonder if you can help me with the following questions:

About the effectiveness of squarization: I think it should be very useful for case "3" in the above figure. Have you counted how many images in the dataset are similar to case "3"?
For cases "2" and "4", I think you could rotate them based on their aspect ratios. They might then become cases "1" and "5", which seem better suited to the original preprocessing method, i.e. resizing input images to 64x256. Have you tried this and compared the results with squarization?
About the voting mechanism in ICDAR2019: what if the 4 models predict totally different words? For example, model_0 predicts "hello", model_1 predicts "yello", model_2 predicts "fello", and model_3 predicts "jello". How is the voting mechanism applied in this scenario?
Thanks!
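
To make the tie scenario in the last question concrete, here is a minimal sketch of majority voting with a fallback. It is not the authors' method. The tie-breaking rules (highest confidence if scores are given, otherwise the earliest model) and the name `vote` are assumptions chosen for illustration.

```python
from collections import Counter


def vote(predictions, confidences=None):
    """Pick the most frequent word; break ties by confidence or model order."""
    counts = Counter(predictions)
    top = max(counts.values())
    tied = {word for word, n in counts.items() if n == top}
    if confidences is None:
        # Fall back to the earliest model whose prediction is among the tied words.
        return next(word for word in predictions if word in tied)
    best_index = max(
        (i for i, word in enumerate(predictions) if word in tied),
        key=lambda i: confidences[i],
    )
    return predictions[best_index]


words = ["hello", "yello", "fello", "jello"]
print(vote(words))                         # all tied, first model wins
print(vote(words, [0.2, 0.9, 0.4, 0.1]))   # all tied, most confident model wins
```

When all models disagree, plain counting carries no information, so some ordering of the models (by validation accuracy or per-sample confidence) has to decide.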

The code, and training Arabic recognition

@Jyouhou @PkuDavidGuan Thank you for your hard work,

When do you expect to upload the code?
Can I train it to recognize non-Latin scripts such as Arabic, which reads from right to left?

speed

Would you mind sharing the speed?

Single image demonstration code

Could you provide single-image demonstration code for each of the three models?
I would like to test the models on my own images to see how they perform, but I got lost in the experiment scripts.

The code

When will the code be pushed? Nice work.

Some problems with the code

What is the function of real_multiplier? I read the code, but I cannot understand it.
Could anyone explain the code? Thanks very much.

dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
if dataset_idx == 0:
    sample_idx = idx
else:
    sample_idx = idx - self.cumulative_sizes[dataset_idx - 1]
if self.datasets[dataset_idx].real_world:
    return self.datasets[dataset_idx][sample_idx % len(self.datasets[dataset_idx])]
return self.datasets[dataset_idx][sample_idx]
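
A self-contained sketch of what this lookup appears to do is given below. It assumes, without confirmation from the snippet, that `real_multiplier` inflates the counted length of each real-world dataset when `cumulative_sizes` is built, so that the modulo wraps the inflated index back onto the real samples. `ToyDataset` and `RepeatingConcat` are hypothetical names.

```python
import bisect


class ToyDataset:
    def __init__(self, items, real_world):
        self.items = list(items)
        self.real_world = real_world

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        return self.items[i]


class RepeatingConcat:
    """Concatenate datasets, counting real-world ones real_multiplier times."""

    def __init__(self, datasets, real_multiplier):
        self.datasets = datasets
        self.cumulative_sizes = []
        total = 0
        for d in datasets:
            # Assumption: real-world data is oversampled via its counted length.
            total += len(d) * (real_multiplier if d.real_world else 1)
            self.cumulative_sizes.append(total)

    def __len__(self):
        return self.cumulative_sizes[-1]

    def __getitem__(self, idx):
        # Find which dataset the global index falls into.
        dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
        if dataset_idx == 0:
            sample_idx = idx
        else:
            sample_idx = idx - self.cumulative_sizes[dataset_idx - 1]
        dataset = self.datasets[dataset_idx]
        if dataset.real_world:
            # Wrap the inflated index back onto the real samples.
            return dataset[sample_idx % len(dataset)]
        return dataset[sample_idx]


synth = ToyDataset(["s0", "s1", "s2"], real_world=False)
real = ToyDataset(["r0", "r1"], real_world=True)
combined = RepeatingConcat([synth, real], real_multiplier=3)
print([combined[i] for i in range(len(combined))])
```

Under this reading, the multiplier makes each real-world sample appear several times per epoch, which balances a small real dataset against a much larger synthetic one.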