lornatang / crnn-pytorch

PyTorch implementation of the paper `An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition`.

License: Apache License 2.0

Python 100.00%
ocr ocr-recognition pytorch crnn-ctc

crnn-pytorch's Introduction

Hi, my dear friend, I am committed to promoting the development of GAN.

crnn-pytorch's People

Contributors

lornatang

crnn-pytorch's Issues

CER calculation

Hi, thanks for the great work first and foremost!

I noticed that the accuracy calculation is not quite the character error rate (CER) needed. Is this just a temporary fix, and what does this expression mean: "".join(prediction_chars[0]) == target[0].lower()?
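To make the difference concrete, here is a minimal sketch contrasting the two metrics. The quoted expression appears to be a whole-word exact-match test (the prediction either equals the lowercased target or it does not), whereas CER counts character-level edits. The function names below are hypothetical and are not taken from the repository; lowercasing the target mirrors the quoted expression.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def exact_match_accuracy(predictions, targets):
    """Fraction of predictions identical to the lowercased target."""
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(p == t.lower() for p, t in zip(predictions, targets))
    return hits / len(targets)


def character_error_rate(predictions, targets):
    """Total edit distance divided by total number of target characters."""
    total_chars = sum(len(t) for t in targets)
    if total_chars == 0:
        raise ValueError("targets must contain at least one character")
    total_edits = sum(edit_distance(p, t.lower())
                      for p, t in zip(predictions, targets))
    return total_edits / total_chars
```

With predictions ["hello", "wcrld"] and targets ["Hello", "World"], exact-match accuracy is 0.5 while CER is 0.1, because the second word is wrong as a whole but only one of its five characters differs.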

Error when training on my own dataset

After switching to my own training set and changing the config file to train mode, running python train.py produces the following error:

    Load all datasets successfully.
    Build CRNN model successfully.
    Define all loss functions successfully.
    Define all optimizer functions successfully.
    Check whether the pretrained model is restored...

     * Acc: 0.00%
    Traceback (most recent call last):
      File "train.py", line 364, in <module>
        main()
      File "train.py", line 85, in main
        acc = validate(model, test_dataloader, epoch, writer, "Valid")
      File "train.py", line 291, in validate
        raise ValueError("Unsupported mode, please use valid or test.")
    ValueError: Unsupported mode, please use valid or test.
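
The traceback shows validate being called with "Valid", while the error message asks for "valid or test". One possible cause, which is only an assumption because the body of validate is not shown here, is a case-sensitive comparison of the mode string. A hypothetical helper (normalize_mode is not part of the repository) that would tolerate either spelling might look like this:

```python
def normalize_mode(mode: str) -> str:
    """Hypothetical helper: accept "Valid"/"valid"/"TEST" and return lowercase.

    Raises the same error text as the traceback for anything else.
    """
    normalized = mode.strip().lower()
    if normalized not in ("valid", "test"):
        raise ValueError("Unsupported mode, please use valid or test.")
    return normalized
```

If the check really is case-sensitive, passing the spelling that validate expects at the call site would be the smaller change; the helper is just one way to make the check forgiving.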

Questions on training hyperparameters

Hi Lorna,
I was hoping you could shed some light on the training hyperparameter choices. I have tried using the ICDAR2013 dataset, which is significantly smaller than MJSynth, with its train and test sets for training and validation respectively. However, the model seems to converge to outputting only blanks. I tried setting the learning rate to 1e-3 and adding a cosine-annealing schedule, but it made no difference. So my questions are (also for when I move on to MJSynth training later):

  1. From your experience, where could the problem lie? Is ICDAR2013 too small for this model without data augmentation?
  2. I'm not well versed in mixed-precision training, but could it be interfering with the model's convergence at the start, since the scaler's scale is somewhere around 1e4?
  3. What kind of preprocessing is used for characters that the model does not support, e.g. "-"? Do you omit samples containing such characters, or just drop those characters from the ground-truth labels?
  4. If you can share, what learning rate schedule did you use for your pretrained model?
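
The two options raised in question 3 can be sketched as follows. This is only an illustration under stated assumptions: the 36-symbol lowercase alphanumeric alphabet and both function names are hypothetical, and nothing here is confirmed to be what the repository does.

```python
import string

# Assumption: a case-insensitive alphabet of digits plus lowercase letters.
ALPHABET = string.digits + string.ascii_lowercase


def drop_unsupported_samples(samples, alphabet=ALPHABET):
    """Option A: discard every (path, label) pair whose label has any
    character outside the alphabet."""
    allowed = set(alphabet)
    return [(path, label.lower())
            for path, label in samples
            if label and set(label.lower()) <= allowed]


def strip_unsupported_chars(samples, alphabet=ALPHABET):
    """Option B: keep the sample but delete unsupported characters from the
    label; samples whose label becomes empty are discarded."""
    allowed = set(alphabet)
    cleaned = []
    for path, label in samples:
        kept = "".join(ch for ch in label.lower() if ch in allowed)
        if kept:
            cleaned.append((path, kept))
    return cleaned
```

Option A keeps image and label consistent at the cost of fewer samples, which matters on a small dataset. Option B keeps more data, but the image then shows a character that the label no longer mentions.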

Thanks for sharing your experience, it is much appreciated. Thanks in advance!

Regards,
Zheng
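
For reference, the cosine-annealing schedule mentioned in the post can be written in closed form (without restarts) as below. This is a sketch of the general technique only; the function name, parameter names and default values are hypothetical, and it is not the schedule used for the pretrained model, which is not stated here.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    """Learning rate at `step`, decaying from base_lr to min_lr along half a
    cosine period over `total_steps` steps (no warm restarts)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay on the curve's ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The rate starts at base_lr, passes through the midpoint between base_lr and min_lr halfway through training, and ends at min_lr.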
