simplify23 / cdistnet

Official PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition (IJCV)

License: Apache License 2.0

Jupyter Notebook 53.39% Python 46.61%

cdistnet's Issues

Can it work with phone numbers?

I trained the model on billboard images, but at inference time it does not work well on digit sequences such as phone numbers. Can you help me? Thanks very much.

Inference Time

I tried your network and got good results, but inference is slow. Could you please let me know how I can increase the recognition speed?

Accuracy is lower than other models

I tried to train your model, but the accuracy I got is lower than that of other transformer models. Could you please let me know how I can get higher accuracy?

Any way to train only on language and not images?

I have a relatively small dataset in a different format (license plates), and the model often gets the license plate format wrong.

I was wondering if there was a way to train the model on just a bunch of text string data without feeding any images at all in order to enforce the format.

Please let me know if it is possible to train the language/semantic model independently, by just feeding string text data of words, without corresponding images.

CDistNetv2

@simplify23 are you planning to release the CDistNetv2 code?
I am waiting for a lightweight and faster module.

Train for other languages

Hello, thanks for your paper and the released code.
I want to train your code on another language, but I see in lmdbdataset that you keep only English characters and limit the max length to 30. Is that right?
Should I change lines 245 and 246?

`def __len__(self):
    return self.length

def get(self, idx):
    with self.env.begin(write=False) as txn:
        image_key, label_key = f'image-{idx+1:09d}', f'label-{idx+1:09d}'
        label = str(txn.get(label_key.encode()), 'utf-8')  # label
        label = re.sub('[^0-9a-zA-Z]+', '', label)
        label = label[:30]`
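
A minimal sketch of how the label filtering in the snippet above could be parameterized for another language. The names `clean_label`, `charset`, and `max_len` are hypothetical and not part of the repository; the quoted code hard-codes the pattern `[^0-9a-zA-Z]+` and the limit of 30. The model's character dictionary would presumably need to cover the new character set as well, which this sketch does not address.

```python
import re


def clean_label(label: str, charset: str, max_len: int = 30) -> str:
    """Keep only characters from `charset` and truncate to `max_len`.

    Hypothetical helper: mirrors the re.sub(...) and label[:30] lines above,
    but takes the allowed characters and the length limit as arguments.
    """
    # re.escape keeps characters such as '-' or ']' literal inside the class.
    pattern = f"[^{re.escape(charset)}]+"
    return re.sub(pattern, "", label)[:max_len]


# Same behavior as the original filter: digits and English letters only.
ENGLISH = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(clean_label("Hello, World! 123", ENGLISH))

# Example with a non-English character set (Russian lowercase plus digits).
RUSSIAN = "0123456789абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
print(clean_label("привет, мир 42", RUSSIAN, max_len=25))
```

Passing the character set explicitly keeps the English default reproducible while making the language-specific part a single argument.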

Attention Maps

Could you please release the code used to generate the attention maps published in the paper?

Open source license?

Are you willing to specify an open-source license such as the MIT License?
The GitHub repository has no license specified.

About inference

How do I set the input_char parameter, for example to predict on a new image?

Questions on Conv2d of Transformer Layers

Hello, while comparing nn.Transformer with the implementation in this code, I noticed that most of the nn.Linear() operations are replaced with nn.Conv2d(kernel_size=(1,1)) operations.

Is there a benefit to this replacement?
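
For context, a 1x1 convolution applies the same weight matrix to the channel vector at every spatial position, so it computes the same function as a linear layer applied per position; the practical difference is mainly the tensor layout, since channels-first feature maps can be used without reshaping. The pure-Python sketch below illustrates that equivalence on a tiny feature map. The helpers `linear` and `conv1x1` are illustrative stand-ins, not PyTorch or repository code, and the sketch does not claim to explain why the authors made this choice.

```python
def linear(x, weight, bias):
    """y[o] = sum_i weight[o][i] * x[i] + bias[o] for one channel vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def conv1x1(fmap, weight, bias):
    """1x1 convolution over a channels-first map fmap[c][h][w]."""
    c_in, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(len(weight))]
    for o, (row, b) in enumerate(zip(weight, bias)):
        for h in range(height):
            for w in range(width):
                out[o][h][w] = sum(row[c] * fmap[c][h][w] for c in range(c_in)) + b
    return out


# 2 input channels, 2x2 spatial grid, 3 output channels.
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.5, 1.5], [2.5, 3.5]]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
bias = [0.0, 1.0, 0.5]

conv_out = conv1x1(fmap, weight, bias)

# Apply the linear layer to the channel vector at each position and compare.
for h in range(2):
    for w in range(2):
        pixel = [fmap[c][h][w] for c in range(2)]
        assert linear(pixel, weight, bias) == [conv_out[o][h][w] for o in range(3)]
print("1x1 convolution matches the per-position linear layer")
```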

Missing transformer

When trying to run test.py I get the following error:

(CDistNet) C:\<path>\CDistNet>python test.py --i_path ..\examples\300_0.jpg 
configs/CDistNet_config.py
<class 'str'>
Traceback (most recent call last):
  File "test.py", line 175, in <module>
    main()
  File "test.py", line 168, in main
    test_one(cfg, args)
  File "test.py", line 126, in test_one
    en = get_parameter_number(model.transformer.encoder)
  File "C:\<path>\miniconda3\envs\CDistNet\lib\site-packages\torch\nn\modules\module.py", line 1178, in __getattr__ 
    type(self).__name__, name))
AttributeError: 'CDistNet' object has no attribute 'transformer'
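
The traceback shows `test.py` reading `model.transformer.encoder`, which the `CDistNet` object does not expose. One way to keep a parameter-count report from aborting the run is to resolve the attribute path defensively. This is a hedged sketch, not a fix from the repository: `resolve_attr` and the stand-in classes are hypothetical, and the real submodule names must be checked against the model definition (for a PyTorch module, `model.named_children()` lists them).

```python
from functools import reduce


def resolve_attr(obj, dotted_path, default=None):
    """Follow a dotted attribute path, returning `default` if any part is missing."""
    missing = object()  # sentinel, so attributes whose value is None still resolve

    def step(current, name):
        if current is missing:
            return missing
        return getattr(current, name, missing)

    result = reduce(step, dotted_path.split("."), obj)
    return default if result is missing else result


# Stand-in objects so the sketch runs without PyTorch or the repository.
class Encoder:
    pass


class ModelWithoutTransformer:
    """Hypothetical model that, like the one in the traceback, has no `transformer`."""

    def __init__(self):
        self.encoder = Encoder()


model = ModelWithoutTransformer()

# The path used in test.py is absent here, so None comes back instead of an error.
print(resolve_attr(model, "transformer.encoder"))
# A path that does exist on the stand-in resolves normally.
print(type(resolve_attr(model, "encoder")).__name__)
```

With a helper like this, the caller can skip the count when the result is `None` rather than crash.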
