CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

The official code of CDistNet.

[Figure: CDistNet pipeline]

To Do List

  • HA-IC13 & CA-IC13
  • Pretrained model
  • Cleaned Code
  • Documentation
  • Distributed Training

Two New Datasets

We also evaluate other SOTA methods on the HA-IC13 and CA-IC13 datasets.

[Figure: HA_CA]

CDistNet shows a performance advantage over other SOTA methods as the character distance increases (levels 1-6).

HA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & pretrained model |
|---|---|---|---|---|---|---|---|
| VisionLAN (ICCV 2021) | 93.58 | 92.88 | 89.97 | 82.26 | 72.23 | 61.03 | Official Code |
| ABINet (CVPR 2021) | 95.92 | 95.22 | 91.95 | 85.76 | 73.75 | 64.99 | Official Code |
| RobustScanner* (ECCV 2020) | 96.15 | 95.33 | 93.23 | 88.91 | 81.10 | 71.53 | -- |
| Transformer-baseline* | 96.27 | 95.45 | 92.42 | 86.46 | 79.35 | 72.46 | -- |
| CDistNet | 96.62 | 96.15 | 94.28 | 89.96 | 83.43 | 77.71 | -- |

CA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & pretrained model |
|---|---|---|---|---|---|---|---|
| VisionLAN (ICCV 2021) | 94.87 | 92.77 | 84.01 | 75.03 | 64.29 | 52.74 | Official Code |
| ABINet (CVPR 2021) | 96.62 | 95.92 | 87.86 | 76.31 | 65.46 | 54.49 | Official Code |
| RobustScanner* (ECCV 2020) | 95.22 | 94.87 | 85.30 | 76.55 | 68.38 | 60.79 | -- |
| Transformer-baseline* | 95.68 | 94.40 | 85.88 | 75.85 | 65.93 | 58.58 | -- |
| CDistNet | 96.27 | 95.57 | 88.45 | 79.58 | 70.36 | 63.13 | -- |
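
To make the comparison easier to read, the short script below computes, for each character-distance level, the gap between CDistNet and the best competing method. It is only a sketch: the accuracy values are copied from the two tables above, while the names `RESULTS`, `MARGINS`, and `margins_over_best_other` are made up for this illustration and are not part of the CDistNet code.

```python
# Accuracy (%) at character-distance levels 1-6, copied from the tables above.
RESULTS = {
    "HA-IC13": {
        "VisionLAN": [93.58, 92.88, 89.97, 82.26, 72.23, 61.03],
        "ABINet": [95.92, 95.22, 91.95, 85.76, 73.75, 64.99],
        "RobustScanner": [96.15, 95.33, 93.23, 88.91, 81.10, 71.53],
        "Transformer-baseline": [96.27, 95.45, 92.42, 86.46, 79.35, 72.46],
        "CDistNet": [96.62, 96.15, 94.28, 89.96, 83.43, 77.71],
    },
    "CA-IC13": {
        "VisionLAN": [94.87, 92.77, 84.01, 75.03, 64.29, 52.74],
        "ABINet": [96.62, 95.92, 87.86, 76.31, 65.46, 54.49],
        "RobustScanner": [95.22, 94.87, 85.30, 76.55, 68.38, 60.79],
        "Transformer-baseline": [95.68, 94.40, 85.88, 75.85, 65.93, 58.58],
        "CDistNet": [96.27, 95.57, 88.45, 79.58, 70.36, 63.13],
    },
}


def margins_over_best_other(table, target="CDistNet"):
    """Return target accuracy minus the best competing accuracy per level."""
    others = [scores for name, scores in table.items() if name != target]
    # zip(*others) groups the competing scores level by level.
    best_other = [max(level_scores) for level_scores in zip(*others)]
    return [round(t - b, 2) for t, b in zip(table[target], best_other)]


# Hypothetical helper result: one list of six margins per dataset.
MARGINS = {name: margins_over_best_other(table) for name, table in RESULTS.items()}

if __name__ == "__main__":
    for name, margin in MARGINS.items():
        print(name, margin)
```

A positive margin means CDistNet is ahead of every other listed method at that level. On HA-IC13 the margin rises from 0.35 at level 1 to 5.25 at level 6. On CA-IC13 ABINet is slightly ahead at levels 1 and 2, and CDistNet leads from level 3 onward.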

Datasets

The datasets are the same as those used by ABINet.

Environment

The required packages are listed in env_cdistnet.yaml.

# Create and activate the environment, then install the dependencies into it
conda create -n CDistNet python=3.7
conda activate CDistNet
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=9.2 -c pytorch
pip install opencv-python mmcv notebook numpy einops tensorboardX Pillow thop timm tornado tqdm matplotlib lmdb
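
After installation, a quick check like the one below can confirm that the packages are importable from the active environment. This is a minimal sketch, not part of the repository: `REQUIRED_MODULES` and `missing_modules` are names invented here. The list uses import names, which differ from the pip names in two cases (opencv-python imports as `cv2`, Pillow as `PIL`).

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumed list, derived from
# the install commands; opencv-python -> cv2, Pillow -> PIL).
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "mmcv", "notebook", "numpy", "einops",
    "tensorboardX", "PIL", "thop", "timm", "tornado", "tqdm", "matplotlib",
    "lmdb",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```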

Pretrained Models

Download the pretrained models from BaiduNetdisk (password: d6jd) or GoogleDrive. (The training log and result.csv are included in the same download.) Place the pretrained model in models/reconstruct_CDistNetv3_3_10.
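
Before running evaluation, it can help to confirm that the downloaded files ended up in the expected directory. The sketch below only lists candidate checkpoint files. The directory comes from this README, but the file suffixes `.pth` and `.pt` are an assumption (the README does not name the checkpoint files), and `list_checkpoints` is a hypothetical helper.

```python
from pathlib import Path

# Directory named in this README for the pretrained model.
MODEL_DIR = Path("models/reconstruct_CDistNetv3_3_10")


def list_checkpoints(model_dir, suffixes=(".pth", ".pt")):
    """Return checkpoint-like files in model_dir, sorted by name.

    The default suffixes are an assumption; adjust them to match the
    files in the actual download.
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(
        p for p in model_dir.iterdir() if p.is_file() and p.suffix in suffixes
    )


if __name__ == "__main__":
    found = list_checkpoints(MODEL_DIR)
    if found:
        for path in found:
            print(path)
    else:
        print(f"No checkpoint files found in {MODEL_DIR}")
```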

The performance of the pretrained models is summarized as follows:

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CDistNet_config.py

Eval

CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
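
The two commands above differ only in the script name and the GPU list, so they can also be assembled from Python, for example when launching several runs from one script. This is a sketch under stated assumptions: `build_launch` is a hypothetical helper, and the only command-line option used is the `--config` flag shown above.

```python
import os
import sys


def build_launch(script, config, gpus):
    """Return (argv, env) for running `script` on the given GPU ids."""
    env = dict(os.environ)
    # Restrict the visible GPUs, as CUDA_VISIBLE_DEVICES=... does in the shell.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpus)
    argv = [sys.executable, script, f"--config={config}"]
    return argv, env


if __name__ == "__main__":
    # Print the commands instead of running them.
    for script, gpus in (("train.py", [0, 1, 2, 3]), ("eval.py", [0])):
        argv, env = build_launch(script, "configs/CDistNet_config.py", gpus)
        print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']}", *argv)
```

To actually start a run, pass the result to `subprocess.run(argv, env=env, check=True)` from the repository root.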

Contributors

  • simplify23
