yolo3 + densenet + ctc ocr
See the setup below.

- DenseNet model for 5990 characters:
  https://pan.baidu.com/s/1gm0Uq_sLe00En-IbUPiQUg (password: qcco)
  Put the model file in `project_root/chinese_ocr/models/densenet_base_model/1`.
- DenseNet model for 7476 characters:
  https://pan.baidu.com/s/1_eGdF9odvzziJn35wOzQlA (password: jve5)
  Put the model file in `project_root/chinese_ocr/models/densenet_base_model/2`.
- Other models:
  https://pan.baidu.com/s/10t5BYHm-YJXb9NpT7OnIOg (password: 8zbx)
  Put the model files in `project_root/chinese_ocr/models/`.
The provided models are suitable for learning only: each is the best checkpoint saved after many rounds of training on the generated dataset with the current code, and is not good enough for commercial use. You can train a better model yourself with this code; Bai Xiang's CRNN is also a good choice.
Run the demo:

```shell
python demo.py
```

You can also see `understand_detect`.
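For intuition, the recognizer's CTC output is decoded by taking the per-timestep argmax, collapsing repeated labels, and dropping blanks. A minimal best-path sketch (the blank index of 0 is an assumption, not necessarily this repo's convention):

```python
def ctc_greedy_decode(timestep_labels, blank=0):
    """Collapse repeated labels, then drop blanks (CTC best-path decoding)."""
    decoded = []
    prev = None
    for label in timestep_labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# e.g. the per-timestep argmax of the recognizer's softmax output
print(ctc_greedy_decode([0, 3, 3, 0, 3, 7, 7, 0]))  # [3, 3, 7]
```

The blank between the two 3s is what lets CTC emit the same character twice in a row.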
To train:

```shell
cd train
python train.py
```

Or you can use `train_with_param` to handle a different dataset.
```
dataset
├── images
│   └── xxx.jpg
├── data_train.txt
└── data_test.txt
```
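A label line is assumed here to pair an image filename with space-separated character ids (the exact format is an assumption; `tools/tmp_label_to_id_label.py` produces the real one). A parsing sketch:

```python
def parse_label_line(line):
    """Split 'xxx.jpg 12 7 431' into (filename, list of char ids)."""
    parts = line.strip().split()
    return parts[0], [int(x) for x in parts[1:]]

name, ids = parse_label_line("00001.jpg 12 7 431 5989\n")
```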
This dataset is generated by code.
Download: https://pan.baidu.com/s/1JgS1gSRcfnjWF_epU-E2vA (password: wigu)

The dataset contains 800,000 pictures:
- 300,000 from Chinese novels
- 100,000 of random digits 0-9
- 100,000 of random codes
- 300,000 of characters randomly selected by frequency
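The frequency-weighted portion can be reproduced with weighted sampling; the toy corpus below stands in for real character counts from the novels:

```python
import random
from collections import Counter

corpus = "的的的一一是了我我的"  # toy stand-in for character counts from the novels
freq = Counter(corpus)
chars, weights = zip(*freq.items())

random.seed(0)
sample = random.choices(chars, weights=weights, k=8)  # drawn ~ character frequency
```

Common characters such as 的 then dominate the sample, matching their real-text frequency.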
- Random character spacing
- Random font size
- 10 different fonts
- Blur
- Noise (Gaussian, uniform, salt-and-pepper, Poisson)
- ...
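The Gaussian and salt-and-pepper noise above can be sketched in pure Python on a single grayscale pixel row (the real generator presumably works on full images):

```python
import random

def add_gauss_noise(pixels, sigma=10.0):
    """Add zero-mean Gaussian noise, clamping to the 0-255 grayscale range."""
    return [min(255, max(0, int(p + random.gauss(0, sigma)))) for p in pixels]

def add_salt_pepper(pixels, amount=0.05):
    """Flip a random fraction of pixels to pure black or pure white."""
    out = list(pixels)
    for i in range(len(out)):
        if random.random() < amount:
            out[i] = 0 if random.random() < 0.5 else 255
    return out

row = [128] * 100                      # one mid-gray scanline
noisy = add_salt_pepper(add_gauss_noise(row), amount=0.1)
```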
For more detail, see `train_with_param`.
Or you can train on YCG09's dataset:
https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m)
Put your dataset into `train/images` and update the label files `data_train.txt` and `data_test.txt` accordingly.
Or you can generate your own dataset:
- Text detection: SynthText
- Text recognition: TextRecognitionDataGenerator, text_renderer (the one I used)

You can use `tools/tmp_label_to_id_label.py` to convert the label file format into the one needed here.
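Such a conversion presumably maps each character of a text label through the model's character table; a sketch with a toy two-character table (the file formats and table contents are assumptions):

```python
def text_label_to_id_label(line, char_to_id):
    """'xxx.jpg 你好' -> 'xxx.jpg 104 2051' via the model's character table."""
    name, text = line.strip().split(maxsplit=1)
    ids = [str(char_to_id[ch]) for ch in text]
    return name + " " + " ".join(ids)

char_to_id = {"你": 104, "好": 2051}  # toy table; the real one has 5990/7476 entries
converted = text_label_to_id_label("xxx.jpg 你好", char_to_id)
```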
- Use the pretrained model to detect text:
  - add demo √
  - add densenet training code √
  - test gpu nms √
  - generate my own dataset √
- Add a framework to train easily on your own dataset:
  - add yolo3 training code
  - make the code easy to use on other datasets
Reference:
- https://github.com/chineseocr/chineseocr
- https://github.com/YCG09/chinese_ocr