
table-ocr's Introduction

This project implements Chinese text detection and recognition in natural scenes, based on yolo3 and crnn.

Training code (master branch)

OCR training dataset

OCR CTC training dataset (archive password: chineseocr)
Baidu Netdisk: https://pan.baidu.com/s/1UcUKUUELLwdM29zfbztzdw (extraction code: atwn)
gofile: http://gofile.me/4Nlqh/uT32hAjbx (password: https://github.com/chineseocr/chineseocr)

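Features

  • Text orientation detection for 0, 90, 180, and 270 degrees (supports dnn/tensorflow)
  • Text detection with darknet, opencv dnn, or keras; training with darknet or keras
  • Variable-length OCR training (English, or mixed Chinese and English); crnn/dense OCR recognition and training; adds code for converting pytorch models to keras (tools/pytorch_to_keras.py)
  • Model conversion: darknet to keras, keras to darknet, and pytorch to keras
  • Structured data recognition for ID cards and train tickets
  • Adds a CNN+ctc model; OCR can be called through the DNN module, with an average time under 0.02 seconds per single-line image
  • CPU acceleration
  • OCR recognition based on a user dictionary
  • Adds a language model to correct OCR results
  • Real-time recognition on Raspberry Pi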
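
The CNN+ctc model in the list above emits one label per image column, and those labels have to be collapsed into text. A minimal sketch of standard greedy CTC decoding follows. The `charset` list, the blank index of 0, and the sample frame labels are assumptions for illustration only and are not taken from this project's code.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, charset, blank=0):
    """Greedy CTC decoding: merge consecutive repeats, then drop blanks."""
    # groupby yields one key per run of identical consecutive labels.
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(charset[label] for label in collapsed if label != blank)


# Hypothetical alphabet; index 0 is assumed to be the CTC blank.
charset = ["", "中", "文", "O", "C", "R"]
# Made-up per-frame argmax labels for one text line.
frames = [1, 1, 0, 2, 2, 0, 3, 0, 4, 4, 5]
decoded = ctc_greedy_decode(frames, charset)
print(decoded)
```

A blank between two identical labels keeps them as two separate characters, which is why the blank is removed only after repeats are merged.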

Environment setup

GPU deployment: see setup.md
CPU deployment: see setup-cpu.md

Download and compile darknet (skip the darknet build if you use opencv dnn or keras yolo3 directly)

git clone https://github.com/pjreddie/darknet.git 
mv darknet chineseocr/
## To build with GPU and cuDNN support, edit the Makefile
#GPU=1
#CUDNN=1
#OPENCV=0
#OPENMP=0
make 

Edit darknet/python/darknet.py, line 48
root = '/root/'  ## directory that contains chineseocr
lib = CDLL(root+"chineseocr/darknet/libdarknet.so", RTLD_GLOBAL)
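
If the path is wrong, `CDLL` fails with a fairly terse loader error. The sketch below builds the same path with `os.path.join` and checks that the file exists before loading it. The helper names `darknet_lib_path` and `load_darknet` are hypothetical and do not exist in darknet.py.

```python
import os
from ctypes import CDLL, RTLD_GLOBAL


def darknet_lib_path(root):
    """Build <root>/chineseocr/darknet/libdarknet.so (hypothetical helper)."""
    return os.path.join(root, "chineseocr", "darknet", "libdarknet.so")


def load_darknet(root):
    """Load libdarknet.so, failing early with a readable message if it is missing."""
    path = darknet_lib_path(root)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"libdarknet.so not found at {path}; "
            "run make in the darknet directory or correct root"
        )
    return CDLL(path, RTLD_GLOBAL)
```

Calling `load_darknet('/root/')` would then either return the loaded library or name the exact path that was checked.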

Download the model files

Model file location:

Model conversion (optional)

Convert pytorch OCR to keras OCR

python tools/pytorch_to_keras.py  -weights_path models/ocr-dense.pth -output_path models/ocr-dense-keras.h5

Convert darknet to keras

python tools/darknet_to_keras.py -cfg_path models/text.cfg -weights_path models/text.weights -output_path models/text.h5

Convert keras to darknet

python tools/keras_to_darknet.py -cfg_path models/text.cfg -weights_path models/text.h5 -output_path models/text.weights

Model selection

See config.py

Build the docker image

## Download the Anaconda3 Python installer (https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh) and place it in the chineseocr directory
## Build the image
docker build -t chineseocr .
## Start the service
docker run -d -p 8080:8080 chineseocr /root/anaconda3/bin/python app.py

Start the web service

cd chineseocr  ## enter the chineseocr directory
python app.py 8080  ## 8080 is the port number; any port can be used

Access the service

http://127.0.0.1:8080/ocr
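
Before opening the URL, it can help to confirm that something is listening on the port. The sketch below uses only the standard library and checks TCP reachability. It does not call the OCR endpoint, whose request format is not described here, and the function name `is_listening` is hypothetical.

```python
import socket


def is_listening(host="127.0.0.1", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `is_listening(port=8080)` returning False right after `python app.py 8080` suggests the server has not finished starting or is bound to a different port.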

Sample recognition results

References

  1. yolo3 https://github.com/pjreddie/darknet.git
  2. crnn https://github.com/meijieru/crnn.pytorch.git
  3. ctpn https://github.com/eragonruan/text-detection-ctpn
  4. CTPN https://github.com/tianzhi0549/CTPN
  5. keras yolo3 https://github.com/qqwweee/keras-yolo3.git
  6. darknet/keras model conversion reference: https://www.cnblogs.com/shouhuxianjian/p/10567201.html
  7. Language model implementation: https://github.com/lukhy/masr

table-ocr's People

Contributors

wenlihaoyu


table-ocr's Issues

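Converting to TF for testing

Hello. I extracted the network and weights from darknet. Is the output of the last layer the logit layer, that is, not yet passed through logistic? If so, at prediction time does 'table/convolutional33/BiasAdd' need logistic applied,
or an exp operation like the one in the function dnn_table_predict?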
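
The two options in the question are the same computation: the logistic function is defined through exp. The sketch below assumes the exported tensor holds raw logits, which this thread does not confirm, and uses made-up sample values.

```python
import math


def logistic(x):
    """Numerically stable logistic (sigmoid): 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form to avoid overflow in exp(-x).
    z = math.exp(x)
    return z / (1.0 + z)


logits = [-2.0, 0.0, 3.5]  # made-up raw outputs
probs = [logistic(v) for v in logits]
print(probs)
```

If the exported graph already ends in a logistic activation, applying this again would squash the values a second time, so it is worth checking the output range first: logits are unbounded, while probabilities lie between 0 and 1.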

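Training data question

Hi, I tried the provided table-line segmentation model and it works very well. What data was the model trained on, and could you share it? Many thanks!

Error when calling

On Ubuntu, running python3 table.py -jpgPath test/dd.jpg reports: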

Traceback (most recent call last):
  File "table.py", line 22, in <module>
    from darknet import load_net,predict_image,array_to_image
  File "/data/services/table-ocr/darknet.py", line 52, in <module>
    lib = CDLL(darkRoot, RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ../darknet/libdarknet.so: cannot open shared object file: No such file or directory

Could you take a look at what is going on?
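
The path in the error, ../darknet/libdarknet.so, is relative, and the loader resolves a relative path against the directory the command is run from, not against the location of darknet.py. The sketch below shows where that path points when resolved against a chosen base directory. The helper `resolve_lib_path` is hypothetical; `darkRoot` is the name that appears in the traceback.

```python
import os


def resolve_lib_path(dark_root, base_dir):
    """Resolve a possibly relative library path against base_dir (hypothetical helper)."""
    if os.path.isabs(dark_root):
        return os.path.normpath(dark_root)
    return os.path.normpath(os.path.join(base_dir, dark_root))


# Paths taken from the traceback: darknet.py lives in /data/services/table-ocr.
resolved = resolve_lib_path("../darknet/libdarknet.so", "/data/services/table-ocr")
print(resolved, os.path.isfile(resolved))
```

On a POSIX system this resolves to /data/services/darknet/libdarknet.so, so the library has to be compiled in a darknet directory next to table-ocr, or darkRoot has to be set to an absolute path.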

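Mask images made with labelme are badly distorted

The original image is 1600*2400. The mask has to be resized for training, but when it is displayed after resizing it is badly distorted. Has the author run into this problem?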
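
The thread does not say what causes the distortion. One possibility, stated here as an assumption, is that thin table lines are dropped or blurred when the mask is shrunk. The sketch below compares naive subsampling with a block-maximum downsample that keeps any line touching a block. It is an illustration in pure Python, not the method used to train the project's model.

```python
def downsample_mask_max(mask, factor):
    """Shrink a binary mask by an integer factor; output is 1 if any pixel in the block is 1."""
    height, width = len(mask), len(mask[0])
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = (
                mask[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            )
            row.append(1 if any(block) else 0)
        out.append(row)
    return out


# A 4x8 mask with one-pixel lines: horizontal on row 1, vertical on column 6.
mask = [
    [0, 0, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
]
small = downsample_mask_max(mask, 2)
naive = [row[::2] for row in mask[::2]]  # keep every second pixel
print(small)
print(naive)
```

In this example the naive version loses the horizontal line entirely because it falls on a skipped row, while the block-maximum version keeps both lines.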

About labels

When labeling images with labelme, should each table line be drawn as a straight line or as a thin box?

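Table cells are missing when finding lines

Ubuntu 16.04
Python 3.7
No GPU; darknet is compiled and darkRoot is set in config
When finding lines, cells in the last column are missing. Should I change the parameters in this call: rboxes,ColsLines,RowsLines = get_table_ceilboxes(img,prob=0.5,row=10,col=10,alph=10)
Or is it related to the trained model?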

A question about running python3 table.py -jpgPath test/dd.jpg

batch: Using default '1'
learning_rate: Using default '0.001000'
momentum: Using default '0.900000'
subdivisions: Using default '1'
policy: Using default 'constant'
max_batches: Using default '0'
layer filters size input output
0 conv 16 3 x 3 / 1 512 x 512 x 3 -> 512 x 512 x 16 0.226 BFLOPs
1 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
2 max 2 x 2 / 2 512 x 512 x 16 -> 256 x 256 x 16
3 conv 32 3 x 3 / 1 256 x 256 x 16 -> 256 x 256 x 32 0.604 BFLOPs
4 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
5 max 2 x 2 / 2 256 x 256 x 32 -> 128 x 128 x 32
6 conv 64 3 x 3 / 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BFLOPs
7 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
8 max 2 x 2 / 2 128 x 128 x 64 -> 64 x 64 x 64
9 conv 128 3 x 3 / 1 64 x 64 x 64 -> 64 x 64 x 128 0.604 BFLOPs
10 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
11 max 2 x 2 / 2 64 x 64 x 128 -> 32 x 32 x 128
12 conv 256 3 x 3 / 1 32 x 32 x 128 -> 32 x 32 x 256 0.604 BFLOPs
13 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
14 max 2 x 2 / 2 32 x 32 x 256 -> 16 x 16 x 256
15 conv 512 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 512 0.604 BFLOPs
16 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
17 max 2 x 2 / 2 16 x 16 x 512 -> 8 x 8 x 512
18 conv 1024 3 x 3 / 1 8 x 8 x 512 -> 8 x 8 x1024 0.604 BFLOPs
19 conv 1024 3 x 3 / 1 8 x 8 x1024 -> 8 x 8 x1024 1.208 BFLOPs
20 upsample 2x 8 x 8 x1024 -> 16 x 16 x1024
21 route 16 20
22 conv 512 3 x 3 / 1 16 x 16 x1536 -> 16 x 16 x 512 3.624 BFLOPs
23 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
24 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
25 upsample 2x 16 x 16 x 512 -> 32 x 32 x 512
26 route 13 25
27 conv 256 3 x 3 / 1 32 x 32 x 768 -> 32 x 32 x 256 3.624 BFLOPs
28 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
29 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
30 upsample 2x 32 x 32 x 256 -> 64 x 64 x 256
31 route 10 30
32 conv 128 3 x 3 / 1 64 x 64 x 384 -> 64 x 64 x 128 3.624 BFLOPs
33 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
34 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
35 upsample 2x 64 x 64 x 128 -> 128 x 128 x 128
36 route 7 35
37 conv 64 3 x 3 / 1 128 x 128 x 192 -> 128 x 128 x 64 3.624 BFLOPs
38 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
39 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
40 upsample 2x 128 x 128 x 64 -> 256 x 256 x 64
41 route 4 40
42 conv 32 3 x 3 / 1 256 x 256 x 96 -> 256 x 256 x 32 3.624 BFLOPs
43 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
44 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
45 upsample 2x 256 x 256 x 32 -> 512 x 512 x 32
46 route 1 45
47 conv 16 3 x 3 / 1 512 x 512 x 48 -> 512 x 512 x 16 3.624 BFLOPs
48 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
49 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
50 conv 2 1 x 1 / 1 512 x 512 x 16 -> 512 x 512 x 2 0.017 BFLOPs
Loading weights from models/table.weights...Done!
It stops here. What is the problem?
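
The log shows the network being parsed and the weights being loaded, with no error message. Its numbers can be sanity-checked: counting each multiply-add as two operations reproduces the BFLOPs column, and each route layer concatenates channels (layers 16 and 20 give 512 + 1024 = 1536). The function name `conv_bflops` is hypothetical; the formula is the one that matches the values printed above.

```python
def conv_bflops(filters, size, in_channels, out_h, out_w):
    """Billions of operations for one convolution, counting a multiply-add as 2 ops."""
    return 2.0 * filters * size * size * in_channels * out_h * out_w / 1e9


# Layer 0: 16 filters, 3x3 kernel, 3 input channels, 512x512 output.
layer0 = round(conv_bflops(16, 3, 3, 512, 512), 3)
# Layer 22: 512 filters, 3x3 kernel, 1536 input channels, 16x16 output.
layer22 = round(conv_bflops(512, 3, 1536, 16, 16), 3)
# Layer 50: 2 filters, 1x1 kernel, 16 input channels, 512x512 output.
layer50 = round(conv_bflops(2, 1, 16, 512, 512), 3)
print(layer0, layer22, layer50)
```

The final layer has 2 filters at 512 x 512, which is consistent with one output map each for the two line directions mentioned in the other threads, although the log itself does not label them.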

Compilation problem

After copying the Makefile from table-ocr and running make, the following error appears:
make: *** No rule to make target 'obj', needed by 'all'. Stop.

Runtime error, please help!

Traceback (most recent call last):
  File "table.py", line 209, in <module>
    rboxes,ColsLines,RowsLines = get_table_ceilboxes(img,prob=0.5,row=10,col=10,alph=10)
  File "table.py", line 173, in get_table_ceilboxes
    RowsLines,ColsLines=get_table_rowcols(img,prob,row,col)
  File "table.py", line 94, in get_table_rowcols
    xmin,xmax = indX.min(),indX.max()
  File "/home/ubuntu/anaconda3/envs/ai/lib/python3.6/site-packages/numpy/core/_methods.py", line 32, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial)
ValueError: zero-size array to reduction operation minimum which has no identity
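
The error means that `indX` was empty when `.min()` was called. A likely reason, stated as an assumption, is that no pixel passed the `prob` threshold for this image. The sketch below shows a guard that returns None for an empty index sequence. The helper `index_extent` is hypothetical and not part of table.py.

```python
def index_extent(indices):
    """Return (min, max) of a sequence of indices, or None when it is empty."""
    if len(indices) == 0:
        return None
    return min(indices), max(indices)


extent = index_extent([])  # stands in for an empty indX
if extent is None:
    print("no table lines detected; skipping this image")
else:
    xmin, xmax = extent
```

The same check works on a one-dimensional NumPy array, since `len`, `min`, and `max` accept it. Lowering `prob` or checking that the input image actually contains ruled table lines are other things to try.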

OSError: [WinError 126] The specified module could not be found.

Could someone explain what this error means?
Traceback (most recent call last):
  File "table.py", line 22, in <module>
    from darknet import load_net,predict_image,array_to_image
  File "D:\table-ocr-master\table-ocr-master\darknet.py", line 52, in <module>
    lib = CDLL(darkRoot, RTLD_GLOBAL)
  File "D:\python3\lib\ctypes\__init__.py", line 356, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

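About the model training method

Hello, I am working on similar research and came across your project, but I could not find how the model was trained. Would you be willing to share the training method? Many thanks. My email is [email protected]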

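Training the unet segmentation network

Could you share where the training code for the Unet network is located? I tried your segmentation model for horizontal and vertical table lines, but I would like to retrain it to segment three-line tables. Many thanks. My email is [email protected]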
