chineseocr / table-ocr

Makefile 14.53% Python 85.47%

table-ocr's Introduction

This project uses yolo3 and crnn to detect and recognize Chinese text in natural scenes.

Optimized darknet version: https://github.com/chineseocr/darknet-ocr.git

Training code (master branch)

OCR training dataset

OCR CTC training dataset (archive extraction password: chineseocr)
Baidu Netdisk: link: https://pan.baidu.com/s/1UcUKUUELLwdM29zfbztzdw extraction code: atwn
gofile: http://gofile.me/4Nlqh/uT32hAjbx password https://github.com/chineseocr/chineseocr

Features

Environment setup

GPU deployment: see setup.md
CPU deployment: see setup-cpu.md

Download and build darknet (this step can be skipped if you use opencv dnn or keras yolo3 directly)

git clone https://github.com/pjreddie/darknet.git
mv darknet chineseocr/
## To build with GPU and cuDNN support, edit the Makefile
#GPU=1
#CUDNN=1
#OPENCV=0
#OPENMP=0
make

Edit darknet/python/darknet.py, line 48
root = '/root/'  ## directory that contains chineseocr
lib = CDLL(root+"chineseocr/darknet/libdarknet.so", RTLD_GLOBAL)

Download the model files

Model file locations:

Baidu Netdisk: https://pan.baidu.com/s/1gTW9gwJR6hlwTuyB6nCkzQ
other-links: http://gofile.me/4Nlqh/fNHlWzVWo
Copy all files in the folder into the models directory

Model conversion (optional)

Convert pytorch ocr to keras ocr

python tools/pytorch_to_keras.py  -weights_path models/ocr-dense.pth -output_path models/ocr-dense-keras.h5

Convert darknet to keras

python tools/darknet_to_keras.py -cfg_path models/text.cfg -weights_path models/text.weights -output_path models/text.h5

Convert keras to darknet

python tools/keras_to_darknet.py -cfg_path models/text.cfg -weights_path models/text.h5 -output_path models/text.weights

Model selection

See config.py

Build the docker image

## Download the Anaconda3 Python installer (https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh) and place it in the chineseocr directory
## Build the image
docker build -t chineseocr .
## Start the service
docker run -d -p 8080:8080 chineseocr /root/anaconda3/bin/python app.py

Start the web service

cd chineseocr  ## enter the chineseocr directory
python app.py 8080  ## 8080 is the port number; any port can be used

Access the service

http://127.0.0.1:8080/ocr
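
Before opening the page in a browser, it can help to confirm that the service is answering at all. The sketch below only issues a plain GET against the URL given above; the helper names `service_url` and `is_reachable` are made up for this example, and nothing here assumes any particular request format for the OCR endpoint itself.

```python
from urllib.request import urlopen
from urllib.error import URLError


def service_url(host="127.0.0.1", port=8080, path="/ocr"):
    """Build the URL of the web service started with `python app.py <port>`."""
    return f"http://{host}:{port}{path}"


def is_reachable(url, timeout=3.0):
    """Return True if a plain GET succeeds, False on any connection or HTTP error."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, OSError):
        return False
```

Calling `is_reachable(service_url(port=8080))` right after starting app.py gives a quick yes/no answer without involving a browser.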

Recognition results

References

yolo3 https://github.com/pjreddie/darknet.git
crnn https://github.com/meijieru/crnn.pytorch.git
ctpn https://github.com/eragonruan/text-detection-ctpn
CTPN https://github.com/tianzhi0549/CTPN
keras yolo3 https://github.com/qqwweee/keras-yolo3.git
darknet/keras model conversion reference: https://www.cnblogs.com/shouhuxianjian/p/10567201.html
Language model implementation https://github.com/lukhy/masr

table-ocr's Issues

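Converting to TF for testing

Hello, I extracted the network and weights from darknet. Is the output of the last layer the logit layer, i.e. the values before the logistic is applied? If so, at prediction time,
does 'table/convolutional33/BiasAdd' need a logistic applied, or something like the exp operation in the function dnn_table_predict?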
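
Whether the exported node already includes the activation depends on how the final layer is declared in the cfg, which the issue does not show. If 'table/convolutional33/BiasAdd' really is a raw pre-activation output, a per-pixel logistic could be applied as in the sketch below. It only illustrates the arithmetic; whether dnn_table_predict does the same is not confirmed here. Note that 1/(1+exp(-x)) and exp(x)/(1+exp(x)) are the same function, so an exp-based formulation can be the logistic in another form.

```python
import math


def logistic(x):
    """Numerically stable logistic (sigmoid) of one raw score."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return e / (1.0 + e)


def apply_logistic(logits):
    """Apply the logistic to every value of a 2-D map given as nested lists."""
    return [[logistic(v) for v in row] for row in logits]


# Hypothetical raw scores for a 2x2 patch of one output channel.
raw = [[-2.0, 0.0], [3.0, 10.0]]
probs = apply_logistic(raw)
```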

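Question about training data

Hello, I tried the provided table-line segmentation model and it works very well. What data was the model trained on, and could you share it? Many thanks!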

Error when calling

On Ubuntu, running python3 table.py -jpgPath test/dd.jpg reports:

Traceback (most recent call last):
  File "table.py", line 22, in <module>
    from darknet import load_net,predict_image,array_to_image
  File "/data/services/table-ocr/darknet.py", line 52, in <module>
    lib = CDLL(darkRoot, RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ../darknet/libdarknet.so: cannot open shared object file: No such file or directory

Could you take a look at what is going on?
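
The path in the error message is relative (../darknet/libdarknet.so), so it is resolved against the directory the command is run from, not against the location of darknet.py. A sketch of resolving the path explicitly and checking that the file exists before handing it to CDLL follows. The helper names `resolve_library` and `load_library` are hypothetical and not part of table-ocr; `darkRoot` is the variable shown in the traceback.

```python
import os
from ctypes import CDLL, RTLD_GLOBAL


def resolve_library(path, base_dir):
    """Return a normalized path, interpreting a relative `path` against `base_dir`."""
    full = path if os.path.isabs(path) else os.path.join(base_dir, path)
    return os.path.normpath(full)


def load_library(path, base_dir):
    """Load the shared library, failing early with a readable message if it is missing."""
    full = resolve_library(path, base_dir)
    if not os.path.isfile(full):
        raise FileNotFoundError(
            f"{full} not found; build darknet first or correct the configured path"
        )
    return CDLL(full, RTLD_GLOBAL)
```

With `base_dir` set to the directory of the script, for example `os.path.dirname(os.path.abspath(__file__))`, the result no longer depends on the current working directory. If the file is still missing, darknet has most likely not been built at that location.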

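Cannot install the opencv-contrib-python==4.0.0.21 dependency, so testing is not possible

The opencv-contrib-python==4.0.0.21 dependency cannot be installed, so table.py cannot be run.

This happens when running "python3 table.py -jpgPath test/dd.jpg".

I checked and found that the dependency is missing, but it cannot be installed. Is there a solution to this problem?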

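Mask images made with labelme are severely distorted

The original image is 1600*2400. The mask has to be resized for training, but when it is displayed after resizing it is badly distorted. Has the author run into this problem?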
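
One possible cause, stated here as an assumption and not as the author's diagnosis, is that thin line annotations fall between the sampled pixels when a large mask is shrunk. The toy example below contrasts plain subsampling with a block-maximum reduction on a binary mask; both function names are made up for this sketch.

```python
def downsample_nearest(mask, factor):
    """Keep every `factor`-th pixel; thin lines between samples are dropped."""
    return [row[::factor] for row in mask[::factor]]


def downsample_max(mask, factor):
    """Each output pixel is the maximum of a factor x factor block, so thin lines survive."""
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(0, h, factor):
        out_row = []
        for x in range(0, w, factor):
            block = [mask[yy][xx]
                     for yy in range(y, min(y + factor, h))
                     for xx in range(x, min(x + factor, w))]
            out_row.append(max(block))
        out.append(out_row)
    return out


# 8x8 mask with a one-pixel-thick horizontal line on row 5.
mask = [[1 if y == 5 else 0 for _ in range(8)] for y in range(8)]
small_nearest = downsample_nearest(mask, 4)  # the line disappears
small_max = downsample_max(mask, 4)          # the line is kept
```

Drawing the annotations thicker before resizing, or rasterizing them directly at the training resolution, are other ways to get a similar effect.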

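Table cell recognition

Hello, could you explain how you annotate and recognize table cells?

A question for the author: the UNET output is pixels, so how are pixels converted into lines?

As I understand it, storing a line only requires its start and end coordinates, so how are the pixels output by UNET converted into start and end coordinates?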
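
One generic way to get from a thresholded line mask to coordinates is to scan for runs of set pixels. The sketch below does this for horizontal lines only and is not taken from table.py; the function name and the `min_length` parameter are illustrative. A practical version would usually also merge runs on adjacent rows that belong to the same thick line, for example with connected-component labeling.

```python
def horizontal_segments(mask, min_length=2):
    """Return (x_start, y, x_end, y) for each run of set pixels in each row."""
    segments = []
    for y, row in enumerate(mask):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right edge is closed.
        for x, value in enumerate(list(row) + [0]):
            if value and start is None:
                start = x
            elif not value and start is not None:
                if x - start >= min_length:
                    segments.append((start, y, x - 1, y))
                start = None
    return segments


mask = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
]
segments = horizontal_segments(mask)
```

Vertical lines can be handled by applying the same function to the transposed mask and swapping the coordinates in the result.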

About labeling

When using labelme to label an image, should each table line be drawn as a straight line or as a thin box?

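Parsing tables without ruling lines

Is parsing of tables without ruling lines supported?

Table reconstruction, Excel output

Hello, what is the approach for reconstructing the table and outputting json/excel?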
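
For a regular grid, reconstruction can be as simple as pairing up neighbouring row and column lines. The following sketch assumes the line positions are already known and that there are no merged cells; `build_cells` and its field names are invented for the example and do not describe how table-ocr does it.

```python
import json


def build_cells(row_ys, col_xs):
    """Cells of a regular grid bounded by horizontal (y) and vertical (x) line positions."""
    ys, xs = sorted(row_ys), sorted(col_xs)
    cells = []
    for r in range(len(ys) - 1):
        for c in range(len(xs) - 1):
            cells.append({
                "row": r,
                "col": c,
                "box": [xs[c], ys[r], xs[c + 1], ys[r + 1]],  # x0, y0, x1, y1
                "text": "",  # to be filled in by an OCR step
            })
    return cells


cells = build_cells(row_ys=[0, 40, 80], col_xs=[0, 100, 250])
as_json = json.dumps(cells, ensure_ascii=False)
```

The same row and col indices can drive a spreadsheet writer; the standard csv module is enough for a file that Excel opens, while a native .xlsx file needs a third-party library.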

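Can this table be detected?

Could table coordinate reconstruction and OCR filling be implemented as two separate steps? That would make it possible to plug in different OCR engines.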
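
The separation asked for here can be expressed by passing the OCR engine in as a callable. This is a design sketch only: `fill_cells`, `Box` and `dummy_ocr` are hypothetical names, and the stand-in engine returns the box size so that the example runs without any OCR library.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x0, y0, x1, y1


def fill_cells(boxes: List[Box], recognize: Callable[[Box], str]) -> List[Dict]:
    """Step 2: attach text to the cell boxes from step 1, using any OCR backend."""
    return [{"box": box, "text": recognize(box)} for box in boxes]


def dummy_ocr(box: Box) -> str:
    """Stand-in engine; a real one would crop the image to `box` and run recognition."""
    x0, y0, x1, y1 = box
    return f"{x1 - x0}x{y1 - y0}"


filled = fill_cells([(0, 0, 100, 40), (100, 0, 250, 40)], dummy_ocr)
```

Swapping engines then means passing a different function, with no change to the code that finds the cells.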

The weight file download link no longer works

Parts of the table are missing when finding lines

Ubuntu 16.04
python3.7
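No GPU; darknet is compiled and darkRoot is set in config
When finding lines, the cells in the last column are missing. Should the parameters in this call be changed: rboxes,ColsLines,RowsLines = get_table_ceilboxes(img,prob=0.5,row=10,col=10,alph=10)
Or is it related to the trained model?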

OP, make: *** No rule to make target 'obj', needed by 'all'. Stop.

OP, I ran into this problem: make: *** No rule to make target 'obj', needed by 'all'. Stop.
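
The message says that the `all` target depends on something named obj, and make has neither a rule to build it nor an existing file or directory with that name. Since make accepts an existing directory as satisfying such a prerequisite, one workaround to try is creating it before running make. This is offered as an assumption, not as a confirmed fix for this repository's Makefile, and `ensure_build_dirs` is a name invented for the sketch.

```python
import os
import subprocess


def ensure_build_dirs(source_dir, names=("obj",)):
    """Create the directories make expects to find; return their paths."""
    created = []
    for name in names:
        path = os.path.join(source_dir, name)
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


def build(source_dir):
    """Create the missing directory, then run `make` inside source_dir."""
    ensure_build_dirs(source_dir)
    subprocess.run(["make"], cwd=source_dir, check=True)
```

If make then complains about another missing target of the same kind, its name can be added to `names`.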

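Question: can the unet be ported to mobile?

Can the unet be ported to mobile devices, or is there another mobile network whose segmentation quality on table datasets is comparable to unet?
Any answer or pointer would be appreciated.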

-

The model file download no longer works; could the author post a new download link?

OP, a question about running python3 table.py -jpgPath test/dd.jpg

batch: Using default '1'
learning_rate: Using default '0.001000'
momentum: Using default '0.900000'
subdivisions: Using default '1'
policy: Using default 'constant'
max_batches: Using default '0'
layer filters size input output
0 conv 16 3 x 3 / 1 512 x 512 x 3 -> 512 x 512 x 16 0.226 BFLOPs
1 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
2 max 2 x 2 / 2 512 x 512 x 16 -> 256 x 256 x 16
3 conv 32 3 x 3 / 1 256 x 256 x 16 -> 256 x 256 x 32 0.604 BFLOPs
4 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
5 max 2 x 2 / 2 256 x 256 x 32 -> 128 x 128 x 32
6 conv 64 3 x 3 / 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BFLOPs
7 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
8 max 2 x 2 / 2 128 x 128 x 64 -> 64 x 64 x 64
9 conv 128 3 x 3 / 1 64 x 64 x 64 -> 64 x 64 x 128 0.604 BFLOPs
10 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
11 max 2 x 2 / 2 64 x 64 x 128 -> 32 x 32 x 128
12 conv 256 3 x 3 / 1 32 x 32 x 128 -> 32 x 32 x 256 0.604 BFLOPs
13 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
14 max 2 x 2 / 2 32 x 32 x 256 -> 16 x 16 x 256
15 conv 512 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 512 0.604 BFLOPs
16 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
17 max 2 x 2 / 2 16 x 16 x 512 -> 8 x 8 x 512
18 conv 1024 3 x 3 / 1 8 x 8 x 512 -> 8 x 8 x1024 0.604 BFLOPs
19 conv 1024 3 x 3 / 1 8 x 8 x1024 -> 8 x 8 x1024 1.208 BFLOPs
20 upsample 2x 8 x 8 x1024 -> 16 x 16 x1024
21 route 16 20
22 conv 512 3 x 3 / 1 16 x 16 x1536 -> 16 x 16 x 512 3.624 BFLOPs
23 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
24 conv 512 3 x 3 / 1 16 x 16 x 512 -> 16 x 16 x 512 1.208 BFLOPs
25 upsample 2x 16 x 16 x 512 -> 32 x 32 x 512
26 route 13 25
27 conv 256 3 x 3 / 1 32 x 32 x 768 -> 32 x 32 x 256 3.624 BFLOPs
28 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
29 conv 256 3 x 3 / 1 32 x 32 x 256 -> 32 x 32 x 256 1.208 BFLOPs
30 upsample 2x 32 x 32 x 256 -> 64 x 64 x 256
31 route 10 30
32 conv 128 3 x 3 / 1 64 x 64 x 384 -> 64 x 64 x 128 3.624 BFLOPs
33 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
34 conv 128 3 x 3 / 1 64 x 64 x 128 -> 64 x 64 x 128 1.208 BFLOPs
35 upsample 2x 64 x 64 x 128 -> 128 x 128 x 128
36 route 7 35
37 conv 64 3 x 3 / 1 128 x 128 x 192 -> 128 x 128 x 64 3.624 BFLOPs
38 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
39 conv 64 3 x 3 / 1 128 x 128 x 64 -> 128 x 128 x 64 1.208 BFLOPs
40 upsample 2x 128 x 128 x 64 -> 256 x 256 x 64
41 route 4 40
42 conv 32 3 x 3 / 1 256 x 256 x 96 -> 256 x 256 x 32 3.624 BFLOPs
43 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
44 conv 32 3 x 3 / 1 256 x 256 x 32 -> 256 x 256 x 32 1.208 BFLOPs
45 upsample 2x 256 x 256 x 32 -> 512 x 512 x 32
46 route 1 45
47 conv 16 3 x 3 / 1 512 x 512 x 48 -> 512 x 512 x 16 3.624 BFLOPs
48 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
49 conv 16 3 x 3 / 1 512 x 512 x 16 -> 512 x 512 x 16 1.208 BFLOPs
50 conv 2 1 x 1 / 1 512 x 512 x 16 -> 512 x 512 x 2 0.017 BFLOPs
Loading weights from models/table.weights...Done!
It just ends here. What is the problem?
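
The log itself reports no error: every layer is listed and the weights finish loading. For anyone reading the layer table, the BFLOPs column is consistent with counting one multiply and one add per weight per output pixel. The formula below is inferred from the numbers in the log, not quoted from darknet's source, and `conv_bflops` is a name chosen for this sketch.

```python
def conv_bflops(out_w, out_h, filters, kernel, in_channels):
    """Billions of operations for one conv layer: 2 ops per weight per output pixel."""
    return 2.0 * out_w * out_h * filters * kernel * kernel * in_channels / 1e9


layer_0 = conv_bflops(512, 512, 16, 3, 3)      # log: 0.226
layer_1 = conv_bflops(512, 512, 16, 3, 16)     # log: 1.208
layer_22 = conv_bflops(16, 16, 512, 3, 1536)   # log: 3.624
layer_50 = conv_bflops(512, 512, 2, 1, 16)     # log: 0.017
```

The 1536 input channels of layer 22 come from the preceding route layer, which concatenates layer 16 (512 channels) with layer 20 (1024 channels).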

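What kinds of lines can be segmented?

Can it only segment visible horizontal and vertical lines? Can it also segment invisible horizontal and vertical lines?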

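Build problem

After copying the Makefile from table-ocr and running make, the following error appears
make: *** No rule to make target 'obj', needed by 'all'. Stop.

In the get_table_rowcols function in table.py, why are out[0] and out[1] chosen for rows and columns respectively?

Hello author, sorry to bother you. The question is as stated in the title.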

Build error

Running git clone https://github.com/pjreddie/darknet.git ../darknet cp Makefile ../darknet cd ../darknet && make
shows fatal: Too many arguments.
Is there something wrong with the command?
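
As pasted, the three commands sit on one line, so git clone receives cp, Makefile, cd and the rest as extra arguments, which would explain the "Too many arguments" failure. Running them one at a time avoids that. The sketch below expresses the same sequence in Python; `build_steps` and `run_steps` are illustrative names, and the `runner` parameter exists only so the sequence can be inspected without executing anything.

```python
import subprocess

DARKNET_URL = "https://github.com/pjreddie/darknet.git"


def build_steps(target="../darknet"):
    """The three commands from the issue, one argument list per command."""
    return [
        ["git", "clone", DARKNET_URL, target],
        ["cp", "Makefile", target],
        ["make", "-C", target],  # same effect as `cd <target> && make`
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order, stopping at the first one that fails."""
    for cmd in steps:
        runner(cmd, check=True)
```

In a shell, the equivalent is to put each command on its own line or to join all three with &&.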

What dataset was used for training?

Runtime error, please help!

Traceback (most recent call last):
  File "table.py", line 209, in <module>
    rboxes,ColsLines,RowsLines = get_table_ceilboxes(img,prob=0.5,row=10,col=10,alph=10)
  File "table.py", line 173, in get_table_ceilboxes
    RowsLines,ColsLines=get_table_rowcols(img,prob,row,col)
  File "table.py", line 94, in get_table_rowcols
    xmin,xmax = indX.min(),indX.max()
  File "/home/ubuntu/anaconda3/envs/ai/lib/python3.6/site-packages/numpy/core/_methods.py", line 32, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial)
ValueError: zero-size array to reduction operation minimum which has no identity
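
The ValueError means that .min() was called on an empty array. Assuming indX holds the x coordinates of pixels whose predicted probability exceeds prob (an assumption, since table.py is not shown here), an empty array would mean that no pixel passed the threshold for this image. A guard like the following turns that case into an explicit result; `index_bounds` is a hypothetical helper, not a function in table.py.

```python
def index_bounds(indices):
    """Return (min, max) of the indices, or None when nothing was detected."""
    if len(indices) == 0:
        return None
    return min(indices), max(indices)


bounds = index_bounds([7, 3, 9])
empty = index_bounds([])
```

When the result is None, the caller can report that no table lines were found, which points toward checking the input image, the threshold, or whether the weights loaded correctly.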

OSError: [WinError 126] The specified module could not be found.

Could someone explain what this error means?
Traceback (most recent call last):
  File "table.py", line 22, in <module>
    from darknet import load_net,predict_image,array_to_image
  File "D:\table-ocr-master\table-ocr-master\darknet.py", line 52, in <module>
    lib = CDLL(darkRoot, RTLD_GLOBAL)
  File "D:\python3\lib\ctypes\__init__.py", line 356, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.