chineseocr / table-detect

table detect(yolo) , table line(unet)

License: MIT License

Python 100.00%

table-detect table-line tensorflow2

table-detect's Introduction

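This project implements Chinese natural-scene text detection and recognition based on yolo3 and crnn.

Optimized darknet version: https://github.com/chineseocr/darknet-ocr.git

Training code (master branch)

OCR training dataset

OCR CTC training dataset (archive password: chineseocr)
Baidu Netdisk: https://pan.baidu.com/s/1UcUKUUELLwdM29zfbztzdw (extraction code: atwn)
gofile: http://gofile.me/4Nlqh/uT32hAjbx (password: https://github.com/chineseocr/chineseocr)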

Features

Environment setup

GPU deployment: see setup.md
CPU deployment: see setup-cpu.md

Download and build darknet (this step can be skipped if you use opencv dnn or keras yolo3 directly)

git clone https://github.com/pjreddie/darknet.git
mv darknet chineseocr/
## To build with GPU and cuDNN support, edit the Makefile:
#GPU=1
#CUDNN=1
#OPENCV=0
#OPENMP=0
make

Edit darknet/python/darknet.py, line 48:
root = '/root/'  ## directory that contains chineseocr
lib = CDLL(root+"chineseocr/darknet/libdarknet.so", RTLD_GLOBAL)
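
The hard-coded root can be avoided by building the library path from a configurable base directory. The following is a minimal sketch that assumes the directory layout shown above. The `CHINESEOCR_ROOT` environment variable and the helpers `darknet_lib_path` and `load_darknet` are hypothetical names for illustration and are not part of darknet.py.

```python
import os
from ctypes import CDLL, RTLD_GLOBAL


def darknet_lib_path(root=None):
    """Return the expected location of libdarknet.so under `root`."""
    # CHINESEOCR_ROOT is a hypothetical variable; the fallback is the '/root/' used above.
    root = root or os.environ.get("CHINESEOCR_ROOT", "/root/")
    return os.path.join(root, "chineseocr", "darknet", "libdarknet.so")


def load_darknet(root=None):
    """Load the compiled darknet library, failing early with a clear message."""
    path = darknet_lib_path(root)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"libdarknet.so not found at {path}; build darknet with make first"
        )
    return CDLL(path, RTLD_GLOBAL)
```

Failing before the `CDLL` call makes a missing build easier to tell apart from a library that exists but cannot be linked.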

Download the model files

Model file locations:

Baidu Netdisk: https://pan.baidu.com/s/1gTW9gwJR6hlwTuyB6nCkzQ
Other links: http://gofile.me/4Nlqh/fNHlWzVWo
Copy all files in the folder into the models directory.

Model conversion (optional)

pytorch ocr to keras ocr

python tools/pytorch_to_keras.py -weights_path models/ocr-dense.pth -output_path models/ocr-dense-keras.h5

darknet to keras

python tools/darknet_to_keras.py -cfg_path models/text.cfg -weights_path models/text.weights -output_path models/text.h5

keras to darknet

python tools/keras_to_darknet.py -cfg_path models/text.cfg -weights_path models/text.h5 -output_path models/text.weights

Model selection

See config.py.

Build the docker image

## Download the Anaconda3 Python installer (https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh) and place it in the chineseocr directory
## Build the image
docker build -t chineseocr .
## Start the service
docker run -d -p 8080:8080 chineseocr /root/anaconda3/bin/python app.py

Start the web service

cd chineseocr  ## enter the chineseocr directory
python app.py 8080  ## 8080 is the port number; any port can be used

Access the service

http://127.0.0.1:8080/ocr

Recognition results

References

yolo3 https://github.com/pjreddie/darknet.git
crnn https://github.com/meijieru/crnn.pytorch.git
ctpn https://github.com/eragonruan/text-detection-ctpn
CTPN https://github.com/tianzhi0549/CTPN
keras yolo3 https://github.com/qqwweee/keras-yolo3.git
darknet/keras model conversion reference: https://www.cnblogs.com/shouhuxianjian/p/10567201.html
Language model implementation https://github.com/lukhy/masr


table-detect's Issues

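Line detection is inaccurate

After training, testing on the training images gives accurate line detection.

But detection is inaccurate again on newly taken photos. What could be the cause?

Weight file download link is dead

The link http://59.110.234.163:9990/static/models/table-detect/ is dead. Could you upload the files to Baidu Netdisk? Many thanks.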

Issue with accuracy and loss during training

Hello,

I tried training a model on my own data after annotating it with label-me. During training, the loss does not converge and the accuracy does not increase; it stays within a narrow range (0.45 to 0.55 for accuracy). Any idea why this is happening?

Error

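Hello, how can I resolve a mismatch between the model and the cfg file?

Training dataset

Dear author,
Hello.
I downloaded your pretrained table-line recognition model and it works quite well. Which dataset did you use to train the table-line model? I see that your code directory contains 5 images with annotations for training table lines. Was the model really trained on just these 5 images?

Memory is not released during testing; how do I set batch_size?

Where do I set batch_size at inference time? With many test images, memory keeps growing and is not released. I am running inference on CPU.

Question about datasets

First of all, thank you very much for sharing. I tried the table recognition and, with slightly modified post-processing, the results are excellent.
If I want to do transfer learning, is there a public dataset I can use to train the unet?
I downloaded TableBank and others, but their labels are either for table detection or for predicting relationships between table cells.
Surely we don't all have to create the segmentation images by hand with LabelImage? (just kidding)

Are the annotations straight lines, or long, thin rectangles?

Borderless tables

The model does not support segmenting borderless tables, right?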

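Choice of sigmoid in the unet backbone

Hello author, thank you for your excellent work on table recognition. I have a question:
classify = Conv2D(num_classes, (1, 1), activation='sigmoid')(up0a)
Since num_classes is 2 here, which makes this a multi-class problem, why is sigmoid used as the activation function?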
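
The issue does not include an answer. One possible reading, which is an assumption here and not confirmed by the author, is that the two output channels stand for horizontal and vertical lines. A pixel where the two cross belongs to both, so the task would be multi-label, not mutually exclusive multi-class. The sketch below uses made-up logits for such a pixel and compares per-channel sigmoid with softmax.

```python
import math


def sigmoid(z):
    """Independent probability for one channel."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Probabilities over channels that are forced to sum to 1."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one pixel at a row/column crossing: both channels fire.
logits = [3.0, 2.5]

per_channel = [sigmoid(z) for z in logits]  # each channel decided on its own
exclusive = softmax(logits)                 # channels compete with each other

# With sigmoid both channels can exceed 0.5 at the same pixel.
both_on_sigmoid = all(p > 0.5 for p in per_channel)
# With softmax over two channels at most one can exceed 0.5.
both_on_softmax = all(p > 0.5 for p in exclusive)
```

Under this reading, softmax would force every crossing pixel to pick one direction, which would leave gaps in one of the two line masks.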

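Converting an incrementally trained .h5 model file to a .weights file

As the title says: how do I convert an incrementally trained .h5 model file to a .weights file?

Hello, when I train for table structure, the results on high-resolution tables are not very good. Is there a way to adjust the training parameters? My recognition results are attached below.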

Hello everyone: how do I add code to main in train.py to select the GPU? By default the code runs on CPU

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
This has no effect. I am using the pubtabnet dataset.
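
These variables are read when CUDA is initialized, so in practice they have to be set before tensorflow is imported. The sketch below sets them first and then asks TensorFlow which GPUs it can see. The helper names `select_gpu` and `visible_gpus` are hypothetical and not part of train.py. If the list comes back empty, the cause is commonly a CPU-only TensorFlow build or a CUDA/cuDNN version mismatch, which these variables cannot fix.

```python
import os


def select_gpu(device_ids="0"):
    """Expose only the given GPU ids to CUDA. Call before importing tensorflow."""
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = device_ids


def visible_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf  # imported late so the variables above are already set
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    select_gpu("0")
    print("GPUs visible to TensorFlow:", visible_gpus())
```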

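Where is the training code for the outer table border? Or how does the training example crop the image to the outer table border?

From the code, it looks like table_line is the weight model for the lines inside a table, while table_detect.cfg and the weights file are the model for detecting the outer table border. Is this guess correct? If so, where is the training code for the outer table border? Or how does the training example crop the image to the outer table border? Looking forward to an answer.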

Datasource / Paper?

Hi,

I am keen to know if there is any research information available about the line detection model. Is the model based on a paper? How and where did you get the training data set?

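About segmentation

Hello, if I skip detection and segment directly, why are column separator lines like this produced? Can borderless tables be segmented as well?

Why is the data annotated this way?

The border lines are not annotated, and the horizontal and vertical lines are not each annotated as a single straight line. Should a line simply be annotated as a straight line between two points?

What line width in pixels is appropriate for horizontal and vertical lines during training?

In train.py the line width appears to be set to 1 pixel.

How should the function fix_table_box_for_table_line be understood?

What is the basis for the formula padx = (xmax - xmin) * (1 - prob)?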
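
The issue does not state the rationale. Read literally, the formula makes the horizontal margin proportional to the box width and to the detector's uncertainty (1 - prob), so a less confident box is enlarged more before line detection. The sketch below works through the arithmetic. The helper `pad_box`, the matching `pady` term, the symmetric padding, and the clipping to the image size are all assumptions for illustration, not the repository's code.

```python
def pad_box(box, prob, img_w, img_h):
    """Enlarge a detected box by a margin that grows as confidence drops."""
    xmin, ymin, xmax, ymax = box
    padx = (xmax - xmin) * (1 - prob)  # formula quoted in the issue
    pady = (ymax - ymin) * (1 - prob)  # assumed vertical counterpart
    # Assumed: pad both sides and keep the box inside the image.
    return (
        max(xmin - padx, 0),
        max(ymin - pady, 0),
        min(xmax + padx, img_w),
        min(ymax + pady, img_h),
    )


# A 200 x 100 box detected with confidence 0.75 inside a 400 x 200 image:
# padx = 200 * 0.25 = 50 and pady = 100 * 0.25 = 25.
padded = pad_box((100, 50, 300, 150), 0.75, 400, 200)
```

With prob close to 1 the box is left almost unchanged, and with prob = 0.5 it grows by half its size on every side before clipping.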

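Data annotation question

Hello, I have a question about data annotation. I see that tablenet uses two classes; how should pixels be classified where horizontal and vertical lines overlap? @wenlihaoyu

Training accuracy does not improve

Why does accuracy not improve when I train on my own data? It oscillates around 0.5.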

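Detected horizontal and vertical lines are broken in the middle, or whole lines are not fully detected and only partial segments are found, as shown below

I annotated 500 images and retrained the tableline model. The annotated data include hand-photographed table images. For table lines in images distorted by the camera, a single straight annotation may not fully overlap a given horizontal line, so such a line may have been annotated as two or three segments. After 10 epochs, accuracy on the training set stabilized at about 0.99 and loss at about 0.019. However, when I run inference with the trained model on the training set, line detection is poor: for a given line, only part of it may be detected, not the whole line.
Is this a problem with the annotated training data? Must the horizontal and vertical lines in the training data be perfectly horizontal and vertical? For images with camera distortion, must a distorted horizontal line still be annotated as a single line, or can it be split into two or three segments?
Is the cause too little training data, a problem with the annotations, or too few training epochs? How much data was the released model trained on?

Is there another way to download the model weights? I cannot open http://gofile.me/4Nlqh/fNHlWzVWo.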

table_ceil.py raises an error when using my own training data

In the author's train.py, training is transfer learning on top of table-line.h5. I used my own data and, in table_line.py, commented out model.load_weights(tableModeLinePath). With the resulting model, running prediction with table_ceil.py raises an error. Running table_line.py raises no error, but the predicted image contains no lines.
GeForce GTX 1080 Ti, pci bus id: 0000:00:06.0, compute capability: 6.1)
1 Physical GPUs, 1 Logical GPUs
2020-10-15 17:30:16.059743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
Traceback (most recent call last):
  File "table_ceil.py", line 105, in <module>
    tableDetect = table(img)
  File "table_ceil.py", line 25, in __init__
    self.table_ceil()  ## table cell localization
  File "table_ceil.py", line 71, in table_ceil
    ceilboxes[:, [0, 2, 4, 6]] += xmin
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
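
The IndexError is consistent with ceilboxes being empty: an empty list converted with NumPy becomes a 1-dimensional array of length 0, which cannot be indexed with two indices. That matches the report that no lines are predicted, although the issue does not confirm it. The sketch below guards the offset step. The helper `shift_cell_boxes`, the 8-values-per-box layout (inferred from the indices 0, 2, 4, 6), and the `ymin` shift are assumptions, not the repository's code.

```python
import numpy as np


def shift_cell_boxes(ceilboxes, xmin, ymin):
    """Offset cell corner coordinates, tolerating an empty detection result."""
    boxes = np.asarray(ceilboxes, dtype=float)
    if boxes.size == 0:
        # No cells were found: return an empty (0, 8) array instead of failing.
        return np.zeros((0, 8), dtype=float)
    boxes = boxes.reshape(-1, 8).copy()  # copy so the caller's data is untouched
    boxes[:, [0, 2, 4, 6]] += xmin       # x coordinates of the four corners
    boxes[:, [1, 3, 5, 7]] += ymin       # y coordinates (assumed counterpart)
    return boxes
```

The guard only removes the crash. The empty result still has to be traced back to the model that predicts no lines.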

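Hello, the code is great. For phone photos where the lines are bent, not straight, how should they be annotated? As many short segments? Can this approach handle such cases?

In table_line, after the model is loaded, memory grows sharply while computing model.predict(np.array([np.array(inputBlob) / 255.0]))

During horizontal and vertical table line detection in table_line, after the model is loaded, memory grows sharply while computing model.predict(np.array([np.array(inputBlob) / 255.0])), but it returns to normal once the result is available. Is it normal for memory to spike during the computation, sometimes by as much as 0.7 GB? Please help.

How can the detected table boxes be sorted row by row instead of being returned in random order?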
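
The issue has no answer. One common approach is to group boxes whose vertical centers are close into the same row and then sort each row from left to right. The sketch below assumes boxes given as (xmin, ymin, xmax, ymax). The helper `sort_boxes_by_row` and the `y_tol` tolerance are hypothetical, and the tolerance needs tuning to the row height of the actual tables.

```python
def sort_boxes_by_row(boxes, y_tol=10.0):
    """Order boxes top to bottom by row, and left to right within each row."""

    def y_center(box):
        return (box[1] + box[3]) / 2.0

    rows = []
    for box in sorted(boxes, key=y_center):
        # Same row if the vertical center is close to the row's first box.
        if rows and abs(y_center(box) - y_center(rows[-1][0])) <= y_tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Flatten the rows, sorting each one by its left edge.
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


cells = [(110, 52, 200, 98), (0, 50, 100, 100), (0, 100, 100, 150), (110, 101, 200, 149)]
ordered = sort_boxes_by_row(cells)
```

Sorting by (ymin, xmin) alone tends to break when boxes in the same row differ by a few pixels in height, which is why the sketch groups rows with a tolerance first.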

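Recognition problem

After detecting the cells and running recognition on their contents, the table borders are recognized as text. How can this be solved?

Has anyone hit the following error when running train.py? The h5 file is placed under the model folder. What is the fix?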

OSError: Unable to open file (unable to open file: name = 'models/table-line.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
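
The path in the error is relative, so it is resolved against the current working directory, and it names a directory called models, while the question mentions a model folder. Either a different directory name or a different working directory would produce errno 2, though the issue does not say which applies. The sketch below checks the path before loading. The helper `resolve_weights` is hypothetical and not part of train.py.

```python
import os


def resolve_weights(path="models/table-line.h5", base_dir=None):
    """Return the absolute weights path, or raise with the location that was tried."""
    base_dir = base_dir or os.getcwd()  # relative paths depend on the working directory
    full = os.path.abspath(os.path.join(base_dir, path))
    if not os.path.isfile(full):
        raise FileNotFoundError(
            f"weights not found at {full}; check the directory name "
            f"and the directory the script is started from"
        )
    return full
```

Passing the returned absolute path to the loader removes the dependence on where the script is started.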

The adjust_lines function in utils.py has a bug that makes line_to_line fail with a zero denominator

table-detect/utils.py

Line 180 in 92488f3

r = sqrt((x2, y2), (x3, y3))

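An example that reliably triggers the error:
1. ColsLines[i] = (189, 5, 189, 92); ColsLines[j] = (202, 92, 202, 396), i.e. (x2, y2) = (189, 92) and (x3, y3) = (202, 92), so r = sqrt((x2, y2), (x3, y3)) < alph holds.
2. newColsLines.append([x2, y2, x3, y3]) is then executed, but [189, 92, 202, 92] is clearly a horizontal line.
3. Once a horizontal line is mixed into newColsLines, line_to_line fails whenever the other horizontal line paired with [189, 92, 202, 92] has the same y value, because the line coefficient A computed in fit_line is 0.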

table-detect/utils.py

Line 228 in 92488f3

A = y2 - y1

4. As a result, the denominator "A1B2-A2B1" is 0, which raises the error.

table-detect/utils.py

Line 255 in 92488f3

x = (B1 * C2 - B2 * C1) / (A1 * B2 - A2 * B1)

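5. Likewise, the three branches at lines 152, 156, and 176 of adjust_lines all trigger the error above.

Cause:
In the RowsLines loop, two row lines can in a special case produce a vertical col_line. A check is needed so that it is added to newColsLines, not to newRowsLines. The same applies to the ColsLines loop.

Fix: check whether each newly added segment is horizontal or vertical.

# Other cases are handled the same way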
r = sqrt((x1, y1), (x4, y4))
delta_x = x1 - x4
if r < alph:
    if delta_x == 0.0:
        newColsLines.append([x1, y1, x4, y4])
    else:
        newRowsLines.append([x1, y1, x4, y4])
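
The zero denominator can be reproduced in isolation. The sketch below writes each line in the general form Ax + By + C = 0 with A = y2 - y1 as quoted above; the B and C terms are the standard completion of that form and are an assumption about fit_line. The `intersect` helper is hypothetical. It returns None for parallel lines as a defensive guard that complements, and does not replace, the classification fix proposed above.

```python
def fit_line(p1, p2):
    """Coefficients (A, B, C) of the line Ax + By + C = 0 through p1 and p2."""
    x1, y1 = p1
    x2, y2 = p2
    A = y2 - y1            # as quoted from utils.py
    B = x1 - x2            # assumed standard form
    C = x2 * y1 - x1 * y2  # assumed standard form
    return A, B, C


def intersect(line1, line2):
    """Intersection of two segments' supporting lines, or None if they are parallel."""
    A1, B1, C1 = fit_line(line1[:2], line1[2:])
    A2, B2, C2 = fit_line(line2[:2], line2[2:])
    det = A1 * B2 - A2 * B1
    if det == 0:
        return None  # parallel or coincident: the division would fail here
    x = (B1 * C2 - B2 * C1) / det
    y = (A2 * C1 - A1 * C2) / det
    return x, y


# The stray horizontal segment from the issue against another line at y = 92.
parallel = intersect([189, 92, 202, 92], [0, 92, 100, 92])
# A vertical line x = 5 against a horizontal line y = 3.
crossing = intersect([5, 0, 5, 10], [0, 3, 10, 3])
```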

Training

Thank you for sharing. Is there a project for training the model?
