
bigppwong / idcardocr


Offline recognition of second-generation resident ID card information

License: GNU General Public License v3.0

Python 98.44% Dockerfile 1.56%
docker idcard ocr python linux opencv tesseract-ocr

idcardocr's Issues

Error when running the example

Running idcard_recognize.process('testimages/1.jpg') fails with:
OpenCV(3.4.2) /io/opencv/modules/imgproc/src/resize.cpp:4045: error: (-215:Assertion failed) !dsize.empty() || (inv_scale_x > 0 && inv_scale_y > 0) in function 'resize'.
What is the cause?
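
The assertion in this message fails when `cv2.resize` receives a `dsize` with a zero width or height and no positive `fx`/`fy` scale factors. One possible cause, not confirmed for this report, is a target size that rounds down to zero. A minimal sketch of a guard, assuming a hypothetical helper `target_size` that is not part of idcardocr:

```python
def target_size(width, height, scale):
    """Return a (width, height) tuple that is safe to pass as dsize to cv2.resize.

    Hypothetical helper for illustration; not part of idcardocr.
    """
    if width <= 0 or height <= 0:
        # An empty source usually means the image failed to load upstream.
        raise ValueError("source image is empty; check the image path")
    if scale <= 0:
        raise ValueError("scale must be positive, got %r" % (scale,))
    # Clamp to at least 1 pixel so the computed size is never empty.
    new_w = max(1, int(round(width * scale)))
    new_h = max(1, int(round(height * scale)))
    return new_w, new_h
```

Printing the source dimensions and the computed size just before the failing `resize` call should show which of the two conditions is violated.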

Error when running python3 idcardocr.py

python3 idcardocr.py
0.3333333333333333 1280
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
(If you have multiple ICDs installed and OpenCL works, you can ignore this message)
进入身份证光学识别流程...
Traceback (most recent call last):
File "idcardocr.py", line 527, in <module>
idocr = idcardocr(cv2.UMat(cv2.imread('./testimages/1.jpg')))
File "idcardocr.py", line 31, in idcardocr
result_dict['sex'] = get_sex(sex_pic)
File "idcardocr.py", line 310, in get_sex
return get_result_fix_length(red, 1, 'sex', '-psm 10')
File "idcardocr.py", line 423, in get_result_fix_length
result_string += pytesseract.image_to_string(cv2.UMat.get(color_img)[y:y + h, x:x + w], lang=langset, config=custom_config)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 294, in image_to_string
return run_and_get_output(*args)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 202, in run_and_get_output
run_tesseract(**kwargs)
File "/home/ddc/.local/lib/python3.5/site-packages/pytesseract/pytesseract.py", line 178, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v3.04.01 with Leptonica Error opening data file /usr/share/tesseract-ocr/tessdata/sex.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'sex' Tesseract couldn't load any languages! Could not initialize tesseract.')
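
The error says Tesseract could not open `sex.traineddata` in its tessdata directory. A minimal sketch for checking which language files are missing before calling pytesseract; the helper name `missing_traineddata` is hypothetical, the directory is taken from the error message, and the language list is an assumption:

```python
import os


def missing_traineddata(tessdata_dir, langs):
    """Return the language names whose <lang>.traineddata file is absent."""
    return [
        lang
        for lang in langs
        if not os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))
    ]


if __name__ == "__main__":
    # Directory from the error message above; the language list is an assumption.
    tessdata_dir = "/usr/share/tesseract-ocr/tessdata"
    print(missing_traineddata(tessdata_dir, ["sex"]))
```

If a name is reported missing, the corresponding `.traineddata` file needs to be copied into that directory, or `TESSDATA_PREFIX` set as the error message describes.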

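Bugs encountered after running

First, thank you for sharing this work. I downloaded and installed Docker, and the project started successfully.
1. Garbled output: the curl request succeeds, but the response consists of strings beginning with \u. I checked the locales available in the container: C, C.UTF-8, and POSIX. The common zh_CN.UTF-8 locale is missing, which may be what causes the abnormal response.
2. Some recognition errors: calling the service from Postman, I found that for some images the sex and ethnicity fields are recognized as “又”.
3. Are there any requirements on the size of the images to be recognized?
These are my notes from using it. I would be very grateful for any advice.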
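
On point 1: strings made of `\uXXXX` sequences are valid JSON escapes rather than a locale problem. Python's `json.dumps` escapes non-ASCII characters this way by default. A small sketch, assuming (not confirmed from the source) that the service serializes its result with `json.dumps`:

```python
import json

# Sample payload; the field names follow the result dicts shown in these issues.
result = {"name": "张岩", "error": 0}

escaped = json.dumps(result)                        # default: non-ASCII becomes \uXXXX
readable = json.dumps(result, ensure_ascii=False)   # keeps the characters as they are

# Either form decodes to the same data on the client side.
assert json.loads(escaped) == json.loads(readable) == result
```

A client that parses the response as JSON sees the original characters either way; `ensure_ascii=False` only changes how the raw response text looks.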

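Personal information exposed in the example

I found your repo very interesting! But I also noticed that personal information is exposed in your Postman example, so just a heads-up (233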

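Question

Hello! After installing the dependencies, I ran the following from the command line in the directory of the extracted source: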

import idcard_recognize
0.3333333333333333 1280
print (idcard_recognize.process('testimages/3.jpg'))
[ INFO:0] Initialize OpenCL runtime...
integer argument expected, got float
{'error': 1}
There is no other output, and I don't know how to locate the error. Please advise. Many thanks!
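
A message like "integer argument expected, got float" often appears when a value produced by `/` is passed to an API that requires an int, since `/` always returns a float in Python 3. Whether that is the cause here is an assumption. A minimal sketch of the difference, with illustrative dimensions and a hypothetical helper `to_pixels`:

```python
width, height = 1280, 960  # example dimensions, not taken from the project

half_float = width / 2   # true division: always a float in Python 3
half_int = width // 2    # floor division: stays an int for int operands


def to_pixels(value):
    """Round a computed coordinate to an int before passing it to an int-only API."""
    return int(round(value))
```

Because the library only returns `{'error': 1}`, temporarily printing the full traceback at the point where the exception is caught would show which call receives the float.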

Error on access after deploying and starting with Docker

My server:
CentOS Linux release 7.6.1810 (Core)
Linux CentOS 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

docker log output

{'boundary': '----WebKitFormBoundaryUUqfbFmE5SfBknvh'}
进入身份证模版匹配流程...

Tested with Postman:
[screenshot: TIM截图20190513200708]

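The final result does not include the trailing 'X' of the ID number

The ID number in testimages/15.jpg ends with an X, but the final result returned by the program does not contain it (the intermediate printed output does). While debugging, I found that the regular expression in the prunc_filter method in idcardocr.py does not seem to account for the X, as shown in the screenshot.
[image]

Here is my test code:

import idcard_recognize
print(idcard_recognize.process('testimages/15.jpg'))

Here is the output: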

0.3333333333333333 1280
进入身份证模版匹配流程...
查找身份证耗时:1553
进入身份证光学识别流程...
name
姜璐
sex

nation

address
辽宁省大连市甘井子区海
茂路807号1-4一4
idnum
21021119821218141X --  the X is present here
{'sex': '', 'name': '姜璐', 'error': 0, 'nation': '', 'birth': '19821218', 'address': '辽宁省大连市甘井子区海茂路807号14一4', 'idnum': '21021119821218141'} -- the X is gone here
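
A minimal sketch of a character filter that keeps the trailing X. The function `keep_id_chars` is a hypothetical stand-in for illustration, not the project's `prunc_filter`, whose actual regular expression is only visible in the screenshot:

```python
import re


def keep_id_chars(text):
    """Drop every character that cannot appear in an 18-character ID number.

    Hypothetical stand-in; keeps digits and X, and normalizes a lowercase x.
    """
    return re.sub(r"[^0-9Xx]", "", text).upper()
```

For example, `keep_id_chars(" 1234 5678x\n")` returns `"12345678X"`.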

tessdata

Could you provide the trained tessdata?

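Sex and ethnicity on the ID card are not recognized

This is a very nice library, but when using it I found that sex and ethnicity are not recognized for any of the images under testimages. What is the reason? My code is below.

import idcard_recognize
print(idcard_recognize.process('testimages/3.jpg'))

Here is the output: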

0.3333333333333333 1280
进入身份证模版匹配流程...
查找身份证耗时:664
进入身份证光学识别流程...
name
张岩
sex

nation

address
福建省南平市延平区黄墩
排垅巷21幢2室
idnum
350702198311280319
{'idnum': '350702198311280319', 'nation': '', 'birth': '19831128', 'sex': '', 'error': 0, 'name': '张岩', 'address': '福建省南平市延平区黄墩排垅巷21幢2室'}
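
As a stopgap while the OCR of the sex field returns an empty string, the value can be derived from the ID number itself: in an 18-character ID number the 17th digit is odd for male and even for female. This does not explain the OCR failure and does not help with ethnicity. A minimal sketch with a hypothetical helper `sex_from_idnum`:

```python
def sex_from_idnum(idnum):
    """Derive sex from the 17th digit of an 18-character ID number.

    Hypothetical helper; odd means male ('男'), even means female ('女').
    """
    if len(idnum) != 18 or not idnum[:17].isdigit():
        raise ValueError("expected an 18-character ID number")
    return "男" if int(idnum[16]) % 2 == 1 else "女"
```

Applied to the `idnum` in the output above, the 17th digit is 1, so the helper returns '男'.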
