mohuishou / imageocr

PHP CAPTCHA Recognition

License: MIT License

PHP 100.00%

php ocr image imageocr captcha

imageocr's Introduction

ImageOCR

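A PHP CAPTCHA recognition library. It recognizes non-touching characters well and handles ordinary touching characters reasonably well. The denoising algorithms support isolated-point removal and connected-component removal; the segmentation algorithms support equal-width splitting, connected-component splitting, and drop-fall splitting.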

Example results

Install

composer require mohuishou/image-ocr

Usage

See example for detailed examples.

use docker

docker run --rm -p 8088:8088 mohuishou/image-ocr

Open http://localhost:8088 to see the result.

Overall flow:

Initialization -> Greyscale -> Binarization -> Denoising -> Segmentation -> Normalization -> Recognition

Initialization

Object initialization

$image=new Image($img_path);
$image_ocr=new ImageOCR($image);

Set the binarization thresholds

$image_ocr->setMaxGrey(90);
$image_ocr->setMinGrey(10);

Set the width and height of the normalized image

$image_ocr->setStandardWidth(13);
$image_ocr->setStandardHeight(20);

Enable Debug

$image_ocr->setDebug(true);

Greyscale

try{
    $image_ocr->grey();
}catch (\Exception $e){
    echo $e->getMessage();
}
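
To make the greyscale step concrete, here is a minimal standalone sketch of what such a conversion usually does. It does not use the library: rgbToGrey and greyMatrix are hypothetical helper names, and the ITU-R BT.601 luma weights are an assumption, since the README does not say which formula grey() applies.

```php
<?php
/**
 * Convert one RGB pixel to a grey level in 0..255.
 * Assumption: ITU-R BT.601 luma weights; the library may weight differently.
 */
function rgbToGrey(int $r, int $g, int $b): int
{
    return (int) round(0.299 * $r + 0.587 * $g + 0.114 * $b);
}

/**
 * Convert a 2-D array of [r, g, b] triples into a 2-D array of grey levels.
 */
function greyMatrix(array $pixels): array
{
    $grey = [];
    foreach ($pixels as $y => $row) {
        foreach ($row as $x => [$r, $g, $b]) {
            $grey[$y][$x] = rgbToGrey($r, $g, $b);
        }
    }
    return $grey;
}

// A 2x2 image: white, black, pure red, pure blue.
$pixels = [
    [[255, 255, 255], [0, 0, 0]],
    [[255, 0, 0], [0, 0, 255]],
];
$grey = greyMatrix($pixels);
```

The resulting matrix of grey levels is the kind of input a threshold-based binarization step would then consume.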

Binarization

Note: this step requires the previous greyscale step to have been run first; otherwise an error is thrown.

try{
    $image_ocr->hash($max_grey=null,$min_grey=null);
}catch (\Exception $e){
    echo $e->getMessage();
}

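Binarization can be done in two ways. The first, $image_ocr->hash($max_grey=null,$min_grey=null), uses a fixed threshold range, as shown above. The second, hashByBackground($model=self::MAX_MODEL,$max_grey=null,$min_grey=null), derives the thresholds dynamically from the grey level of the background image. It supports three modes, MAX_MODEL, MIN_MODEL, and BG_MODEL (maximum, minimum, and background mode). Maximum mode replaces the upper threshold with the background grey level, minimum mode replaces the lower threshold, and background mode replaces both, which removes only the background.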
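
The following standalone sketch illustrates the two ingredients described above: a fixed-range threshold and a background grey estimate that could feed a dynamic threshold. It is not the library's code. binarize and backgroundGrey are hypothetical names, and two things are assumptions: that a pixel inside the inclusive range [min, max] counts as foreground (1), and that the background is taken to be the most frequent grey level.

```php
<?php
/**
 * Binarize a grey matrix: a pixel becomes 1 when its grey level lies inside
 * [$minGrey, $maxGrey], otherwise 0.
 * Assumption: inclusive range, 1 = foreground.
 */
function binarize(array $grey, int $minGrey, int $maxGrey): array
{
    $bits = [];
    foreach ($grey as $y => $row) {
        foreach ($row as $x => $level) {
            $bits[$y][$x] = ($level >= $minGrey && $level <= $maxGrey) ? 1 : 0;
        }
    }
    return $bits;
}

/**
 * Estimate the background grey level as the most frequent level.
 * Assumption: one plausible heuristic, not necessarily the library's.
 */
function backgroundGrey(array $grey): int
{
    $counts = array_count_values(array_merge(...$grey));
    arsort($counts); // highest count first, keys (grey levels) preserved
    return (int) array_key_first($counts);
}

$grey = [
    [200, 200, 200, 200],
    [200,  40,  60, 200],
    [200,  50, 200, 200],
];
$bits = binarize($grey, 10, 90); // same bounds as setMinGrey(10) / setMaxGrey(90)
$bg = backgroundGrey($grey);
```

A dynamic mode would substitute the estimated background level for one or both bounds before thresholding; exactly how the library does that substitution is not specified here.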

Denoising

Prerequisite: binarization

Isolated-point denoising

try{
    $image_ocr->removeSpots();
}catch (\Exception $e){
    echo $e->getMessage();
}
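
As an illustration of isolated-point denoising in general, the standalone sketch below clears any foreground pixel that has too few foreground pixels among its 8 neighbours. removeIsolatedPoints and its $minNeighbours parameter are hypothetical; the rule removeSpots() actually applies may differ.

```php
<?php
/**
 * Clear foreground pixels (1) that have fewer than $minNeighbours foreground
 * pixels among their 8 neighbours. Neighbour counts are read from the input
 * matrix, so removals do not cascade within one pass.
 */
function removeIsolatedPoints(array $bits, int $minNeighbours = 1): array
{
    $out = $bits;
    foreach ($bits as $y => $row) {
        foreach ($row as $x => $v) {
            if ($v !== 1) {
                continue;
            }
            $n = 0;
            for ($dy = -1; $dy <= 1; $dy++) {
                for ($dx = -1; $dx <= 1; $dx++) {
                    if ($dy === 0 && $dx === 0) {
                        continue;
                    }
                    // Out-of-range neighbours count as background.
                    $n += $bits[$y + $dy][$x + $dx] ?? 0;
                }
            }
            if ($n < $minNeighbours) {
                $out[$y][$x] = 0;
            }
        }
    }
    return $out;
}

$bits = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
];
$clean = removeIsolatedPoints($bits);
```

The two lone pixels are cleared while the three-pixel stroke survives.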

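Connected-component denoising

[If you are going to use connected-component segmentation, you can skip connected-component denoising; segmentation removes the noise at the same time.]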

try{
    //The connected-component object must be initialized before use
    $image_ocr->setImageConnect();
    //Remove noise
    $image_ocr->removeSpotsByConnect();
}catch (\Exception $e){
    echo $e->getMessage();
}
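
For readers unfamiliar with connected-component denoising, here is a standalone sketch of the general idea: find each group of touching foreground pixels with a flood fill and drop the groups that are too small to be a character. removeSmallComponents and $minSize are hypothetical, and 8-connectivity is an assumption; this is not the code behind removeSpotsByConnect().

```php
<?php
/**
 * Remove 8-connected foreground components with fewer than $minSize pixels.
 * Components are collected with an iterative (stack-based) flood fill.
 */
function removeSmallComponents(array $bits, int $minSize): array
{
    $seen = [];
    foreach ($bits as $y => $row) {
        foreach ($row as $x => $v) {
            if ($v !== 1 || isset($seen[$y][$x])) {
                continue;
            }
            $component = [];
            $stack = [[$y, $x]];
            $seen[$y][$x] = true;
            while ($stack) {
                [$cy, $cx] = array_pop($stack);
                $component[] = [$cy, $cx];
                for ($dy = -1; $dy <= 1; $dy++) {
                    for ($dx = -1; $dx <= 1; $dx++) {
                        $ny = $cy + $dy;
                        $nx = $cx + $dx;
                        if (($bits[$ny][$nx] ?? 0) === 1 && !isset($seen[$ny][$nx])) {
                            $seen[$ny][$nx] = true;
                            $stack[] = [$ny, $nx];
                        }
                    }
                }
            }
            if (count($component) < $minSize) {
                foreach ($component as [$py, $px]) {
                    $bits[$py][$px] = 0; // too small: treat as noise
                }
            }
        }
    }
    return $bits;
}

$bits = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
];
$clean = removeSmallComponents($bits, 3);
```

Compared with the isolated-point rule, this also removes small multi-pixel specks, at the cost of one full traversal of the image.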

Segmentation

Non-touching characters

Connected-component segmentation

try{
    //The connected-component object must be initialized before use
    $image_ocr->setImageConnect();
    //Segment
    $image_ocr->splitByConnect();
}catch (\Exception $e){
    echo $e->getMessage();
}
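
The sketch below shows, independently of the library, how connected-component segmentation can work for non-touching characters: each component becomes one character, cropped to its bounding box and ordered left to right. splitByComponents is a hypothetical name, 8-connectivity is an assumption, and the arrow function requires PHP 7.4 or later.

```php
<?php
/**
 * Split a 0/1 matrix into one sub-matrix per 8-connected component,
 * ordered by each component's leftmost column.
 */
function splitByComponents(array $bits): array
{
    $seen = [];
    $parts = [];
    foreach ($bits as $y => $row) {
        foreach ($row as $x => $v) {
            if ($v !== 1 || isset($seen[$y][$x])) {
                continue;
            }
            $part = ['top' => $y, 'bottom' => $y, 'left' => $x, 'right' => $x, 'pixels' => []];
            $stack = [[$y, $x]];
            $seen[$y][$x] = true;
            while ($stack) {
                [$cy, $cx] = array_pop($stack);
                $part['pixels'][] = [$cy, $cx];
                $part['top'] = min($part['top'], $cy);
                $part['bottom'] = max($part['bottom'], $cy);
                $part['left'] = min($part['left'], $cx);
                $part['right'] = max($part['right'], $cx);
                for ($dy = -1; $dy <= 1; $dy++) {
                    for ($dx = -1; $dx <= 1; $dx++) {
                        $ny = $cy + $dy;
                        $nx = $cx + $dx;
                        if (($bits[$ny][$nx] ?? 0) === 1 && !isset($seen[$ny][$nx])) {
                            $seen[$ny][$nx] = true;
                            $stack[] = [$ny, $nx];
                        }
                    }
                }
            }
            $parts[] = $part;
        }
    }
    usort($parts, fn($a, $b) => $a['left'] <=> $b['left']);

    $chars = [];
    foreach ($parts as $part) {
        $h = $part['bottom'] - $part['top'] + 1;
        $w = $part['right'] - $part['left'] + 1;
        $char = array_fill(0, $h, array_fill(0, $w, 0));
        // Copy only this component's pixels, so overlapping boxes stay clean.
        foreach ($part['pixels'] as [$py, $px]) {
            $char[$py - $part['top']][$px - $part['left']] = 1;
        }
        $chars[] = $char;
    }
    return $chars;
}

$bits = [
    [0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
];
$chars = splitByComponents($bits);
```

Dropping components below a size limit before cropping would fold the denoising step into segmentation, which matches the note above that the two can be done together.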

Touching characters

Drop-fall segmentation

TODO: to be tested

Normalization

try{
    $standard_data=$image_ocr->standard();
}catch (\Exception $e){
    echo $e->getMessage();
}
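
One common way to normalize a segmented character is to rescale it to a fixed grid, which is what the standard width and height settings suggest. The standalone sketch below does this with nearest-neighbour sampling. normalize is a hypothetical helper, and whether standard() uses this sampling method, or does additional trimming, is an assumption that the README does not confirm.

```php
<?php
/**
 * Rescale a 0/1 matrix to $width x $height using nearest-neighbour sampling.
 * Expects a non-empty rectangular matrix with 0-based integer keys.
 */
function normalize(array $bits, int $width, int $height): array
{
    $srcH = count($bits);
    $srcW = count($bits[0]);
    $out = [];
    for ($y = 0; $y < $height; $y++) {
        $sy = intdiv($y * $srcH, $height); // source row for target row $y
        for ($x = 0; $x < $width; $x++) {
            $sx = intdiv($x * $srcW, $width); // source column for target column $x
            $out[$y][$x] = $bits[$sy][$sx];
        }
    }
    return $out;
}

$char = [
    [1, 0],
    [0, 1],
];
$std = normalize($char, 4, 4);
```

With every character on the same grid, two characters can be compared cell by cell regardless of their original size.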

Recognition

TODO: to be completed

API

ImageOCR::__construct(Image $image)
ImageOCR::saveImage($path)
ImageOCR::grey()
ImageOCR::hash($max_grey=null,$min_grey=null)
ImageOCR::hashByBackground($model=self::MAX_MODEL,$max_grey=null,$min_grey=null)
ImageOCR::removeSpots()
ImageOCR::removeSpotsByConnect()
ImageOCR::standard()
ImageOCR::setImageConnect()
ImageOCR::setImage(Image $image)
ImageOCR::getStandardData()
ImageOCR::setMaxGrey($max_grey)
ImageOCR::setMinGrey($min_grey)
ImageOCR::setStandardWidth($standard_width)
ImageOCR::setStandardHeight($standard_height)

//All ImageTool methods are static
ImageTool::removeZero($data)
ImageTool::removeZeroColumn($hash_data)
ImageTool::drawBrowser($data)
ImageTool::transposeAndRemoveZero($hash_data)
ImageTool::hashTranspose($hash_data)
ImageTool::img2hash($img)
ImageTool::hash2img($hash_data,$padding=0)

CHANGELOG

0.2 [2017-4-1]

0.1 [2016-10-7]

The default template storage changed from a database to a file; the save path is ./db/db.json
Installation via composer

imageocr's Issues

PHP Warning: exif_imagetype(): stream does not support seeking in

What is wrong here? Is it an extension problem?
➜ image-ocr php study.php
PHP Warning: exif_imagetype(): stream does not support seeking in /Users/dynamo/PhpstormProjects/hack/vendor/mohuishou/image-ocr/Image.php on line 40

Warning: exif_imagetype(): stream does not support seeking in /Users/dynamo/PhpstormProjects/hack/vendor/mohuishou/image-ocr/Image.php on line 40
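
This warning is typically raised when exif_imagetype() is given a stream it cannot seek in, such as an http:// URL, so a likely cause is that a remote URL was passed as the image path. A possible workaround, sketched below, is to copy the image to a local temporary file first. localImageCopy is a hypothetical helper and not part of image-ocr.

```php
<?php
/**
 * Copy an image from any readable source (URL or local path) to a local
 * temporary file and return that file's path.
 * Hypothetical helper; the caller should unlink() the file when done.
 */
function localImageCopy(string $source): string
{
    $bytes = file_get_contents($source);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read image source: $source");
    }
    $tmp = tempnam(sys_get_temp_dir(), 'ocr_');
    if ($tmp === false || file_put_contents($tmp, $bytes) === false) {
        throw new RuntimeException('Cannot write temporary image file');
    }
    return $tmp;
}
```

Assuming the Image constructor takes a file path as in the README, the call would become new Image(localImageCopy($img_url)), where $img_url is whatever URL was being passed before.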

example/index.php

Fatal error: Class 'Mohuishou\ImageOCR\Example\OCR' not found

Cannot recognize the digit 7

The digit 7 is not recognized; accuracy for the other digits is quite high.

This problem occurs at runtime

Warning: require_once(vendor/autoload.php): failed to open stream: No such file or directory.
This file does not exist after downloading and installing.

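Character segmentation algorithm optimization

Change vertical splitting to connected-component splitting combined with the drop-fall algorithm, in order to segment touching characters.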

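Suggestion

I also need a PHP CAPTCHA recognition project at the moment, and I am testing your code. I found that the rgb2grey method of Image does not seem to produce a greyscale image; is that a problem with my test code? I suggest adding a show method to the tool class that directly displays the image after a given step; for example, ImageTool::show($image->rgb2grey()) would display the greyscale image. Also, this is the code I used to display images while testing:
` //Compute the image width and height
$img_w=count($array_data[0]);
$img_h=count($array_data);

    //Initialize the image
    $img = imagecreatetruecolor($img_w,$img_h);//Create a true-color image
    $white=imagecolorallocate($img, 255, 255, 255);//White
    $black=imagecolorallocate($img, 0, 0, 0);//Black

    //Fill the background with white
    imagefill($img, 0,0, $white);

    //Draw the pixels
    for($h=0;$h<$img_h;$h++){
        for($w=0;$w<$img_w;$w++){
            imagesetpixel($img, $w,$h, $array_data[$h][$w]);
        }
    }

    return $img;`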

Can it recognize CAPTCHAs that are based on an arithmetic expression?

Can it recognize arithmetic CAPTCHAs?

How to use setup.php?

Hello, a question: where can the db database be downloaded? Running setup.php does not work either, and the link http://www.169ol.com/Stream/Code/getCode cannot be opened.

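Wrote an automatic training script, and after training overnight it ended up like this

The script generates CAPTCHA images and submits each CAPTCHA's text to the OCR as the input for learning.
It ran overnight, trained on 97350 images, and ended up like this...
The db file is already 105 MB.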

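Recognizing a CAPTCHA with the example reports "请先获取分割之后的图片" ("Please obtain the segmented images first")

When I use the example to recognize a CAPTCHA, the page reports "请先获取分割之后的图片" ("Please obtain the segmented images first"). How can this be solved?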

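Cannot install, requesting help

I am not familiar with how tp5 works. You can first git clone the project locally, then run composer install, then go into the local folder and run php -S localhost:8080 to see whether the example runs. For actual use, refer to example when writing your own code; the parameters and steps need some tuning for each kind of CAPTCHA, so you will have to complete the specific logic yourself.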

===================================================

Following the instructions above, on my server I ran
git clone https://github.com/mohuishou/ImageOCR.git
then
cd ImageOCR/
then
composer install
All of the above completed normally.
Then
php -S localhost:8080
reported that it was listening:
Listening on http://localhost:8080
Document root is /var/www/html/ocr/ImageOCR

Then
wget http://localhost:8080
returned
404 Not Found

Then
wget http://localhost:8080/example
returned
500 Internal Server Error

Thank you for your help.

    //Enable debug mode
    $this->image_ocr->setDebug(true);
    //Initialization
    $this->image_ocr->setMaxGrey(90);
    $this->image_ocr->setMinGrey(130); // 130
    $this->image_ocr->setStandardWidth(130);
    $this->image_ocr->setStandardHeight(50);
}

The error is as follows:

请先获取分割之后的图片 (Please obtain the segmented images first)
[2] ErrorException in OCR.php line 128
Invalid argument supplied for foreach()