onexuan / advancedeast

This project forked from huoyijie/advancedeast

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST, with significant improvements that make long text predictions more accurate.

License: MIT License

Python 100.00%

advancedeast's Introduction

AdvancedEAST

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST: An Efficient and Accurate Scene Text Detector, with significant improvements that make long text predictions more accurate. If this project is helpful to you, a star is welcome. If you run into any problems, please contact me.

advantages

  • written in Keras, easy to read and run
  • based on EAST, an advanced text detection algorithm
  • easy to train the model
  • significant improvements make long text predictions more accurate (see the 'demo results' section below, and pay attention to the activation image, which starts with yellow grids and ends with green grids)

In my experiments, AdvancedEast obtained much better prediction accuracy than East, especially on long text. East calculates the final vertex coordinates as a weighted mean of the vertex coordinates predicted by all pixels, so it is too difficult to predict the 2 vertices on the far side of the quadrangle. See the East limitations, taken from the original paper, below.

East limitations
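
To make the limitation concrete, here is a minimal sketch of the weighted-mean step described above. It is not taken from this repository or from the East code: the function name `weighted_mean_vertices` and the data layout (one 4-vertex prediction plus one score per pixel) are assumptions chosen for illustration.

```python
def weighted_mean_vertices(vertex_preds, scores):
    """Combine per-pixel quadrangle predictions into one quadrangle.

    vertex_preds: one entry per pixel, each a list of 4 (x, y) vertices
                  (hypothetical layout, for illustration only).
    scores:       one non-negative weight per pixel.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    merged = []
    for v in range(4):
        x = sum(s * pred[v][0] for pred, s in zip(vertex_preds, scores)) / total
        y = sum(s * pred[v][1] for pred, s in zip(vertex_preds, scores)) / total
        merged.append((x, y))
    return merged


# Two pixels inside one long text line. The pixel near the left edge
# misjudges the far (right-hand) vertices, and the mean inherits that error.
near_right = [(0, 0), (100, 0), (100, 10), (0, 10)]
near_left = [(0, 0), (80, 0), (80, 10), (0, 10)]
quad = weighted_mean_vertices([near_right, near_left], [1.0, 1.0])
```

With equal weights, the right-hand vertices land at x = 90 rather than 100: every pixel contributes to every vertex, so a poor estimate from a pixel far from an edge shifts that edge.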

project files

  • config file: cfg.py, control parameters
  • pre-process data: preprocess.py, resize images
  • label data: label.py, produce label info
  • define network: network.py
  • define loss function: losses.py
  • execute training: advanced_east.py and data_generator.py
  • predict: predict.py and nms.py

network arch

  • AdvancedEast

AdvancedEast network arch

  • East

East network arch

setup

  • python 3.6.3+
  • tensorflow-gpu 1.5.0+(or tensorflow 1.5.0+)
  • keras 2.1.4+
  • numpy 1.14.1+
  • tqdm 4.19.7+
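
A quick way to confirm that an environment meets these minimums is sketched below. This helper is not part of the project; the names `MINIMUMS`, `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the sketch assumes each package exposes a `__version__` attribute.

```python
import importlib
import re

# Minimum versions copied from the setup list above.
MINIMUMS = {
    "tensorflow": "1.5.0",
    "keras": "2.1.4",
    "numpy": "1.14.1",
    "tqdm": "4.19.7",
}


def version_tuple(version):
    """Turn '1.14.1' or '2.1.4rc1' into a tuple of integers, padded to 3 parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numerically, so that '1.9.0' is correctly older than '1.14.1'."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", None)
        ok = installed is not None and meets_minimum(installed, minimum)
        report[name] = (installed, ok)
    return report
```

Calling `check_environment()` imports each package, so it can take a few seconds when TensorFlow is installed.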

training

  • tianchi ICPR dataset download link: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA password: ye9y

  • prepare training data: make a data root dir (icpr), then copy the images and the txt files into it. For data format details, refer to 'ICPR MTWI 2018 Challenge 2: Text Detection in Web Images', Link

  • modify config params in cfg.py; see the default values.

  • python preprocess.py, resize images to 256*256, 384*384, 512*512, 640*640, 736*736; training at each size in turn could speed up the training process.

  • python label.py

  • python advanced_east.py

  • python predict.py -p demo/001.png, to predict

  • pretrained model download link: https://pan.baidu.com/s/11rNLfNJ3bI4d500--uqR4A password: khk1
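
When images are resized to the square training sizes listed in the preprocess step, the quadrangle labels have to be scaled by the same factors. The sketch below illustrates that arithmetic only; it is not the code in preprocess.py. The names `TRAIN_SIZES` and `scale_quad` are hypothetical, and it assumes the image is stretched to a square without preserving its aspect ratio, which may differ from what the project does.

```python
# Square training sizes listed in the preprocess step above.
TRAIN_SIZES = (256, 384, 512, 640, 736)


def scale_quad(quad, orig_size, target_size):
    """Scale quadrangle vertices to match an image resized to a square.

    quad:        list of (x, y) vertices in original pixel coordinates.
    orig_size:   (width, height) of the original image.
    target_size: side length of the resized square image.

    Assumption: the resize stretches each axis independently.
    """
    orig_w, orig_h = orig_size
    sx = target_size / orig_w
    sy = target_size / orig_h
    return [(x * sx, y * sy) for x, y in quad]


# A text box in a 200x100 image, rescaled for every training size.
box = [(0, 0), (100, 0), (100, 50), (0, 50)]
scaled = {size: scale_quad(box, (200, 100), size) for size in TRAIN_SIZES}
```

Each entry of `scaled` holds the same box expressed in the coordinate system of one resized image, so labels stay aligned with the pixels at every size.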

demo results

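001 original image, 001 activation map, 001 prediction

004 original image, 004 activation map, 004 prediction

005 original image, 005 activation map, 005 prediction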

  • compared with East based on VGG16

As you can see, although the text area prediction is very accurate, the vertex coordinates are not accurate enough.

001 activation map, 001 prediction

License

The codes are released under the MIT License.

references

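I have only recently started working with deep learning, so my understanding of some areas is not yet deep and there may be mistakes. Corrections are welcome :)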
