songdejia / east

This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

License: MIT License

Shell 0.01% Python 99.98% Makefile 0.02%
deeplearning textdetection east pytorch ocr icdar

east's Introduction

EAST: An Efficient and Accurate Scene Text Detector

Description:

This version will be updated soon, so please keep an eye on this work. The motivation for this version is to build an easy-to-train model. It automatically updates best_model by comparing the current hmean with the previous one. It also makes the evaluation info for every sample easy to inspect.

  • 1. train
  • 2. predict
  • 3. compress
  • 4. compute Hmean (if Hmean is higher than before, update best_weight.pkl)
  • 5. visualization (blue, green, red)
  • 6. multi-scale test (to be updated soon) and multi-scale visualization (with score and scales)
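
Step 4 can be pictured with the minimal sketch below. It is an illustration, not the repository's code: the names `hmean` and `should_update_best` are hypothetical, the precision/recall pairs are made up, and hmean is taken to be the harmonic mean of precision and recall.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def should_update_best(current: float, best_so_far: float) -> bool:
    """Return True when the current hmean beats the best one seen so far."""
    return current > best_so_far


best = 0.0
# Made-up (precision, recall) pairs standing in for successive evaluations.
for precision, recall in [(0.5, 0.5), (0.8, 0.6), (0.7, 0.5)]:
    score = hmean(precision, recall)
    if should_update_best(score, best):
        best = score  # this is the point where best_weight.pkl would be overwritten
print(round(best, 4))
```

Using a strict `>` comparison keeps the earlier checkpoint when two evaluations tie.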

Thanks

This version is ported from argman/EAST, from TensorFlow to PyTorch.

Check On Website

If you are not confident in the results of our program, you can submit submit.zip on the website, where you can see the result for every image.

Performance

  • visualization: right -- green || wrong -- red || miss -- blue

  • recall/precision/hmean for every test image

Introduction

This is a PyTorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector. The features are summarized below:

  • Only RBOX part is implemented.
  • A fast Locality-Aware NMS in C++ provided by the paper's author (g++/gcc version 6.0 or later should work).
  • Evaluation: see here for the detailed results.
  • Differences from original paper
    • Use ResNet-50 rather than PVANET
    • Use dice loss (optimize IoU of segmentation) rather than balanced cross entropy
    • Use linear learning rate decay rather than staged learning rate decay
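
For readers unfamiliar with dice loss, here is a minimal sketch on flat lists of per-pixel scores. It is a generic formulation, not a copy of this repository's loss: the function name `dice_loss` and the smoothing term `eps` are assumptions, and the real implementation operates on tensors and may add masking or scaling.

```python
def dice_loss(pred, target, eps=1e-5):
    """1 - 2*|P∩G| / (|P| + |G|) for predicted scores and ground-truth labels.

    `pred` and `target` are equal-length sequences of values in [0, 1];
    `eps` (an assumed smoothing constant) avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) + eps
    return 1.0 - 2.0 * intersection / union


perfect = dice_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])   # close to 0
disjoint = dice_loss([1.0, 0.0], [0.0, 1.0])            # exactly 1.0
print(perfect, disjoint)
```

Because the loss depends on the overlap ratio rather than on per-pixel counts, it is less sensitive to the imbalance between text and background pixels than plain cross entropy.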

Thanks to the author (@zxytim) for his help! Please cite his paper if you find this useful.

Contents

  1. Installation
  2. Download
  3. Prepare dataset/pretrain
  4. Test
  5. Train
  6. Examples

Installation

  1. Any PyTorch version > 0.4.0 should work.

Download

  1. A pretrained model is not provided for now. The website is being updated, so please check back.

Prepare dataset/pretrain weight

[1]. dataset (you need to prepare the dataset for training and testing). Suggestion: you can soft-link your images to root_to_this_program/dataset/train/img/*.jpg

  • -- train ./dataset/train/img/img_###.jpg ./dataset/train/gt/img_###.txt (you need to rename the files)
  • -- test ./data/test/img_###.jpg (images only)
  • -- gt.zip ./result/gt.zip (the ICDAR15 gt.zip is available on the website)

** Note: you can download the dataset here

[2]. pretrained

  • In config.py, set resume to True and set checkpoint to path/to/weight/file
  • I will provide pretrained weights soon

[3]. check GPUs and CPUs. You can use the following command to check which GPUs are available; this is for training

watch -n 0.1 nvidia-smi

If, for example, you then see that GPUs 2 and 3 are available, set gpu_ids = [0,1] and gpu = 2 in config.py, and set CUDA_VISIBLE_DEVICES=2,3 in run.sh.
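
The ids in config.py start at 0 even though the physical GPUs are 2 and 3 because CUDA renumbers the devices listed in CUDA_VISIBLE_DEVICES from zero inside the process. The sketch below only illustrates that renumbering; `visible_device_map` is a hypothetical helper that is not part of this repository, and it assumes the variable holds plain integer ids.

```python
def visible_device_map(cuda_visible_devices: str) -> dict:
    """Map in-process device index -> physical GPU id.

    Assumes a comma-separated list of integer ids such as "2,3".
    """
    physical = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    return dict(enumerate(physical))


mapping = visible_device_map("2,3")
print(mapping)  # {0: 2, 1: 3}
```

With CUDA_VISIBLE_DEVICES=2,3, in-process device 0 is physical GPU 2 and device 1 is physical GPU 3.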

Train

If you want to train the model, you should provide the dataset path in config.py and run

sh run.sh

** Note: you should modify run.sh to specify your GPU id

If you have more than one GPU, you can pass the GPU ids to gpu_list (like gpu_list=0,1,2,3) in config.py

** Note: you should rename the ICDAR2015 gt text files to img_*.txt instead of gt_img_*.txt (or you can change the code in icdar.py), and some extra characters should be removed from the files. See the examples in training_samples/
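
The renaming described in the note can be scripted. The sketch below is one possible approach, not part of this repository: `normalize_gt_files` is a hypothetical helper, and it assumes the "extra characters" are a UTF-8 byte-order mark at the start of each file. If your files contain other stray characters, adapt the cleaning step. The demo runs on a temporary directory with a made-up annotation line.

```python
import tempfile
from pathlib import Path


def normalize_gt_files(gt_dir: Path) -> int:
    """Rename gt_img_*.txt to img_*.txt and drop a leading UTF-8 BOM, if any."""
    renamed = 0
    for src in sorted(gt_dir.glob("gt_img_*.txt")):
        # "utf-8-sig" decodes UTF-8 and strips a leading BOM when present.
        text = src.read_text(encoding="utf-8-sig")
        dst = src.with_name(src.name[len("gt_"):])
        dst.write_text(text, encoding="utf-8")
        src.unlink()
        renamed += 1
    return renamed


# Demo on a throwaway directory; the annotation line is made up.
with tempfile.TemporaryDirectory() as tmp:
    gt_dir = Path(tmp)
    (gt_dir / "gt_img_1.txt").write_bytes(b"\xef\xbb\xbf1,2,3,4,5,6,7,8,###\n")
    count = normalize_gt_files(gt_dir)
    cleaned = (gt_dir / "img_1.txt").read_text(encoding="utf-8")
    print(count, repr(cleaned))
```

Run it on a copy of the ground-truth directory first, since the original gt_img_*.txt files are deleted after conversion.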

Test

By default, evaluation is integrated into the training process. If you want to run evaluation independently, you will need to set that up yourself. Feel free to contact me with any questions.

Examples

Here are some test examples on ICDAR2015. Enjoy the beautiful text boxes! (Images: image_1, image_2, image_3, image_4, image_5)

east's Issues

Train East.pytorch

Hello, I tried to train on ICDAR with your PyTorch version of EAST for nearly 8,000 epochs, but the results are particularly bad. Are there any tricks to training? I have tried other PyTorch versions of EAST and found that their results are not good either. I have not been able to find the reason so far. Can you give me some advice?

compile error (lanms)

g++: error: unrecognized command line option ‘-fno-plt’
Makefile:10: recipe for target 'adaptor.so' failed
make: *** [adaptor.so] Error 1

Not able to find gt.zip on the given website

Hi,

I am trying to make the EAST algorithm work and I am training on the ICDAR 2015 dataset. However, I am not able to find the gt.zip file, and I also do not understand whether it only contains the ground truths for the train/test images. Can someone shed some light on this and help me understand it?

My code breaks at the point where it says "There is no gt.zip" (obviously), as I don't have the zip and don't know how to make one.

where is the run.py

I have downloaded your code and changed the dataset path, but I couldn't find run.py. Your readme.md says the following:
If you want to train the model, you should provide the dataset path in config.py and run

sh run.py

trained model

Dear all,

is it possible to have the trained model?

Thank you,

Cheers

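Can I train on multiple GPUs at the same time?

I want to train on multiple GPUs at once. How should I modify the code?
from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(east_network, gpus=4
After adding these two lines I get an error. Thanks.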

Pretrained model

Hi, great repo!
It would be very helpful if a pretrained model were provided along with the code. When will such a model be available?

Thanks in advance,
Arseny

ImportError: bad magic number in 'geo_map_cython_lib': b'\x03\xf3\r\n'

The details of the error are:
Traceback (most recent call last):
  File "/home/user/east/EAST-pytorch/main.py", line 11, in <module>
    from data_utils import custom_dset, collate_fn
  File "/home/user/east/EAST-pytorch/data_utils.py", line 18, in <module>
    from geo_map_cython_lib import gen_geo_map
I searched a lot but could not solve this error. How can I fix it so the code runs correctly? Thanks.
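
A note on the bytes in the error message: b'\x03\xf3\r\n' is the magic number that CPython 2.7 writes at the start of compiled bytecode files, which suggests the compiled geo_map_cython_lib file was produced by Python 2.7 and is being imported by a different interpreter. The sketch below shows how such a header can be checked; `describe_pyc_header` is a hypothetical helper, and rebuilding the module with the interpreter you actually run is a likely remedy rather than a confirmed fix for this repository.

```python
import importlib.util

PY27_MAGIC = b"\x03\xf3\r\n"  # magic number written by CPython 2.7


def describe_pyc_header(header: bytes) -> str:
    """Classify the first four bytes of a compiled Python file."""
    magic = header[:4]
    if magic == importlib.util.MAGIC_NUMBER:
        return "matches the running interpreter"
    if magic == PY27_MAGIC:
        return "compiled by Python 2.7"
    return "compiled by a different Python version"


# The bytes reported in the ImportError above.
print(describe_pyc_header(b"\x03\xf3\r\n"))
```

To inspect a real file, read its first four bytes in binary mode and pass them to the function.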

TypeError: string indices must be integers

When I run run.sh, I hit a problem:
hmean.py, line 29, in compute_hmean
recall = resDict['method']['recall']
TypeError: string indices must be integers
Can someone help me?

Network parameter initialization

On line 10 of utils/init.py, classname.find('conv') should be replaced with classname.find('Conv2d')
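
The reason this matters is that str.find is case-sensitive, so a lowercase 'conv' never matches a class named Conv2d. The sketch below demonstrates this with stand-in classes. `Conv2d` and `BatchNorm2d` here are empty placeholders (assumptions made so the example runs without PyTorch), and `matches` is a hypothetical helper mirroring the usual `module.__class__.__name__` check.

```python
class Conv2d:
    """Empty stand-in for a convolution layer class (placeholder, not torch.nn.Conv2d)."""


class BatchNorm2d:
    """Empty stand-in for a batch-norm layer class."""


def matches(module, needle: str) -> bool:
    """True when the module's class name contains `needle` (case-sensitive)."""
    classname = module.__class__.__name__
    return classname.find(needle) != -1


print(matches(Conv2d(), "conv"))         # False: lowercase 'conv' is not in 'Conv2d'
print(matches(Conv2d(), "Conv2d"))       # True
print(matches(BatchNorm2d(), "Conv2d"))  # False
```

With the lowercase needle the initialization branch is silently skipped for every convolution layer, which would leave those layers at their default initialization.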

License uses

I notice that your repository (especially data_util.py and the lanms library) is based on argman/EAST, which is using GPLv3 license.

The GPLv3 license is NOT compatible with the MIT license you are currently using. Please switch to a license that is compatible with GPLv3, or replace the relevant libraries in your repository, to avoid any copyright issues.

Shape of the Output F-Score

Hi,
The F-Score output by the EAST model is of the shape (W/4, H/4) where W and H are width and height of the input image respectively. Shouldn't the F-Score be a per pixel score, and so shouldn't its dimension be (W, H) instead?

(I know this repo works so perhaps there is a big gap in my understanding of the code. Kindly help.)

Thanks

Exception continueException in getitem

After pretraining, I wanted to train the model on my own dataset, and I prepared my training set according to the requirements in the README. But when I run run.py, these problems occur:
EAST <==> Prepare <==> Network <==> Done
Exception continue
Exception in getitem, and choose another index:133
Exception continue

So how could I solve this problem? Thank you.

How to label an image properly

I know we have to take the blank space between one word and the next into consideration when labelling an image. If there are two separate words, how do I decide whether to put them into one label or to label them separately? And when it comes to a whole sentence, how should I label an image with distortions? Should I split the sentence into several parts and label each part separately as a quadrangle? Thanks a million :)

nan during training.

Hi @songdejia, thanks for trying to port EAST from tensorflow. But while trying to train this model on COCO 2014 or Oxford syn text, I get nan during training. Any ideas?

Please see below training Log:

Cross point does not exist
point dist to line raise Exception
point dist to line raise Exception
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
point dist to line raise Exception
point dist to line raise Exception
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
point dist to line raise Exception
point dist to line raise Exception
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
point dist to line raise Exception
point dist to line raise Exception
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Cross point does not exist
Exception continue
Exception in getitem, and choose another index:4393
EAST <==> TRAIN <==> Epoch: [0][1/227] Loss 0.0231 Avg Loss 0.0250)

EAST <==> TRAIN <==> Epoch: [0][2/227] Loss 0.0282 Avg Loss 0.0260)

EAST <==> TRAIN <==> Epoch: [0][3/227] Loss 0.0313 Avg Loss 0.0273)

EAST <==> TRAIN <==> Epoch: [0][4/227] Loss 0.0271 Avg Loss 0.0273)

EAST <==> TRAIN <==> Epoch: [0][5/227] Loss 0.0206 Avg Loss 0.0262)

EAST <==> TRAIN <==> Epoch: [0][6/227] Loss 0.0300 Avg Loss 0.0267)

EAST <==> TRAIN <==> Epoch: [0][7/227] Loss 0.0239 Avg Loss 0.0264)

EAST <==> TRAIN <==> Epoch: [0][8/227] Loss 0.0271 Avg Loss 0.0265)

EAST <==> TRAIN <==> Epoch: [0][9/227] Loss 0.0284 Avg Loss 0.0266)

EAST <==> TRAIN <==> Epoch: [0][10/227] Loss 0.0197 Avg Loss 0.0260)

EAST <==> TRAIN <==> Epoch: [0][11/227] Loss nan Avg Loss nan)

EAST <==> TRAIN <==> Epoch: [0][12/227] Loss nan Avg Loss nan)

Lack of L2 regularization

Hi, I read your code and the original TF version. I trained with your code and found I get 0.4 hmean on the IC2015 test set. I also found that in your implementation the network lacks L2 regularization, while the TF version adds a 1e-5 L2 loss to the total loss.
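
To make the point concrete, an L2 term of this kind can be sketched as below. This is a generic illustration under assumptions: `l2_penalty` and `total_loss` are hypothetical names, the weights and task loss are made-up numbers, and the exact scaling used by the TF version (for example, whether the sum of squares is halved) is not confirmed here. In PyTorch a comparable effect is commonly obtained through the `weight_decay` argument of the optimizer.

```python
def l2_penalty(weights, coeff=1e-5):
    """coeff times the sum of squared weights (scaling convention is an assumption)."""
    return coeff * sum(w * w for w in weights)


def total_loss(task_loss, weights, coeff=1e-5):
    """Task loss plus the L2 regularization term."""
    return task_loss + l2_penalty(weights, coeff)


# Made-up numbers: task loss 0.02 and three weights.
print(total_loss(0.02, [1.0, -2.0, 3.0]))
```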

python version

Could you please tell me what python version you use in this repo?

Loss is always 0.01

Hello! I am using my own dataset, with the data format x1, y1, x2, y2, x3, y3, x4, y4, "###"
During training the loss is always 0.01, as shown below:
EAST <==> TRAIN <==> Epoch: [0][372/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][373/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][374/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][375/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][376/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][377/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][378/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][379/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][380/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][381/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][382/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][383/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][384/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][385/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][386/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][387/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][388/430] Loss 0.0100 Avg Loss 0.0100)

EAST <==> TRAIN <==> Epoch: [0][389/430] Loss 0.0100 Avg Loss 0

Implementation of QUAD part of the paper

How do you modify the geometry map generation for the QUAD part of the paper? What is meant by the statement "For the QUAD ground truth, the value of each pixel with positive score in the 8-channel geometry map is its coordinate shift from the 4 vertices of the quadrangle"? I would like to know all the modifications that need to be made in the code to implement the QUAD part. @songdejia
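
One reading of the quoted sentence is that each positive pixel stores eight numbers: the x and y offsets to each of the four quadrangle vertices. The sketch below illustrates only that reading. `quad_offsets` is a hypothetical helper, the sign convention (vertex minus pixel) and the vertex order are assumptions, and it does not show the changes this repository would need.

```python
def quad_offsets(pixel, quad):
    """Return (dx1, dy1, ..., dx4, dy4) for one positive pixel.

    `pixel` is (x, y); `quad` is a list of four (x, y) vertices.
    Offsets are computed as vertex minus pixel (assumed convention).
    """
    px, py = pixel
    offsets = []
    for vx, vy in quad:
        offsets.extend((vx - px, vy - py))
    return offsets


# Made-up quadrangle and a pixel inside it.
quad = [(10, 10), (50, 12), (48, 30), (8, 28)]
print(quad_offsets((20, 20), quad))  # [-10, -10, 30, -8, 28, 10, -12, 8]
```

Applying this to every pixel inside the (shrunk) text region would fill the 8 channels of the geometry map, one channel per returned value.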

Where to get the gt.zip

I found that the code needs gt.zip to measure accuracy, but I can't find gt.zip at the link you gave. Could you please tell me how to get it? Thanks.

Predict problem

How can I run prediction on a single image, and can you provide a trained model like the TensorFlow version does?

UnsupportedOperation: not writable

runfile('C:/Users/陈/Desktop/EAST-master/run_demo_server.py', wdir='C:/Users/陈/Desktop/EAST-master')
Traceback (most recent call last):

File "", line 1, in
runfile('C:/Users/陈/Desktop/EAST-master/run_demo_server.py', wdir='C:/Users/陈/Desktop/EAST-master')

File "D:\huanjingdajian\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "D:\huanjingdajian\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/陈/Desktop/EAST-master/run_demo_server.py", line 226, in
main()

File "C:/Users/陈/Desktop/EAST-master/run_demo_server.py", line 223, in main
app.run('0.0.0.0', args.port)

File "D:\huanjingdajian\lib\site-packages\flask\app.py", line 938, in run
cli.show_server_banner(self.env, self.debug, self.name, False)

File "D:\huanjingdajian\lib\site-packages\flask\cli.py", line 629, in show_server_banner
click.echo(message)

File "D:\huanjingdajian\lib\site-packages\click\utils.py", line 259, in echo
file.write(message)

UnsupportedOperation: not writable
