TextRecognitionDataGenerator

A synthetic data generator for text recognition

What is it for?

Generating text image samples to train OCR software. Non-Latin text is now supported!

What do I need to make it work?

I use Arch Linux, so I cannot tell yet whether it works on Windows.

Python 3.X
OpenCV 3.2 (It probably works with 2.4)
Pillow
Numpy
Requests
BeautifulSoup
tqdm

Alternatively, install everything with pip install -r requirements.txt.

New

  • Specify text color range using -tc min,max
  • Explicit alignment when using -al with fixed width (0: Left, 1: Center, 2: Right)
  • Fixed width using -wd
  • Generate random strings with letters, numbers and symbols (Thank you @FHainzl)
  • Save the labels in a file instead of in the file name (Thank you @FHainzl)
  • Add support for Simplified and Traditional Chinese

How does it work?

python run.py -w 5 -f 64

You get 1000 randomly generated images with random text on them like:

[Images: five example outputs]

What if you want random skewing? Add -k and -rk (python run.py -w 5 -f 64 -k 5 -rk):

[Images: five skewed example outputs]
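To get a feel for what random skewing does to a sample, here is a minimal Pillow sketch. It is not the code used by run.py: the function name random_skew and the choice of a uniform angle in [-max_angle, max_angle] are assumptions made for illustration.

```python
import random

from PIL import Image, ImageDraw


def random_skew(image, max_angle, rng=random):
    """Rotate `image` by a random angle (degrees) in [-max_angle, max_angle]."""
    angle = rng.uniform(-max_angle, max_angle)
    # expand=True grows the canvas so the rotated corners are not cropped;
    # fillcolor paints the newly exposed corners white.
    return image.rotate(angle, expand=True, fillcolor="white"), angle


# Build a small stand-in sample with Pillow's default font.
sample = Image.new("RGB", (200, 64), "white")
ImageDraw.Draw(sample).text((10, 20), "hello world", fill="black")

skewed, angle = random_skew(sample, max_angle=5)
print(f"rotated by {angle:.2f} degrees, new size {skewed.size}")
```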

But scanned documents usually aren't that clear, are they? Add -bl and -rbl to get Gaussian blur on the generated image with a user-defined radius (here 0, 1, 2, 4):

[Images: example outputs with blur radius 0, 1, 2 and 4]
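The effect of the four radii can be reproduced outside the tool with Pillow's ImageFilter.GaussianBlur. This is only a sketch of the technique, not the implementation in run.py; the helper name blur and the stand-in sample image are assumptions.

```python
from PIL import Image, ImageDraw, ImageFilter


def blur(image, radius):
    """Return a Gaussian-blurred copy of `image` with the given radius."""
    return image.filter(ImageFilter.GaussianBlur(radius=radius))


# Grayscale stand-in sample: black text on a white canvas.
sample = Image.new("L", (200, 64), 255)
ImageDraw.Draw(sample).text((10, 20), "hello world", fill=0)

# One blurred variant per radius mentioned above.
blurred = {radius: blur(sample, radius) for radius in (0, 1, 2, 4)}
for radius, image in blurred.items():
    print(radius, image.size, image.getextrema())
```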

Maybe you want another background? Add -b to select one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2) or picture (3).

[Images: one example output per background type]

When using the picture background (3), a picture from the pictures/ folder is randomly selected and the text is written on it.
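For the Gaussian noise background (0), a comparable canvas can be produced with NumPy and Pillow. The sketch below is an approximation under stated assumptions: the function name and the mean and std defaults are illustrative values, not parameters taken from run.py.

```python
import numpy as np
from PIL import Image


def gaussian_noise_background(width, height, mean=235.0, std=10.0, seed=None):
    """Return a grayscale image filled with Gaussian noise around `mean`."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(mean, std, size=(height, width))
    # Clip to the valid 8-bit range before converting to pixel values.
    pixels = np.clip(noise, 0, 255).astype(np.uint8)
    return Image.fromarray(pixels)


background = gaussian_noise_background(200, 64, seed=0)
print(background.size, background.mode)
```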

Or maybe you are working on an OCR for handwritten text? Add -hw! (Experimental)

[Images: five handwritten-style example outputs]

It uses a TensorFlow model trained using this excellent project by Grzego.

The project does not require TensorFlow to run if you aren't using this feature.

You can also add distortion to the generated text with -d and -do:

[Images: three distorted example outputs]

The text is chosen at random in a dictionary file (that can be found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
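The sampling and naming scheme described above can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes a dictionary file with one entry per line and words joined by single spaces, which may differ from what run.py actually does.

```python
import random


def load_words(path):
    """Read a dictionary file, assuming one entry per line (an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def random_text(words, count, rng=random):
    """Join `count` randomly chosen words into one string."""
    return " ".join(rng.choice(words) for _ in range(count))


def output_name(text, index, extension="jpg"):
    """Build a file name following the [text]_[index].jpg pattern."""
    return f"{text}_{index}.{extension}"


# In-memory stand-in for a dictionary file from the dicts folder.
words = ["alpha", "beta", "gamma", "delta"]
text = random_text(words, count=5)
print(output_name(text, 0))
```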

There are many parameters that you can tune to get the results you want, so I recommend checking out python run.py -h for more information.

How to create images with Chinese (both simplified and traditional) text

It is simple! Just do python run.py -l cn -c 1000 -w 5!

Unfortunately I do not speak Chinese so you may have to edit texts/cn.txt to include some meaningful words instead of random glyphs.

Here are examples of what I could make with it:

Traditional:

[Image: Traditional Chinese example output]

Simplified:

[Image: Simplified Chinese example output]

Can I add my own font?

Yes, the script picks a font at random from the fonts directory.

fonts/latin: English, French, Spanish, German
fonts/cn: Chinese

Simply add / remove fonts until you get the desired output.
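Picking a font at random from a directory amounts to listing the .ttf files and choosing one. The sketch below illustrates that idea only; list_fonts and pick_font are hypothetical names, not functions from run.py.

```python
import os
import random


def list_fonts(font_dir):
    """Return the paths of all .ttf files directly inside `font_dir`."""
    return sorted(
        os.path.join(font_dir, name)
        for name in os.listdir(font_dir)
        if name.lower().endswith(".ttf")
    )


def pick_font(font_dir, rng=random):
    """Choose one font path at random, failing loudly if none exist."""
    fonts = list_fonts(font_dir)
    if not fonts:
        raise FileNotFoundError(f"no .ttf fonts found in {font_dir}")
    return rng.choice(fonts)


# Only runs when executed from a checkout that has the fonts/latin folder.
if os.path.isdir("fonts/latin"):
    print(pick_font("fonts/latin"))
```

Because every file in the folder is equally likely to be chosen, adding or removing fonts directly changes how often each typeface appears in the output.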

If you want to add a new non-Latin language, the amount of work is minimal.

  1. Create a new folder in the fonts directory, named with your language's two-letter code
  2. Add a .ttf font in it
  3. Edit run.py to add an if statement in load_fonts()
  4. Add a text file in dicts with the same two-letter code
  5. Run the tool as you normally would, but add -l with your two-letter code
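As a rough illustration of step 3, the extra branch might look like the sketch below. This is a hypothetical stand-in: the real load_fonts() in run.py may have a different signature and body, the fonts_root parameter is an assumption, and "xx" is a placeholder for your own two-letter code.

```python
import glob
import os


def load_fonts(lang, fonts_root="fonts"):
    """Hypothetical stand-in for load_fonts(); not the code from run.py."""
    if lang == "cn":
        folder = os.path.join(fonts_root, "cn")
    elif lang == "xx":
        # New branch for the added language; "xx" is a placeholder code.
        folder = os.path.join(fonts_root, "xx")
    else:
        folder = os.path.join(fonts_root, "latin")
    # Only .ttf files are collected; a missing folder yields an empty list.
    return sorted(glob.glob(os.path.join(folder, "*.ttf")))


print(load_fonts("xx"))
```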

It only supports .ttf for now.

Benchmarks

  • Intel Core i7-4710HQ @ 2.50 GHz + SSD (-c 1000 -w 1)
    • -t 1 : 363 img/s
    • -t 2 : 694 img/s
    • -t 4 : 1300 img/s
    • -t 8 : 1500 img/s
  • AMD Ryzen 7 1700 @ 4.0 GHz + SSD (-c 1000 -w 1)
    • -t 1 : 558 img/s
    • -t 2 : 1045 img/s
    • -t 4 : 2107 img/s
    • -t 8 : 3297 img/s
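To read these figures as scaling behavior, the speedup relative to -t 1 can be computed directly. The snippet below uses only the numbers listed above and assumes that -t sets the number of parallel workers; the scaling helper is an illustrative name.

```python
# Throughput in img/s, keyed by the -t value, copied from the list above.
benchmarks = {
    "Intel Core i7-4710HQ": {1: 363, 2: 694, 4: 1300, 8: 1500},
    "AMD Ryzen 7 1700": {1: 558, 2: 1045, 4: 2107, 8: 3297},
}


def scaling(rates):
    """Map each -t value to (speedup over -t 1, speedup divided by -t)."""
    base = rates[1]
    return {t: (rate / base, rate / base / t) for t, rate in rates.items()}


for cpu, rates in benchmarks.items():
    for threads, (speedup, efficiency) in scaling(rates).items():
        print(f"{cpu} -t {threads}: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

On these numbers the i7 reaches roughly 4.1x at -t 8 and the Ryzen roughly 5.9x, so throughput stops scaling linearly well before eight workers on the i7.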

Contributing

  1. Create an issue describing the feature you'll be working on
  2. Code said feature
  3. Create a pull request

Feature request & issues

If anything is missing, unclear, or simply not working, open an issue on the repository.

What is left to do?

  • Better background generation
  • Better handwritten text generation
  • More customization parameters (mostly regarding background)
