This project forked from nikolai10/doc-homography-generator

Synthetic Dataset Generation: Recovering Homography from Camera Captured Documents

License: MIT License

Python 100.00%

DocHomographyGenerator (Python2.7)

Keywords: OCR, Page Dewarping, Synthetic Dataset Generation, Deep Learning

Synthetic Dataset Generation

This work is based on Recovering Homography from Camera Captured Documents using Convolutional Neural Networks (2017). It provides a synthetic dataset producer/generator (using various data augmentation methods) whose output makes it possible to estimate the corner displacement vectors of a distorted document image.

[Figure: distorted document images]

[Figure: dewarped document images (first row)]

1. Introduction

Photographing documents is a common way to digitize them, owing to the ubiquity of smartphones. In contrast to scans from a flatbed scanner, camera-captured documents require a more sophisticated processing pipeline because the images are perspective-distorted. To restore (dewarp) the original document image, one computes the homography H (a 3x3 matrix) that maps the corner points of the document image to their canonical positions. However, estimating the parameters of the homography matrix directly from a single input image is difficult.

An alternative way of computing H is the four-point method (see findHomography in the OpenCV chapter Camera Calibration and 3D Reconstruction). Given 4 corresponding coplanar points, the distorted image can be unwarped as follows:

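    import cv2

    # Wrapper name and signature are illustrative; the original snippet shows only the body
    def dewarp(img, pts_src, pts_dst, width, height):
        # Estimate the homography from the 4 point correspondences
        h, status = cv2.findHomography(pts_src, pts_dst)

        # Warp the source image into the canonical (dewarped) frame
        return cv2.warpPerspective(src=img, M=h, dsize=(width, height))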
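To make the four-point method concrete, the following is a minimal NumPy sketch of the linear system behind it. It is not the OpenCV implementation, only an illustration under stated assumptions: the bottom-right entry of H is fixed to 1, the helper names (homography_from_4pts, apply_homography) are hypothetical, and the corner coordinates and their ordering are made-up sample values.

```python
import numpy as np


def homography_from_4pts(pts_src, pts_dst):
    """Solve for the 3x3 homography H (with H[2, 2] fixed to 1) that maps
    each of the 4 source points onto its destination point."""
    A, b = [], []
    for (x, y), (u, v) in zip(pts_src, pts_dst):
        # Each correspondence contributes two linear equations in the
        # 8 unknown entries of H.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog.dot(H.T)
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical corners of a distorted page (assumed order: top left,
# top right, bottom right, bottom left) and their canonical positions.
width, height = 200, 300
pts_src = [(30, 40), (180, 20), (190, 280), (10, 260)]
pts_dst = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]

H = homography_from_4pts(pts_src, pts_dst)
mapped = apply_homography(H, pts_src)  # should coincide with pts_dst
```

Fixing H[2, 2] to 1 leaves 8 unknowns, which is why exactly 4 point pairs (2 equations each) are enough, provided no three of the points are collinear.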

2. Methodology

The following figures demonstrate the generation process of some sample images.

Note:

  • In contrast to the original work, this module can also generate images in which document corners lie outside the image boundaries.
  • The parameter mode_p (src/config) determines the ratio between images with included and excluded corners. By default, at least 70% of all generated images have only included corners.
  • Dataset Format (stored as .mat file):
    X = (N, height, width, 3)
    Y = (N, 8)                  # 4 * x,y (top left, top right, ..., bottom left)
  • Besides generating an arbitrarily large, fixed dataset (see DataProducer), DocHomographyGenerator can also be used as a Python generator (see DataGenerator), in which case data is produced batch by batch. This is particularly useful when the dataset is too big to fit into memory. For example, to train a model with a Python generator, one can use the fit_generator() method provided by Keras.
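The batch-by-batch idea and the role of mode_p can be illustrated with a toy generator. This is only a sketch and not the project's DataGenerator: the function names, the 20% out-of-bounds margin and the random placeholder pixels and corners are all assumptions made for illustration. Only the shapes of X and Y follow the dataset format listed above.

```python
import numpy as np


def corners_inside(corners, width, height):
    """True if all 4 corners (flat vector of 8 values: x, y pairs) lie
    within the image boundaries."""
    pts = np.asarray(corners, dtype=float).reshape(4, 2)
    return bool(np.all((pts[:, 0] >= 0) & (pts[:, 0] < width) &
                       (pts[:, 1] >= 0) & (pts[:, 1] < height)))


def toy_batch_generator(batch_size, height, width, mode_p=0.7, seed=0):
    """Yield (X, Y) batches forever, with X = (batch, height, width, 3)
    and Y = (batch, 8).

    Placeholder data only: a real generator would render warped documents
    onto backgrounds instead of drawing random pixels and corners.
    """
    rng = np.random.RandomState(seed)
    while True:
        X = rng.randint(0, 256, size=(batch_size, height, width, 3)).astype(np.uint8)
        Y = np.empty((batch_size, 8), dtype=np.float32)
        for i in range(batch_size):
            if rng.rand() < mode_p:
                # With probability mode_p, keep all corners inside the image.
                xs = rng.uniform(0, width - 1, size=4)
                ys = rng.uniform(0, height - 1, size=4)
            else:
                # Otherwise corners may fall up to 20% outside the image
                # (the margin is an arbitrary choice for this sketch).
                xs = rng.uniform(-0.2 * width, 1.2 * width, size=4)
                ys = rng.uniform(-0.2 * height, 1.2 * height, size=4)
            # Interleave to x0, y0, x1, y1, ... to match the (N, 8) layout.
            Y[i] = np.stack([xs, ys], axis=1).ravel()
        yield X, Y


gen = toy_batch_generator(batch_size=4, height=64, width=48)
X, Y = next(gen)  # one batch; nothing else is held in memory
```

Because each call to next() builds only one batch, memory use stays bounded by the batch size. A generator yielding (inputs, targets) tuples in this way is the kind of object that fit_generator() consumes.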

3. Setup

  1. Download textures or background images (e.g. from DTD or the MIT Indoor Scenes dataset) into res/backgrounds as a flat collection of images (remove intermediate folders).

  2. Place the PDF pages (converted to PNG) into /res/input as a flat collection of images. Note: to convert PDFs to PNGs, one can use the scripts provided in /src/data_utils.py

  3. Install dependencies (using pip)

    pip install -r requirements.txt
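Step 1 asks for a flat folder of background images, whereas downloaded datasets usually arrive in nested category folders. The following standard-library sketch shows one way to flatten such a download. The function name, the extension list and the naming scheme are assumptions made for illustration and are not part of this repository.

```python
import os
import shutil

# Assumed set of accepted extensions; gif is skipped because the file
# structure notes that gif backgrounds are not supported.
IMAGE_EXTS = (".png", ".jpg", ".jpeg")


def flatten_images(src_root, dst_dir):
    """Copy every image found below src_root directly into dst_dir,
    prefixing names with their subfolder path to avoid collisions."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for folder, _, files in os.walk(src_root):
        for name in sorted(files):
            if not name.lower().endswith(IMAGE_EXTS):
                continue
            rel = os.path.relpath(os.path.join(folder, name), src_root)
            flat_name = rel.replace(os.sep, "_")
            shutil.copy(os.path.join(folder, name), os.path.join(dst_dir, flat_name))
            copied.append(flat_name)
    return sorted(copied)


# Example call (paths are placeholders):
# flatten_images("downloads/dtd/images", "res/backgrounds")
```

The destination folder should live outside the source tree, so that the walk does not pick up files it has just copied.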

4. File Structure

res                               
    ├── backgrounds                 # background images (gif not supported)              
    ├── input                       # pdf images 
    ├── output                      # location where to store generated dataset + corners as .mat file
src
    ├── unit_tests                  # unit tests demonstrating functionality
        ├── ...
    ├── dataGenerator.py            # Data Synthesis optimized for Keras fit_generator()
    ├── dataProducer.py             # Data Synthesis using multiprocessing (fixed set)
    ├── dataConfig.py               # all config params of data synthesis
    ├── dataUtils.py                # helper methods       
requirements.txt                    # dependencies (Python2.7) 

5. Usage

# init Augmenter with input, output and backgrounds
augmenter = AugmenterV2(input, output, backgrounds)

# e.g. generate 100 document images
augmenter.augmentDataset_master(max=100, mode_p=0.7)

For more information: see src/unit_tests

License

DocHomographyGenerator_license
