
crfocr's Introduction

Usage

To train a model:

python ocr_train.py [-h] -train TRAIN -model MODEL [-alphabet ALPHABET]
                    [-maxiter MAXITER] [-log LOG]

required arguments:
  -train TRAIN        Regex pattern of training files
  -model MODEL        Directory of model files

optional arguments:
  -h, --help          show this help message and exit
  -alphabet ALPHABET  String of all possible character labels, default = 'etainoshrd'
  -maxiter MAXITER    Maximum iteration for L-BFGS optimization, default = 1000
  -log LOG            Print log to stdout if 1, default = 1

For example, using the provided training data in the data directory:

python ocr_train.py -train data/train\* -model model -alphabet etainoshrd -maxiter 50 -log 0 

To test a model:

python ocr_test.py [-h] -test TEST -model MODEL [-alphabet ALPHABET]
                   [-tag TAG] [-score SCORE]

required arguments:
  -test TEST          Regex pattern of test files
  -model MODEL        Directory of model files

optional arguments:
  -h, --help          show this help message and exit
  -alphabet ALPHABET  String of all possible character labels, default = 'etainoshrd'
  -tag TAG            Print predicted labels to stdout if 1, default = 0
  -score SCORE        Calculate and print prediction accuracy to stdout if 1, default = 1

To run the test using the provided test data and the pretrained model:

python ocr_test.py -test data/test\* -model model

To load a model and use it to make predictions:

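import crf
import string
import util

# Load the state and transition parameters from a model directory
theta = util.read_model('model_directory')
# Load every data case whose filename matches the pattern
data = util.read_data('filename_pattern')
# Labels the model can predict; here, all lowercase letters
alphabet = string.ascii_lowercase

# Predict the labels for the loaded data
predictions = crf.predict(theta, data, alphabet)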

Alternatively, see example.py for more details.
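This README does not describe how crf.predict decodes a word internally. As a rough illustration only, the sketch below shows Viterbi decoding, a common way to pick the best label sequence in a linear-chain CRF. The function name viterbi and the layout of state_scores and transition_scores are assumptions made for this example; they are not the project's API or its model file format.

```python
def viterbi(state_scores, transition_scores, alphabet):
    """Return the highest-scoring label string for one word.

    state_scores[t][i]      : score of label i at character position t
    transition_scores[i][j] : score of label i being followed by label j
    (Both layouts are assumptions for this sketch.)
    """
    n_labels = len(alphabet)
    # best[i] = score of the best path so far that ends in label i
    best = list(state_scores[0])
    backpointers = []
    for t in range(1, len(state_scores)):
        new_best = []
        pointers = []
        for j in range(n_labels):
            # Pick the previous label that maximizes the path score into j
            i_best = max(range(n_labels),
                         key=lambda i: best[i] + transition_scores[i][j])
            new_best.append(best[i_best] + transition_scores[i_best][j]
                            + state_scores[t][j])
            pointers.append(i_best)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the whole path
    label = max(range(n_labels), key=lambda i: best[i])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return "".join(alphabet[i] for i in path)


# Toy two-letter alphabet: the penalty on "a" followed by "b" changes the result
state_scores = [[1.0, 0.0], [0.0, 0.4]]
with_penalty = viterbi(state_scores, [[0.0, -1.0], [0.0, 0.0]], "ab")
without_penalty = viterbi(state_scores, [[0.0, 0.0], [0.0, 0.0]], "ab")
```

In the toy example, the second character looks slightly more like "b" on its own, but a negative transition score for "a" followed by "b" makes "aa" the better sequence overall.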

The -train option in ocr_train.py, the -test option in ocr_test.py, and util.read_data each take a regular expression that matches a set of data files. Each data case (the image of one piece of text) must be in its own text file. Each row of the file corresponds to one character position: the first column is the label, and the remaining columns are the binarized pixel values of the character image. Columns are space separated.

Files can have any name, as long as the regular expression matches them. See the data directory for an example of how the training and test files are named and formatted. Your dataset has to follow the same format.
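To make the file format concrete, here is a minimal sketch of a parser for a single data case. It is not the project's util.read_data; the function name parse_data_case and the sample rows are made up for illustration, and the label is assumed to be a single token kept as a string.

```python
def parse_data_case(text):
    """Parse one data case: one row per character position.

    Each row is 'label pixel pixel ...', space separated.
    Returns (labels, pixels): a list of label strings and a list of
    per-character pixel lists (ints).
    """
    labels = []
    pixels = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels.append(fields[0])                      # first column: label
        pixels.append([int(v) for v in fields[1:]])   # rest: pixel values
    return labels, pixels


# Hypothetical two-character data case with four pixels per character
sample = "t 0 1 1 0\ne 1 0 0 1\n"
labels, pixels = parse_data_case(sample)
```

Real character images have many more pixel columns per row; the sample is kept tiny so the structure is easy to see.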

Running ocr_train.py writes the model files state-params.txt and transition-params.txt to the specified model directory. The -model option in ocr_test.py and util.read_model both require a model directory that contains these two files.
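Before testing or loading a model, it can help to confirm that the model directory holds both files. The following is a small sketch of such a check; the helper missing_model_files is hypothetical and not part of this project.

```python
import os
import tempfile

# File names taken from the description above
MODEL_FILES = ("state-params.txt", "transition-params.txt")


def missing_model_files(model_dir):
    """Return the names of expected model files absent from model_dir."""
    return [name for name in MODEL_FILES
            if not os.path.isfile(os.path.join(model_dir, name))]


# Demo on a temporary directory that contains only one of the two files
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "state-params.txt"), "w").close()
missing = missing_model_files(demo_dir)
```

An empty list means the directory has both files and should be usable with the -model option.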

A pretrained model is included in the model directory. It was trained on 400 data cases using a limited alphabet of the 10 most frequently used letters in English: “etainoshrd”. The training dataset used to obtain this model can be found in the data directory.

Implementation Details

Written in Python 2.7.6 with NumPy 1.9.1 and SciPy 0.15.1.

Evaluation

The model reached an accuracy of 96.87% when trained on 400 data cases and tested on 200, using the 10 most commonly used English letters ('etainoshrd') and the dataset provided in the data directory. A more detailed summary can be found in the eval directory.
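The README does not say whether the reported accuracy is counted per character or per word. Assuming it is per character, a figure like this could be computed along the lines of the sketch below. The function character_accuracy and the sample words are illustrative only and are not taken from ocr_test.py.

```python
def character_accuracy(true_words, predicted_words):
    """Fraction of character positions whose predicted label is correct."""
    correct = 0
    total = 0
    for true_word, predicted_word in zip(true_words, predicted_words):
        if len(true_word) != len(predicted_word):
            raise ValueError("each prediction must match its word length")
        total += len(true_word)
        # Count positions where the predicted label equals the true label
        correct += sum(t == p for t, p in zip(true_word, predicted_word))
    return float(correct) / total if total else 0.0


# Made-up example: 6 of 7 characters are predicted correctly
acc = character_accuracy(["tea", "rain"], ["tea", "raid"])
```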

Contributors

deborausujono
