gogaku

Gogaku is an implementation of Nei Kato's Directional Feature Extraction algorithm for kanji representation (see citations below).

Without providing too many details on the algorithm, each kanji is read in as a 64x64 binary image and converted into an ordered set of 196 positive integers. The distance between two kanji can be reasonably modeled as the Euclidean distance between two of these 196-dimensional vectors, although other distance functions may be used to adjust performance.
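As a rough illustration of the distance described above, here is a minimal Go sketch that computes the Euclidean distance between two 196-dimensional integer vectors. The names `Feature`, `featureLen`, and `euclidean` are made up for this example and are not taken from the gogaku source; only the vector length and the choice of Euclidean distance come from the text.

```go
package main

import (
	"fmt"
	"math"
)

// featureLen is the vector length given in this README.
const featureLen = 196

// Feature is a hypothetical type for one kanji's feature vector.
type Feature [featureLen]int

// euclidean returns the Euclidean distance between two feature vectors.
func euclidean(a, b Feature) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i] - b[i])
		sum += d * d
	}
	return math.Sqrt(sum)
}

func main() {
	var a, b Feature
	a[0] = 3 // differs from b by 3 in the first component
	b[1] = 4 // differs from a by 4 in the second component
	fmt.Println(euclidean(a, b)) // prints 5
}
```

Swapping in another metric would only mean replacing the body of `euclidean`, provided the rest of the program treats the result as "smaller is closer".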

This implementation compares an input image to each of the two thousand or so Jouyou kanji and returns the closest match.
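The matching step amounts to a linear nearest-neighbor search. The sketch below shows one way it could look, under assumptions: `Entry`, `sqDist`, and `closest` are hypothetical names, the database is a plain slice, and the toy vectors have three components instead of 196 to keep the example short. It compares squared distances, which gives the same ranking as Euclidean distance without the square root.

```go
package main

import "fmt"

// Entry is a hypothetical database record: a kanji and its feature vector.
type Entry struct {
	Kanji   string
	Feature []int
}

// sqDist returns the squared Euclidean distance between two
// equal-length vectors.
func sqDist(a, b []int) int {
	sum := 0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// closest scans every entry and returns the one nearest to input.
// ok is false when no entry has a vector of the same length.
func closest(db []Entry, input []int) (best Entry, ok bool) {
	bestDist := 0
	for _, e := range db {
		if len(e.Feature) != len(input) {
			continue // skip malformed entries
		}
		d := sqDist(e.Feature, input)
		if !ok || d < bestDist {
			best, bestDist, ok = e, d, true
		}
	}
	return best, ok
}

func main() {
	// Toy data only; real vectors would have 196 components.
	db := []Entry{
		{"日", []int{1, 0, 0}},
		{"月", []int{0, 5, 0}},
		{"木", []int{4, 4, 4}},
	}
	if m, ok := closest(db, []int{0, 4, 1}); ok {
		fmt.Println(m.Kanji) // prints 月
	}
}
```

With roughly two thousand entries of 196 integers each, a full scan per query is cheap, so a sketch like this needs no index structure.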

requirements

Gogaku requires a reasonably modern version of the Go compiler to build, and training data generation requires Python 2.7 (or similar) and the Python Imaging Library. The automated scripts require a Bourne shell or similar.

setup

After cloning the gogaku repository, run build.sh to build the binaries. Once they are built (the trainer binary in particular), run gentrain.sh to create the Jouyou training dataset.

Warning: the Jouyou dataset is fairly large. The set is included as a text file, but running gentrain.sh will create around 9MB of PNG images in a directory called img/training. These files are generated with UTF-8 filenames, which may not display properly on your system.

execution

The recog binary does the actual kanji recognition. It takes a 64x64 kanji image and a kanji database file as input. The image should be two-colored, with a white background and black strokes. Anti-aliased strokes are not a big deal, though: any non-white pixel is treated as black. Running gentrain.sh generates the Jouyou database file at txt/db.txt by default.

miscellanea

I've included the Arial Unicode MS font for rendering of the dataset. I'm not sure if it's legal, but I take pride in how quickly and diligently I respond to cease and desist letters.

known issues

The current default dataset is rendered in Arial Unicode MS, and as a result sometimes does not match well with natural, handwritten characters. Though accuracy is often quite good, I plan to soon write a parser for the ETL9B dataset which consists of actual handwritten kanji. I believe this will boost accuracy by quite a bit.

citations

1) Nei Kato, Masato Abe, and Yoshiaki Nemoto, "A Fine Classification Method of Handwritten Character by Using Automatic Learning Algorithm of Partial Area Matching," The Transactions of IEICE D-II (Japanese Edition), Vol. J78-D-II, No. 3, pp. 492-500, 1995.

2) Nei Kato, Masato Abe, and Yoshiaki Nemoto, "A Handwritten Character Recognition System by Using Improved Directional Element Feature and Subspace Method," The Transactions of IEICE D-II (Japanese Edition), Vol. J78-D-II, No. 6, pp. 922-930, 1995.

3) Nei Kato, Masato Suzuki, Shinichiro Omachi, Hirotomo Aso, and Yoshiaki Nemoto, "A Handwritten Character Recognition System Using Directional Element Feature and Asymmetric Mahalanobis Distance," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 3, pp. 258-262, 1999.

contributors

bitbanger
