skvark / textractor

OCR application for Sailfish OS. Based on Tesseract OCR engine and Leptonica image processing library.

License: MIT License

QML 58.23% QMake 1.96% C++ 39.81%
tesseract-ocr qml leptonica c-plus-plus sailfishos

textractor's Introduction

Textractor

Textractor is an OCR application for Sailfish OS. Main features:

OCR can be run on:

  • an image taken with the app
  • an image selected from the device
  • a PDF file (one or multiple pages)

Cropping supports any reasonable quadrilateral selection, and perspective correction is applied to the selected area. The user also has access to advanced image preprocessing settings.

The recognized text can be edited or copied to the clipboard. As SFOS is a true multitasking OS, the whole OCR process can run in the background while the user uses the device for other purposes.

Documentation and Help

Textractor Documentation

Environment and building

To build the application, follow this Gist to set up the environment correctly: https://gist.github.com/skvark/49a2f1904192b6db311a

In short:

Add my repositories containing Tesseract OCR and Leptonica to the build machine targets.

Preprocessing

Tesseract OCR is just a plain recognition engine, so Leptonica is used to preprocess the image.

Currently, the following steps are performed before the image is passed to the engine for recognition:

  1. The image is first opened using QImage, the DPI is set to 300, the image is rotated according to the device angle, and the result is saved in JPG format.
  2. The JPG image is loaded with Leptonica and converted from 32 bpp to 8 bpp grayscale.
  3. An unsharp mask is applied.
  4. Local background normalization with Otsu's algorithm is applied.
  5. The skew angle is detected and the image is rotated if needed (Leptonica decides whether the image needs to be rotated).

After these steps, the image is passed to Tesseract.
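
To illustrate the idea behind step 4 without depending on Leptonica, here is a minimal sketch of global Otsu threshold selection on 8-bit grayscale pixels. It is not the code Textractor or Leptonica uses: Leptonica applies the method per tile after background normalization, while this sketch computes a single threshold for the whole buffer. The function name otsuThreshold is hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Leptonica or Textractor).
// Returns the gray level t that maximizes the between-class variance when
// pixels <= t form one class and pixels > t form the other.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    if (pixels.empty()) return 0;

    // Build the 256-bin histogram of gray levels.
    std::array<long long, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * static_cast<double>(hist[i]);

    double weightLow = 0.0;  // number of pixels with value <= t
    double sumLow = 0.0;     // sum of gray values of those pixels
    double bestVariance = -1.0;
    int bestThreshold = 0;

    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;       // lower class still empty
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;         // upper class empty from here on

        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        const double variance = weightLow * weightHigh * diff * diff;

        if (variance > bestVariance) {
            bestVariance = variance;
            bestThreshold = t;
        }
    }
    return bestThreshold;
}
```

For a buffer that contains only dark text pixels and bright background pixels, the returned threshold falls between the two groups, so comparing each pixel against it yields a binary image.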

Test image and result

Original:

preview0

Preprocessed:

preview01

Extracted text:

This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.

The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.






 D R I N K  COFFEE
L Do Stupid Faster
 With More Energy


textractor's Issues

Rotate images which are taken with Jolla's camera app

If the user tries to process an image that was taken with Jolla's camera app in a non-standard orientation, the image appears in the wrong orientation after preprocessing. As far as I know, Jolla's camera app only adds EXIF orientation information and does not actually rotate the image, and it seems that Leptonica does not support EXIF data.

Fix: find a way to read the EXIF data and rotate the image before preprocessing.
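
One part of such a fix could look like the sketch below, which maps the standard EXIF Orientation tag value to the clockwise rotation needed to display the image upright. Reading the tag from the file is left out, the mirrored orientation values (2, 4, 5 and 7) are ignored, and the function name exifOrientationToRotation is hypothetical.

```cpp
// Hypothetical helper: converts an EXIF Orientation value into the clockwise
// rotation (in degrees) that makes the image upright.
// Only the non-mirrored values 1, 3, 6 and 8 are handled; anything else,
// including the mirrored values, is treated as "no rotation" in this sketch.
int exifOrientationToRotation(int orientation) {
    switch (orientation) {
        case 3:  return 180;  // image is upside down
        case 6:  return 90;   // image needs a 90 degree clockwise turn
        case 8:  return 270;  // image needs a 270 degree clockwise turn
        default: return 0;    // 1 (normal) or unsupported value
    }
}
```

The returned angle could then be applied to the image before it is saved for Leptonica, for example with a QTransform rotation on the QImage that is already used in the first preprocessing step.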

Translation

Hi,

I really like the potential of this app, and would like it even more if I could have it in my native language. I took a look at the .ts file and found that it is far from complete. Are you planning to update the translation file and implement localisation? If so, I would be the first to translate.
Great job anyway. Thanks!

Unload camera after taking a picture

Currently the camera is not unloaded, which makes the app consume too much power if the user pushes the app to the background after recognition.

Skip image processing

Is it possible to skip the image processing completely and send the image directly to Tesseract?

Thank you for your work,
Cosmin Popescu.

Google Code shutdown, language data downloading needs refactoring

Due to the Google Code shutdown, the Tesseract OCR codebase, along with other data, was moved to GitHub. The data will be archived in the Google Code archive, but it would be more future-proof to refactor the language data downloading so that it fetches the data from GitHub according to the Tesseract release.

Version 3.04 already contains a set of new language data files, while Textractor still uses the 3.02 files.
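
A release-aware download could start from a small URL builder like the sketch below. The URL layout (the tesseract-ocr/tessdata repository on GitHub, with one tag per release and one .traineddata file per language) is an assumption that should be verified before use, and the function name trainedDataUrl is hypothetical.

```cpp
#include <string>

// Hypothetical helper: builds the download URL of a language data file for a
// given Tesseract release tag.
// ASSUMPTION: the files are served from the tesseract-ocr/tessdata repository
// on GitHub under <tag>/<lang>.traineddata; check this layout before relying on it.
std::string trainedDataUrl(const std::string& releaseTag, const std::string& lang) {
    return "https://github.com/tesseract-ocr/tessdata/raw/" + releaseTag + "/" +
           lang + ".traineddata";
}
```

With this approach the release tag becomes a single value that is updated when the bundled Tesseract version changes, instead of being hard-coded into each download link.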

Todo:

Error after cropping on SFOS 2.0

Leptonica throws an error:

findFileFormatStream: failed to read first 12 bytes of file

Something has changed in the new update. Cropping itself seems to be working...

Some icon files missing on SFOS 2.0

The following icons are missing:

  • icon-camera-automatic
  • graphics-cover-camera

Fix: try to find icons that exist on SFOS 2.0 and on earlier systems too.

Add orientation locking functionality to camera page

If the phone is lying flat (camera pointing down), it is pretty much impossible to detect (or to know or guess) the correct orientation for the image. The solution is to add a button to the camera page that toggles a manual lock between the different orientations.

Stuff needed: icons for all 4 orientations.
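
The state handling behind such a button could be as small as the sketch below, which cycles through an automatic mode and four locked orientations on each press. The enum, its value names and the function nextLock are all hypothetical, and the automatic mode is an assumption that the issue does not spell out.

```cpp
// Hypothetical lock states: Auto follows the orientation sensor, the other
// four values pin the image orientation manually.
enum class OrientationLock {
    Auto,
    Portrait,
    Landscape,
    PortraitInverted,
    LandscapeInverted
};

// Returns the state that follows `current` when the lock button is pressed.
// After the last locked orientation the cycle wraps back to Auto.
OrientationLock nextLock(OrientationLock current) {
    constexpr int kStateCount = 5;
    const int next = (static_cast<int>(current) + 1) % kStateCount;
    return static_cast<OrientationLock>(next);
}
```

Each of the four locked states would map to one of the icons mentioned above, so the button can show which orientation is currently forced.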
