
bill-wangbiao / ocr-tess4j-rest


This project is forked from arun0009/ocr-tess4j-rest


OCR Tess4J: Java, Spring Boot microservice, optional MongoDB, with Maven or Gradle build

License: Apache License 2.0

Java 100.00%


ocr-tess4j-rest

ocr-tess4j-rest is a Java wrapper for Tesseract OCR with a REST API, built on Tess4J (http://tess4j.sourceforge.net).

Tess4J is a JNA wrapper for the Tesseract OCR API. It provides character recognition support for common image formats and multi-page images. The library has been developed and tested on Windows and Linux.

Installing dependencies

Install Tesseract
https://code.google.com/p/tesseract-ocr/

Install Ghostscript (only needed if you also want to convert PDF to text)
http://www.ghostscript.com/download/gsdnld.html

On a Mac, the easiest way is to use Homebrew:

brew install tesseract
brew install gs
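
After installing, it can help to confirm that both tools are reachable before starting the service. The following is a minimal sketch, not part of this project: the class name DependencyCheck and the helper findOnPath are hypothetical, and it assumes the executables are named tesseract and gs, as they are after the Homebrew install above (names differ on Windows).

```java
import java.io.File;

class DependencyCheck {

    /**
     * Returns the first executable file called `name` found in the
     * directories listed in `pathValue`, or null if there is none.
     */
    static File findOnPath(String name, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, name);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        // Assumed executable names on Mac/Linux.
        for (String tool : new String[] {"tesseract", "gs"}) {
            File found = findOnPath(tool, path);
            System.out.println(tool + ": "
                    + (found != null ? found.getPath() : "not found on PATH"));
        }
    }
}
```

If either tool is reported as not found, the JVM running the service will probably not find it either, since it inherits the same PATH.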


ocr-tess4j-rest uses:

  • Spring Boot for the REST API.
  • Spring Boot Data for connecting to MongoDB (the /v1 endpoint uses MongoDB).
  • For the /v1 endpoint, the image and its OCR text are stored in MongoDB.
  • With /v0.9, the text of the uploaded image is written to the console/log instead (useful for testing, or if you want to use something other than MongoDB).
  • To start the service, run Tess4jV1.java; it spins up an embedded Tomcat on localhost:8080.
  • Rest Assured for testing the REST API (Tess4jV1). Remove @Ignore on Tess4jV1SmokeTest and run the test.
  • Logback for logging.
  • Gradle for build (or)
  • Maven for build.
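
Once Tess4jV1 is running, a quick way to confirm that the embedded Tomcat came up is to try a TCP connection to port 8080. This is a minimal sketch under stated assumptions: the class PortCheck and the helper isListening are hypothetical names, not part of this project, and the check only shows that something accepts connections on the port, not that the OCR endpoints respond.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Port 8080 on localhost is the address named in the list above.
        boolean up = isListening("localhost", 8080, 1000);
        System.out.println(up
                ? "Something is listening on localhost:8080"
                : "Nothing is listening on localhost:8080");
    }
}
```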

* This version pulls in Tess4J as a dependency JAR and upgrades Spring Boot to 1.1.9.
* Tested with JDK 1.7.72 and Tesseract 3.02.02 on Mac.
