Doc-Scan

A demonstration of scanning QR codes and text from documents using Tesseract and OpenCV.

This demonstration first builds a jpg version of the following pdf document:

https://www.va.gov/vaforms/va/pdf/VA0730b.pdf

This jpg version is then used to display a web form along with a QR code for a unique document reference.

Once the web form has been printed, filled out by hand and scanned as a jpg, the scan is processed to extract the printed and handwritten text along with the text encoded in the QR code.

Deployment

git clone https://github.com/RamSailopal/Doc-Scan.git

cd Doc-Scan

docker-compose up

The script pdfconvert/convert.py is used to generate the initial jpeg.
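The contents of pdfconvert/convert.py are not shown here, so the following is only a minimal sketch of how a PDF page could be rendered to a jpeg. It assumes the third-party pdf2image package (which relies on poppler being installed); the function names and the 300 DPI default are illustrative choices, not taken from the project.

```python
from pathlib import Path


def jpg_path_for(pdf_path: str) -> Path:
    """Return the output path: same folder and stem as the PDF, with a .jpg suffix."""
    return Path(pdf_path).with_suffix(".jpg")


def convert_first_page(pdf_path: str, dpi: int = 300) -> Path:
    """Render the first page of the PDF to a JPEG file and return its path."""
    # Imported here so the helper above works without pdf2image installed.
    # Assumption: pdf2image (and poppler) is available; the real script may differ.
    from pdf2image import convert_from_path

    pages = convert_from_path(pdf_path, dpi=dpi, first_page=1, last_page=1)
    out = jpg_path_for(pdf_path)
    pages[0].save(out, "JPEG")
    return out
```

Calling convert_first_page("VA0730b.pdf") would write VA0730b.jpg next to the PDF, assuming the file has been downloaded first.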

The web form can then be viewed in a browser by navigating to:

http://dockerserveraddress:8080?ref=testref

Here testref is the document reference that will be encoded as a QR code.
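If references contain spaces or other special characters, they need to be escaped before being placed in the query string. A small sketch using only the Python standard library is shown here; the function name form_url is hypothetical, and the host and port simply mirror the example address.

```python
from urllib.parse import urlencode


def form_url(host: str, ref: str, port: int = 8080) -> str:
    """Build the web form URL for a given document reference."""
    # urlencode escapes characters that are not safe in a query string.
    return f"http://{host}:{port}/?{urlencode({'ref': ref})}"


print(form_url("dockerserveraddress", "testref"))
```

This prints the same address as above, with an explicit "/" before the query string.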

Once the document has been filled in and scanned, the script pdfscan/scan.py extracts the text from the resulting jpeg.
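The scan script itself is not reproduced in this README. As a rough sketch of the general approach, the following combines OpenCV's QRCodeDetector with pytesseract, the two libraries listed in the references. The function names and the report layout are assumptions made for illustration and may not match what scan.py actually does.

```python
def scan_image(path: str) -> dict:
    """Return the decoded QR text and the OCR text for one image file."""
    # Imported here so format_report can be used without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {path}")

    # detectAndDecode returns an empty string when no QR code is decoded.
    qr_text, _points, _straight = cv2.QRCodeDetector().detectAndDecode(image)

    # Convert to greyscale before passing the image to Tesseract.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    ocr_text = pytesseract.image_to_string(gray)
    return {"qr": qr_text, "text": ocr_text}


def format_report(result: dict) -> str:
    """Format a scan result as plain text (hypothetical layout)."""
    qr = result["qr"] or "(no QR code decoded)"
    return f"QR reference: {qr}\n--- OCR text ---\n{result['text'].strip()}\n"
```

A call such as print(format_report(scan_image("FilledOut.jpg"))) would then print the QR reference followed by the recognised text.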

Demonstration

This demonstration takes the following jpg:

https://github.com/RamSailopal/Doc-Scan/blob/main/pdfscan/FilledOut.jpg

It then processes the file to generate the following text:

https://github.com/RamSailopal/Doc-Scan/blob/main/pdfscan/pdfscanout.txt

Findings

The initial web form had to be scaled to fit on one page, and this affected the quality of the jpeg and subsequently the OCR results. Printed text was recognised well, but handwritten text proved difficult to process accurately. The QR code was not decoded at all.

As a comparison, a "screen grab" of part of the web form was taken and the mouse was then used to add text (as if it were a pen). The resulting image can be viewed here:

https://github.com/RamSailopal/Doc-Scan/blob/main/pdfscan/doc-out1.png

The processed output can be seen here:

https://github.com/RamSailopal/Doc-Scan/blob/main/pdfscan/pdfscanout1.txt

With the original scaling and no loss of quality, the QR code is decoded correctly, as is the printed text. The mouse-written text is again "patchy".
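Since resolution appears to matter for QR decoding, one experiment worth trying is to enlarge a small scan before passing it to the detector. The sketch below is an assumption-laden illustration, not part of the project: the target width of 2550 pixels corresponds to an 8.5 inch page at 300 DPI, and upscaling cannot restore detail that was lost, so it may or may not help on a given scan.

```python
def upscale_factor(width_px: int, target_width_px: int = 2550) -> float:
    """Return how much to enlarge an image so its width reaches the target."""
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    # Never shrink: images already at or above the target are left as they are.
    return max(1.0, target_width_px / width_px)


def decode_qr(path: str, target_width_px: int = 2550) -> str:
    """Try to decode a QR code after enlarging a low-resolution scan."""
    import cv2  # imported here so upscale_factor works without OpenCV

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {path}")

    factor = upscale_factor(image.shape[1], target_width_px)
    if factor > 1.0:
        image = cv2.resize(image, None, fx=factor, fy=factor,
                           interpolation=cv2.INTER_CUBIC)

    text, _points, _straight = cv2.QRCodeDetector().detectAndDecode(image)
    return text  # empty string if nothing was decoded
```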

Running your own examples

Once a form has been printed, filled in and scanned, add the scanned image to the pdfscan folder and then run:

docker exec -it pdfscan /bin/bash -c 'cd /home/pdfscan && python3 scan1.py <imagefilename> > <nameofoutputfile>'

For example:

docker exec -it pdfscan /bin/bash -c 'cd /home/pdfscan && python3 scan1.py scannedimage.jpg > outputtext.txt'

The output can then be viewed in the file pdfscan/outputtext.txt.
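When several scans need processing, the same command can be assembled from a script. The following sketch builds the argument list with the Python standard library; the function name scan_command is hypothetical. The -it flags are left out on the assumption that the script runs without a terminal attached, and file names are quoted so that names containing spaces survive the inner shell.

```python
import shlex
import subprocess


def scan_command(image: str, output: str, script: str = "scan1.py") -> list[str]:
    """Build the docker exec argument list for scanning one image."""
    inner = (
        f"cd /home/pdfscan && python3 {shlex.quote(script)} "
        f"{shlex.quote(image)} > {shlex.quote(output)}"
    )
    return ["docker", "exec", "pdfscan", "/bin/bash", "-c", inner]


def run_scan(image: str, output: str) -> None:
    """Run the scan inside the pdfscan container; raises if the command fails."""
    subprocess.run(scan_command(image, output), check=True)
```

For example, run_scan("scannedimage.jpg", "outputtext.txt") issues the same command as the example above, minus the -it flags.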

Improvements

For handwritten text, Tesseract's recognition can be improved with "training" - https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html

References

Tesseract - https://tesseract-ocr.github.io/

Python Tesseract - https://pypi.org/project/pytesseract/

OpenCV QR Code detection - https://docs.opencv.org/4.x/de/dc3/classcv_1_1QRCodeDetector.html

Web QR Code generator - https://github.com/kazuhikoarase/qrcode-generator
