
Tesseract Process Timeout (textshot issue, closed)

ianzhao05 commented on July 24, 2024
Tesseract Process Timeout

from textshot.

Comments (12)

ianzhao05 commented on July 24, 2024

Could you share the image that caused it to fail? This is a Tesseract issue that is caused by the input image. You can also try increasing the timeout in the code (default 2 seconds).
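
For anyone experimenting with the timeout, here is a minimal sketch of retrying OCR with progressively longer timeouts. It assumes the OCR call accepts a `timeout` keyword and raises `RuntimeError` when the timeout expires, which is how `pytesseract.image_to_string` is documented to behave. The helper name `ocr_with_retries` and the timeout values are hypothetical and not part of textshot.

```python
def ocr_with_retries(ocr, image, timeouts=(2, 10, 60)):
    """Call ocr(image, timeout=t) with increasing timeouts.

    Returns (text, timeout_used). Raises RuntimeError if every attempt
    times out.
    """
    last_error = None
    for t in timeouts:
        try:
            # Assumption: `ocr` raises RuntimeError when the timeout expires,
            # as pytesseract.image_to_string is documented to do.
            return ocr(image, timeout=t), t
        except RuntimeError as error:
            last_error = error
    raise RuntimeError(f"OCR failed for all timeouts {timeouts}") from last_error


# Possible usage, assuming pytesseract is installed and pil_img is a PIL image:
#   text, used = ocr_with_retries(pytesseract.image_to_string, pil_img)
```

Each retry starts the OCR from scratch, so a single generous timeout may be simpler if Tesseract is consistently slow on a given machine.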


edwardsaunders7 commented on July 24, 2024

It doesn't appear to be an input-image issue: I tried it on multiple inputs and the same error occurs. I've attached some of the images I sampled that produced the error.

2020-06-27_00-28_1

2020-06-27_00-30

2020-06-27_00-31


ianzhao05 commented on July 24, 2024

Ok, thanks. Does it work if the text is dark on a light background? If not, maybe the screenshot is taking the wrong image.


edwardsaunders7 commented on July 24, 2024

Colour doesn't seem to have an effect, just tested with these two inputs:
2020-06-27_00-38_1
2020-06-27_00-38


ianzhao05 commented on July 24, 2024

Ok, if you add something like pil_img.show() at around line 85, does it show the correct image? By the way, what OS are you on?


edwardsaunders7 commented on July 24, 2024

Added pil_img.show() - shows the proper image.

Editing the timeout on line 90:

  • A 10-second timeout allows more text to be converted before the timeout error occurs, but there is now a much longer delay.

  • With large bodies of text, the timeout error still occurs when the timeout is small. I tested longer timeouts: 30 seconds is not enough, 1 minute works. For some reason, a 60-second timeout prevents the error and also lets the conversion finish without delay (within 15 seconds), unlike a 10-second timeout, which shows a noticeable delay (over a minute).

On Manjaro Linux.

Further testing with a 60-second timeout:

Multiple input images of varying text lengths all complete quickly, without the timeout error. Perhaps making this change in the git project would be beneficial?


ianzhao05 commented on July 24, 2024

The behaviour you are describing is really strange. On my laptop (mid-range specs), it usually takes around a second, even for large amounts of text. If you don’t mind, can you test with the tesseract executable directly and see how long it takes? https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage
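
If it helps, here is a rough sketch for timing the executable from Python, so the numbers are directly comparable with textshot. It only uses the standard library; `time_command` is a hypothetical helper, and the image file name in the usage comment is a placeholder.

```python
import subprocess
import time


def time_command(cmd, timeout=120.0):
    """Run cmd and return (elapsed_seconds, return_code).

    Raises subprocess.TimeoutExpired if the command runs longer than
    `timeout` seconds.
    """
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, timeout=timeout)
    return time.perf_counter() - start, completed.returncode


# Possible usage, assuming tesseract is on PATH and test.png exists.
# Passing "stdout" as the output base makes tesseract print the text
# instead of writing a file:
#   elapsed, code = time_command(["tesseract", "test.png", "stdout"])
#   print(f"tesseract exited with {code} after {elapsed:.1f} s")
```

Running the same image a few times gives a fairer comparison than a single run.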


edwardsaunders7 commented on July 24, 2024

Tested with a few input images.

  • Image 1: Tesseract = 26 seconds, textshot = 42 seconds
  • Image 2: Tesseract = 20 seconds, textshot = 29 seconds
  • Image 3: Tesseract = 30 seconds, textshot = 28 seconds
  • Image 4: Tesseract = 11 seconds, textshot = 15 seconds

I have added 2 of the images used (the other 2 contain personal information)

Image 3:

Test

Image 4:
eurotext


ianzhao05 commented on July 24, 2024

Thank you for your testing. It seems that Tesseract is taking a lot longer for you. What kind of processor does your computer have? Also, what is your Tesseract version? My PC has an i7-6700K and my laptop an i5-7200U, and I haven't really had any timeout issues. I do agree that 2 seconds is too low, however, so I will update that.


edwardsaunders7 commented on July 24, 2024

I am running a "4 x Intel Core i5-4690K CPU @ 3.50GHz" with "15.5GiB of RAM"

I tested textshot on my laptop and my desktop previously and had no issues; the problem only appeared when I reinstalled textshot today. I have no idea why Tesseract is taking longer for me. Let me test a few more things and see if I can find an answer.


edwardsaunders7 commented on July 24, 2024

Update:
I just realised I had my VPN on, so network speeds are likely the cause of the issue.

I just disconnected my VPN, and Tesseract and textshot are both working within a few seconds (using all the same test images).

Apologies for not realising that could be the issue beforehand!


ianzhao05 commented on July 24, 2024

I honestly didn't know before that Tesseract was affected by network speeds! No apology needed; glad it works now :)

