Comments (7)
It's unclear to me whether this could break any existing code, so my inclination is to combine this with any other planned breaking changes and release as v6.
from tesseract.js.
I would agree that the Node.js version should work similarly regardless of whether or not it is being run in Electron, so this condition should be patched if that is not the current behavior.
As an aside, is there a compelling reason to use the Node.js version of Tesseract.js rather than the browser version in an Electron application? My understanding was that Electron supports both, however it was considered a security best practice to avoid giving code access to the filesystem with Node.js integration where possible. However, I have minimal experience with Electron so honestly do not understand the tradeoffs well. The browser version of Tesseract.js can definitely be run using entirely offline resources, as seen in this example repo.
from tesseract.js.
@Balearica Thx for reply. In Electron, there are two types of processes: main process and render process. As I understand it, what you mentioned and the example you provided is using Tesseract.js in the render process. Upon inspecting the code, I found that the environment is 'webworker' and it utilizes fetch
to load language files.
However, in my scenario, I need to run Tesseract.js in the main process after receiving an IPC event from the render process, where the environment is detected as 'electron', and it uses 'node-fetch' to load language files, which does not support local file paths.
from tesseract.js.
@HQidea Can you make a minimal example repo that can be cloned to replicate this issue? I ran the snippet provided in the OP within the main.js
file in the example repo I posted (rather than the index-offline.html
file), however everything still ran as expected. Therefore, there is presumably additional context required to replicate this.
from tesseract.js.
@Balearica this is the repo that can reproduce the error.
I also tried to replicate the error using the repo you provided. However the results are inconsistent. Sometimes the environment in the worker thread is tested as 'electron', and sometimes as 'node'. When the environment is tested as 'node', everything goes well. I'm not sure why.
from tesseract.js.
@HQidea I was able to reproduce this issue using your repo, and was able to confirm that simply commenting out the special isElectron
case within getEnvironment
resolved the issue.
tesseract.js/src/utils/getEnvironment.js
Lines 8 to 9 in 83df363
I am hesitant to edit something without fully understanding what it does, so I reviewed the various issues/PRs regarding why a special case for Electron was introduced to begin with. These are linked below.
After reviewing these discussions, I believe Electron was added as a special case because people wanted to use Node.js in an Electron render process. You do not need special logic to use the Node.js version of Tesseract.js in the main Electron process, and you don't need a special logic to use the browser version of Tesseract.js in the render Electron processes. The only "special case" introduced by Electron is the theoretical ability to run Node.js code in what appears to be a browser context (window.document
is defined).
While my understanding of Electron is limited, my initial thought is that this special case is not something we need to support. Using Node.js in a renderer process appears to be strongly advised against in the Electron docs. Additionally, the browser version of Tesseract.js works perfectly well in Electron renderer processes, and is just as fast. If this understanding is correct, and there is not a compelling reason to (attempt to) support using the Node.js version in a renderer process, then we could just cut out the Electron-specific stuff entirely, which should solve your issue.
from tesseract.js.
@Balearica I totally agree with you on not using Node.js in a renderer process.
from tesseract.js.
Related Issues (20)
- Worker stuck on "loading language traineddata" HOT 4
- Updated types to infer output formats
- Inference of Chinese handwritten characters is bad HOT 3
- Add line size metrics (ascender, descender, size) to `line` objects in `blocks` output HOT 1
- Font attributes incorrect even when font is properly identified (`is_italic`, `is_serif`, etc.) HOT 1
- Focusing area HOT 1
- Multiple issues: Discussion
- Disable non-text output formats by default
- Tesseract - Running in Browser Console HOT 1
- Execution `worker.recognize` repeatedly causes "Out of Memory" error in JSFiddle HOT 5
- Error: Network error while fetching HOT 1
- how to use installed tessercat lib on windows for tesseract.js? HOT 1
- Auto fill forms by scanning ID cards
- Suppressing "Corrupt JPEG data: 1 extraneous bytes before marker 0xd9" output HOT 3
- Tesseract.js Bug on IBM i Server HOT 9
- Legacy model does not work for indic and arabic scripts due to Legacy data being removed
- Combine worker and scheduler interfaces
- Issue with Tesseract.js OCR Integration in Angular Application HOT 5
- Rectangles not working, pure white image interpreted as tildes HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from tesseract.js.