Comments (6)
Hi there,
Small question. What do you mean by "the algorithm you use to recognize the texts"? Do you mean how I separate the characters, how the NN works or something else?
from swiftocr.
Wow, Thanks for such a fast answer!
I mean how you separate the characters and what you pass to the input of the NN. I would like to know which components I can tweak to get better character recognition and, for instance, whether a learning rate of 0.1 makes sense or not.
I'm using the Swift-AI framework for the NN and a Connected-component labeling algorithm for getting the bounding boxes of the characters.
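For readers unfamiliar with connected-component labeling, here is a minimal sketch of the idea, not SwiftOCR's actual implementation. It assumes a binary image stored as a rectangular `[[Bool]]` grid and 4-connectivity; the `Box` type and `boundingBoxes(of:)` function are hypothetical names made up for this example.

```swift
// Illustrative only: these names are not part of SwiftOCR.
struct Box: Equatable {
    var minX: Int, minY: Int, maxX: Int, maxY: Int
}

/// Returns one bounding box per 4-connected group of `true` pixels.
/// Assumes every row of `grid` has the same length.
func boundingBoxes(of grid: [[Bool]]) -> [Box] {
    let height = grid.count
    guard height > 0 else { return [] }
    let width = grid[0].count
    var visited = Array(repeating: Array(repeating: false, count: width), count: height)
    var boxes: [Box] = []

    for y in 0..<height {
        for x in 0..<width where grid[y][x] && !visited[y][x] {
            // Flood-fill this component with an explicit stack,
            // growing the bounding box as pixels are visited.
            var box = Box(minX: x, minY: y, maxX: x, maxY: y)
            var stack = [(x, y)]
            visited[y][x] = true
            while let (cx, cy) = stack.popLast() {
                box.minX = min(box.minX, cx); box.maxX = max(box.maxX, cx)
                box.minY = min(box.minY, cy); box.maxY = max(box.maxY, cy)
                for (nx, ny) in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)] {
                    guard nx >= 0, nx < width, ny >= 0, ny < height else { continue }
                    guard grid[ny][nx], !visited[ny][nx] else { continue }
                    visited[ny][nx] = true
                    stack.append((nx, ny))
                }
            }
            boxes.append(box)
        }
    }
    return boxes
}
```

Each box would then be a candidate character region; whether the library uses 4- or 8-connectivity is not stated in this thread.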
I have looked at your training sample and I realized you simply set the parameters to fixed values (0.7 learning rate, 0.4 momentum ...) and you have a fixed size for the single hidden layer.
Since this was my first time working with Neural Networks, I searched the Internet for what learning rate and momentum I should use ^^
I would have loved to use a NN library that allows more than one hidden layer but I couldn't find one.
Parameters to tweak in SwiftOCR.swift:
- recognizableCharacters (Which characters were used / are used for training the NN (see #34))
- globalNetwork: hidden, learningRate, momentum, activationFunction and errorFunction (This only has an effect on training. I only achieved good training results when using .CrossEntropy(average: false) as the errorFunction. If you use the Training App, you have to change them on line 32 and 97)
- xMergeRadius and yMergeRadius (see #1)
- confidenceThreshold (The confidence for recognizing a character has to be bigger than or equal to this threshold. If it is too high, it may filter too much 'noise'; if it is too low, it may not filter enough.)
- //Filter blobs (line 347 - 360) (This filters the connected components. E.g. if the blob is thinner than 1% of the input image width, then notToThin will be false and the blob will get filtered out.)
- //Filter rects: - Not to small (line 417 - 429) (The same as //Filter blobs, but only checks if the width and height of the blob (after merging) are OK)
- cropSize (line 469) (How big the final image (of the blob) should be for recognition. If you change the cropSize, you have to change the number of inputs of the NN to cropSize.width * cropSize.height + 1)
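To make three of the checks described above concrete, here is a small sketch. The function names are hypothetical helpers written for this example, not SwiftOCR's real code, and the numbers in the usage note are example values only.

```swift
// Hypothetical helpers illustrating the checks described above.

/// A blob passes the width check if it is at least `minFraction` of the
/// image width (the explanation above uses 1% as its example).
func isWideEnough(blobWidth: Double, imageWidth: Double, minFraction: Double = 0.01) -> Bool {
    return blobWidth >= imageWidth * minFraction
}

/// A recognized character is kept only if its confidence is
/// greater than or equal to the threshold.
func passesConfidence(_ confidence: Double, threshold: Double) -> Bool {
    return confidence >= threshold
}

/// Number of NN inputs for a given crop size, following the formula
/// cropSize.width * cropSize.height + 1 quoted above.
func inputCount(cropWidth: Int, cropHeight: Int) -> Int {
    return cropWidth * cropHeight + 1
}
```

For example, with an assumed crop of 16 by 20 pixels, `inputCount(cropWidth: 16, cropHeight: 20)` gives 321 inputs, so changing the crop size means the network has to be rebuilt and retrained with the new input count.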
Parameters to tweak in SwiftOCRTraining.swift (Only for training):
- trainingImageNames (These images will be used for adding noise in the background when training)
- trainingFontNames (The font names used for training. Only important when you aren't using the Training app)
- numberOfTrainImages and numberOfTestImages (How large the training and testing set should be)
- errorThreshold (When it should stop training. Only kind of important when you aren't using the Training app)
- //Distortions (line 236 - 246) (CGAffineTransform: How much the image should get distorted for training)
SwiftOCRDelegate:
func preprocessImageForOCR(inputImage: OCRImage) -> OCRImage?
(Custom image preprocessing)
I think that for the beginning this is more than enough parameters to fiddle with.
Wow! Thanks! I think you might put that on the readme or the wiki of the library. Some people might find it really interesting.
What is usually done when trying to get the best NN is to train lots of them in parallel with different settings and check which one works best. Then you use the best settings to train your network. That is why I asked you for this. 👍
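A rough sketch of that kind of settings search, assuming only two tunable values. The `Settings` type, `bestSettings` function and `evaluate` closure are hypothetical; `evaluate` stands in for "train a network with these settings and return its test error", which is not shown here.

```swift
// Illustrative grid search; none of these names come from SwiftOCR or Swift-AI.
struct Settings: Equatable {
    let learningRate: Double
    let momentum: Double
}

/// Tries every combination and returns the one with the lowest error.
/// Returns nil if either list is empty.
func bestSettings(learningRates: [Double],
                  momenta: [Double],
                  evaluate: (Settings) -> Double) -> Settings? {
    var best: (settings: Settings, error: Double)?
    for lr in learningRates {
        for m in momenta {
            let candidate = Settings(learningRate: lr, momentum: m)
            let error = evaluate(candidate)
            // Keep the current best unless this candidate is strictly better.
            if let current = best, current.error <= error { continue }
            best = (candidate, error)
        }
    }
    return best?.settings
}
```

The loop here runs sequentially to keep the example short. Since each `evaluate` call is independent of the others, the candidates could also be trained concurrently.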
@garnele007 Does accuracy increase when you provide a higher or a lower number for errorThreshold?
@RollingGoron The lower the number, the more accurate (and time-consuming) the training should get.
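A toy illustration of that trade-off, not the library's training loop: gradient descent on f(w) = w², stopping once the error falls below the threshold. The `train` function and its parameters are made up for this example.

```swift
// Toy example: minimizes w^2 and stops when the error drops below
// `errorThreshold` or `maxEpochs` is reached. Not SwiftOCR's training code.
func train(errorThreshold: Double,
           learningRate: Double = 0.1,
           maxEpochs: Int = 10_000) -> (epochs: Int, error: Double) {
    var w = 1.0
    var epochs = 0
    var error = w * w
    while error >= errorThreshold && epochs < maxEpochs {
        w -= learningRate * 2 * w   // gradient of w^2 is 2w
        error = w * w
        epochs += 1
    }
    return (epochs, error)
}
```

Calling `train(errorThreshold: 0.001)` takes more epochs than `train(errorThreshold: 0.1)` and ends with a smaller error, which is the same pattern as described above: a lower threshold means longer training and a closer fit to the training data.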