Comments (8)
predictor.predict_raw expects Iterable[np.ndarray], you are providing only numpy.ndarray. Try:
raw_image_generator = [cv_img]
for sample in predictor.predict_raw(raw_image_generator):
    ...
from calamari.
I will try it and let you know. But I had already converted the image to a cv image, which is a numpy.ndarray, so why am I getting this error? Is it because of tfaip?
@andbue I am still getting the same error after passing the image as a numpy.ndarray. Can you tell me what this error means?
The image was already a numpy.ndarray before; now it needs to be wrapped in a list or any other kind of iterable before you pass it to the predictor. Could you post a full example of your code as it looks right now, ideally with all imports and maybe even the out_10.jpg you're using?
@andbue I am sharing the code I am using for end-to-end prediction. Please let me know where I am going wrong.
import cv2
import numpy as np
import os
import glob
from pdf2image import convert_from_path
import subprocess
import pandas as pd
import re
from calamari_ocr.ocr.predict.predictor import Predictor, PredictorParams

def generateImage(pdfFile, des = './images'):
    if pdfFile.split('.')[-1]=='pdf':
        name = os.path.basename(pdfFile).replace('.pdf','')
        images = convert_from_path(pdfFile,dpi=500,poppler_path = "C:\\Program Files (x86)\\poppler-0.68.0\\bin")
        for i in range(len(images)):
            images[i].save(des+'/'+name+'_page_'+ str(i) +'.jpg', 'JPEG')

def get_calamari_output(cropImg, index):
    predictor = Predictor.from_checkpoint(
        params=PredictorParams(),
        checkpoint='./models/cal_model.ckpt')
    calamari_output = {}
    for sample in predictor.predict_raw([cropImg]):
        inputs, prediction, meta = sample.inputs, sample.outputs, sample.meta
        pred_text = prediction.sentence
        avg_char_probability = 0
        for p in prediction.positions:
            if len(p.chars) > 0:
                avg_char_probability += p.chars[0].probability
        avg_char_probability /= len(prediction.positions) if len(prediction.positions) > 0 else 1
        #print(prediction.avg_char_probability)
        pred_confidence = round(avg_char_probability * 100, 1)
        calamari_output[index] = [pred_text, pred_confidence]
    return calamari_output

def drawBoundBox(imageFile):
    orig_img = cv2.imread(imageFile)
    gray = cv2.cvtColor(orig_img, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray,(11,11),0)
    _, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    kernal = cv2.getStructuringElement(cv2.MORPH_RECT,(11,19))
    dilate = cv2.dilate(thresh,kernal, iterations=9)
    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts)==2 else cnts[1]
    cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[0])
    boxes = []
    for c in cnts:
        x,y,w,h = cv2.boundingRect(c)
        boxes.append([x,y,w,h])
    return boxes

def get_crops_dtls(imageFile):
    res = cv2.imread(imageFile)
    boxlist = drawBoundBox(imageFile)
    for idx, box in enumerate(boxlist):
        x,y,w,h = box[0],box[1],box[2],box[3]
        crop = res[y:y+h,x:x+w]
        #cv2.imwrite(dst+"/"+"out_"+str(idx)+'.jpg',crop)
        pred_text_dict = get_calamari_output(crop, idx)
    return pred_text_dict

if not os.path.exists('./images'):
    os.makedirs('./images')
files = glob.glob('./invoice/*')
for file in files:
    generateImage(file)
imgs = glob.glob('./images/*.jpg')
for img in imgs:
    predict_data = get_crops_dtls(img)
As suggested, I now use for sample in predictor.predict_raw([cropImg]), but it still gives the same error as before.
Ah, now I get it: put the lines at the bottom in an if __name__ == "__main__": block, otherwise the whole subprocess magic of calamari, tfaip and tensorflow is not going to work, producing the Broken pipe errors.
Further suggestions:
- do not call Predictor.from_checkpoint for each and every line, this is going to be very slow. Instantiate the object once and then just throw all of the images at it
- if you are looking for a simple line segmentation algorithm, have a look at https://github.com/cisocrgroup/ocrd_cis/blob/master/ocrd_cis/ocropy/segment.py
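A rough sketch of how both suggestions could fit together, not a tested Calamari recipe: the predictor is built once, all line images go through a single predict_raw call, and everything that starts work sits behind the main guard. The helper names (mean_first_char_probability, predict_lines) and the ./lines directory of pre-cropped line images are made up for illustration, and the sketch assumes predict_raw yields results in input order.

```python
import glob


def mean_first_char_probability(positions):
    """Average the first candidate's probability over all character positions."""
    if not positions:
        return 0.0
    total = sum(p.chars[0].probability for p in positions if len(p.chars) > 0)
    return total / len(positions)


def predict_lines(predictor, line_images):
    """Run an already-built predictor over many line images in one call.

    `predictor` is any object exposing predict_raw(iterable_of_images).
    Assumption: results are yielded in the same order as the inputs.
    """
    results = []
    for sample in predictor.predict_raw(list(line_images)):
        prediction = sample.outputs
        confidence = round(mean_first_char_probability(prediction.positions) * 100, 1)
        results.append((prediction.sentence, confidence))
    return results


def main():
    # Hypothetical directory holding one image per text line.
    image_paths = sorted(glob.glob("./lines/*.jpg"))
    if not image_paths:
        print("no line images found in ./lines")
        return

    # Heavy imports are deferred until there is work to do.
    import cv2
    from calamari_ocr.ocr.predict.predictor import Predictor, PredictorParams

    # Load the checkpoint once, not once per line.
    predictor = Predictor.from_checkpoint(
        params=PredictorParams(), checkpoint="./models/cal_model.ckpt"
    )
    lines = [cv2.imread(path) for path in image_paths]
    for path, (text, confidence) in zip(image_paths, predict_lines(predictor, lines)):
        print(path, text, confidence)


if __name__ == "__main__":
    # Nothing above this line starts any work at import time.
    main()
```

Because predict_lines only relies on predict_raw, it can be tried with a stand-in object before the real model is loaded.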
@andbue thank you so much for your help. It worked for me with if __name__ == "__main__":
But I am facing another issue: if the cropped image contains only one line, the text is extracted correctly, but whenever it contains multiple lines, the output is an empty string. Any suggestions for resolving this without retraining the existing model? Thank you in advance.
Glad to hear that it worked for you!
Calamari is, as stated in the "About"-text, a "Line based ATR Engine", so it does not contain any code for image preprocessing, document analysis, or line segmentation. To segment paragraph blocks into lines, have a look at the ocropy segmenter I linked to earlier. A more complex alternative that also performs document layout analysis can be found at https://github.com/qurator-spk/eynollah.
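For clean, unskewed crops, a much cruder approach than the ocropy segmenter or eynollah may be enough to get started: a horizontal projection profile, which splits a block wherever a run of pixel rows contains no ink. The sketch below is that swapped-in technique, not the linked segmenter; the function name and both thresholds are illustrative assumptions.

```python
def text_line_bands(row_ink, min_ink=1, min_height=3):
    """Return (top, bottom) row ranges where consecutive rows contain ink.

    row_ink: one number per pixel row, the count of foreground pixels in it.
    bottom is exclusive, so a band can be cut out with image[top:bottom].
    Bands shorter than min_height rows are treated as noise and dropped.
    """
    bands = []
    start = None
    for y, ink in enumerate(row_ink):
        if ink >= min_ink and start is None:
            start = y  # first inked row of a new band
        elif ink < min_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None
    # Close a band that runs to the last row.
    if start is not None and len(row_ink) - start >= min_height:
        bands.append((start, len(row_ink)))
    return bands
```

With an inverted binary image such as the thresh array in the script above, the row counts could be obtained with (thresh > 0).sum(axis=1), and each crop[top:bottom] slice would then be passed to the predictor as a separate line. Skewed or touching lines will defeat this approach, which is where the linked tools come in.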