Comments (8)

andbue commented on July 18, 2024

predictor.predict_raw expects an Iterable[np.ndarray], but you are passing a single numpy.ndarray. Try:

raw_image_generator = [cv_img]
for sample in predictor.predict_raw(raw_image_generator):
    ...
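
To illustrate the iterable requirement, here is a minimal sketch that runs without Calamari. `FakePredictor` and `read_lines` are hypothetical stand-ins that only mimic the call shape of `predict_raw`; the `outputs.sentence` attribute follows the code posted later in this thread, and nested lists stand in for a numpy.ndarray.

```python
from types import SimpleNamespace


class FakePredictor:
    """Hypothetical stand-in that mimics only the call shape of predict_raw."""

    def predict_raw(self, images):
        # Consumes an iterable of images and yields one sample per item.
        for index, _image in enumerate(images):
            yield SimpleNamespace(outputs=SimpleNamespace(sentence=f"line {index}"))


def read_lines(predictor, images):
    """Collect the recognized text for every image in the iterable."""
    return [sample.outputs.sentence for sample in predictor.predict_raw(images)]


# Nested lists stand in for a 2-row image array (assumption for this sketch).
single_image = [[0, 255, 0], [255, 0, 255]]

# Wrapped in a list, the predictor sees exactly one image.
print(read_lines(FakePredictor(), [single_image]))

# A generator is also an iterable and avoids building the full list up front.
print(read_lines(FakePredictor(), (img for img in [single_image, single_image])))
```

Iterating over a bare image walks its rows instead, so the stand-in above would treat each row as a separate "image"; wrapping the array in a list avoids that.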

from calamari.

rajban94 commented on July 18, 2024

I will try it and let you know. But I had already converted the image to a cv image, which is a numpy.ndarray, so why am I getting this error? Is it because of tfaip?

rajban94 commented on July 18, 2024

@andbue I am still getting the same error after passing the image as a numpy.ndarray. Can you tell me what error I am running into?

andbue commented on July 18, 2024

The image was already a numpy.ndarray before; what it needs now is to be wrapped in a list or some other kind of iterator before you pass it to the predictor. Could you post a full example of your code as it looks right now, ideally with all imports and maybe even the out_10.jpg you're using?

rajban94 commented on July 18, 2024

@andbue I am sharing the code I am using for end-to-end prediction. Please let me know where I am going wrong.

import cv2
import numpy as np
import os
import glob
from pdf2image import convert_from_path
import subprocess
import pandas as pd
import re
from calamari_ocr.ocr.predict.predictor import Predictor, PredictorParams

def generateImage(pdfFile, des = './images'):

    if pdfFile.split('.')[-1]=='pdf':
        name = os.path.basename(pdfFile).replace('.pdf','')
        images = convert_from_path(pdfFile,dpi=500,poppler_path = "C:\\Program Files (x86)\\poppler-0.68.0\\bin")
        for i in range(len(images)):
            images[i].save(des+'/'+name+'_page_'+ str(i) +'.jpg', 'JPEG')

def get_calamari_output(cropImg, index):

    predictor = Predictor.from_checkpoint(
        params=PredictorParams(),
        checkpoint='./models/cal_model.ckpt')

    calamari_output = {}
    for sample in predictor.predict_raw([cropImg]):
        inputs, prediction, meta = sample.inputs, sample.outputs, sample.meta

        pred_text = prediction.sentence
        avg_char_probability = 0
        for p in prediction.positions:
            if len(p.chars) > 0:
                avg_char_probability += p.chars[0].probability
        avg_char_probability /= len(prediction.positions) if len(prediction.positions) > 0 else 1
        #print(prediction.avg_char_probability)
        pred_confidence = round(avg_char_probability * 100, 1)
        calamari_output[index] = [pred_text, pred_confidence]
    return calamari_output

def drawBoundBox(imageFile):

    orig_img = cv2.imread(imageFile)
    gray = cv2.cvtColor(orig_img, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray,(11,11),0)
    _, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    kernal = cv2.getStructuringElement(cv2.MORPH_RECT,(11,19))
    dilate = cv2.dilate(thresh,kernal, iterations=9)

    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts)==2 else cnts[1]
    cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[0])

    boxes = []
    for c in cnts:
        x,y,w,h = cv2.boundingRect(c)
        boxes.append([x,y,w,h])
    
    return boxes

def get_crops_dtls(imageFile):
    res = cv2.imread(imageFile)
    boxlist = drawBoundBox(imageFile)
    for idx, box in enumerate(boxlist):
        x,y,w,h = box[0],box[1],box[2],box[3]
        crop = res[y:y+h,x:x+w]
        #cv2.imwrite(dst+"/"+"out_"+str(idx)+'.jpg',crop)
        pred_text_dict = get_calamari_output(crop, idx)
    return pred_text_dict
        


if not os.path.exists('./images'):
    os.makedirs('./images')

files = glob.glob('./invoice/*')
for file in files:
    generateImage(file)

imgs = glob.glob('./images/*.jpg')
for img in imgs:
    predict_data = get_crops_dtls(img)

As suggested, I now use: for sample in predictor.predict_raw([cropImg]), but it still gives the same error as before.

andbue commented on July 18, 2024

Ah, now I get it: put the lines at the bottom inside an if __name__ == "__main__": block. Otherwise the whole subprocess magic of calamari, tfaip and tensorflow is not going to work, which produces the Broken pipe errors.
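
A minimal sketch of that restructuring, under some assumptions: the paths in the posted script suggest Windows, where Python's multiprocessing starts each child by re-importing the main script, so any unguarded top-level loop would run again in every child. The names `find_images`, `main`, and `handle` are hypothetical; in the posted script, `handle` would be something like `get_crops_dtls`.

```python
import glob
import os


def find_images(image_dir, pattern="*.jpg"):
    """Return the matching image paths in a stable order."""
    return sorted(glob.glob(os.path.join(image_dir, pattern)))


def main(image_dir="./images", handle=print):
    """Driver loop; `handle` is a placeholder for the per-image work."""
    for path in find_images(image_dir):
        handle(path)


# Runs only when the file is executed as a script. A child process that
# re-imports this module sees a different __name__ and skips the driver.
if __name__ == "__main__":
    main()
```

Function definitions and imports can stay at module level; only the code that actually starts work needs to move under the guard.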

Further suggestions:

rajban94 commented on July 18, 2024

@andbue thank you so much for your help. It worked for me with
if __name__=="__main__":
But I am facing another issue: if the cropped image contains only one line, the text is extracted correctly, but whenever it contains multiple lines, the output is a blank string. Any suggestions for resolving this without retraining the existing model? Thank you in advance.

andbue commented on July 18, 2024

Glad to hear that it worked for you!

Calamari is, as stated in the "About"-text, a "Line based ATR Engine", so it does not contain any code for image preprocessing, document analysis, or line segmentation. To segment paragraph blocks into lines, have a look at the ocropy segmenter I linked to earlier. A more complex alternative that also performs document layout analysis can be found at https://github.com/qurator-spk/eynollah.
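
For clean, unskewed blocks, a much cruder approach than either tool can sometimes be enough: a horizontal projection profile that splits a binarized block wherever a row contains no ink. The sketch below is not what ocropy or eynollah do; `find_line_bands` and `min_height` are hypothetical names, and the input is assumed to be a binarized image in which ink pixels are non-zero (for example, the inverted Otsu threshold from the script above).

```python
def find_line_bands(binary_rows, min_height=2):
    """Return (top, bottom) row ranges that contain ink; bottom is exclusive."""
    bands = []
    start = None
    for y, row in enumerate(binary_rows):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # a new text line begins
        elif not has_ink and start is not None:
            if y - start >= min_height:  # drop specks thinner than min_height
                bands.append((start, y))
            start = None
    # Close a band that runs to the last row of the block.
    if start is not None and len(binary_rows) - start >= min_height:
        bands.append((start, len(binary_rows)))
    return bands


# Tiny stand-in for a binarized block with two text lines.
block = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(find_line_bands(block))  # [(1, 3), (4, 6)]
```

Each band could then be cropped from the original block and the resulting line images passed together, as one list, to predictor.predict_raw. Skewed scans, touching lines, or multi-column layouts will defeat this heuristic, which is where the dedicated segmenters come in.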
