
Comments (6)

ashpreetbedi commented on September 25, 2024

@sridharaiyer agreed, this is critical. @ysolanky @jacobweiss2305 any takers?

from phidata.

ashpreetbedi commented on September 25, 2024

@sridharaiyer mind sharing a good PDF to test this with?


ysolanky commented on September 25, 2024

@sridharaiyer The PDF Image reader is ready. It would be great if you could share an ideal PDF for your use case so that we can test further before releasing :)


llegomark commented on September 25, 2024

@sridharaiyer Good day, team phidata! Thank you for your prompt response and willingness to address this critical feature request. I appreciate your dedication to continuously improving the library.

To aid in your testing, I would like to share the following PDF files, which contain scanned images of text:

  1. https://www.deped.gov.ph/wp-content/uploads/DO_s2024_005.pdf
  2. https://www.deped.gov.ph/wp-content/uploads/DO_s2024_002.pdf

These PDFs are representative of the types of documents I frequently work with, and having the ability to extract text from images within PDFs would greatly enhance my workflow.

I have been using phidata for my personal projects and have found it to be an invaluable tool. Additionally, I have been following the progress of the library and the insightful video demos led by Sir @ashpreetbedi on Twitter.

Thank you once again for your efforts in making phidata an even more powerful and versatile library. I look forward to testing the PDF Image reader feature once it is released.


ashpreetbedi commented on September 25, 2024

@sridharaiyer The PDFImageReader is now live in v2.4.8

You can import it with: from phi.document.reader.pdf import PDFImageReader
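For anyone trying this out, here is a minimal sketch of how that import might be used. Only the import path and the version number come from this thread. The helper names `phidata_available` and `read_scanned_pdf`, the `read()` call, and the `content` attribute on the returned documents are assumptions to check against the docs once they are added.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import List


def phidata_available() -> bool:
    """Return True if the top-level `phi` package can be found."""
    return find_spec("phi") is not None


def read_scanned_pdf(pdf_path: str) -> List[str]:
    """Return the text of each document extracted from a scanned PDF.

    Hypothetical helper: it assumes PDFImageReader exposes a read() method
    that returns objects with a `content` attribute.
    """
    if not phidata_available():
        raise RuntimeError("phidata is not installed (v2.4.8 or later is needed)")

    # Import path as given in this thread.
    from phi.document.reader.pdf import PDFImageReader

    reader = PDFImageReader()
    documents = reader.read(Path(pdf_path))  # assumed signature
    return [doc.content for doc in documents]
```

Calling `read_scanned_pdf("DO_s2024_005.pdf")` on a local copy of one of the PDFs shared above would be a quick way to see whether the text inside the scanned pages comes through.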

@ysolanky please add some docs to help :)

@llegomark thank you for sharing the PDFs and your help with the product.

We appreciate your help with this so much.


sridharaiyer commented on September 25, 2024

Thanks a lot, team!! The PDF with an image containing text is indeed being read properly, so I am closing this ticket.

However, I have another question regarding the knowledge base strategy. I will open another thread.
Thanks once again!!

