karolzak / boxdetect
BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.
License: MIT License
Shouldn't it also be possible to detect tables and table cells?
What if I wanted to detect all the cells of a table like this one?
As described in #29 it would be great to have some kind of guidance on how to find the right config.
Hi!
I've been having this issue for a while. Aside from an extra warning log while running the package and my unit tests, I always got this warning but no major issues, either locally or in my company's Jenkins job.
But recently my Jenkins job started rejecting those tests, so I tested it by downloading boxdetect and applying the recommendation from the first image, and now everything is working fine!
(The failed test has nothing to do with boxdetect, hehe)
When trying to use boxdetect in an AWS Lambda, this error occurs on deployment:
[ERROR] Runtime.ImportModuleError: Unable to import module 'pd_ocr/handlers/s3_write': libGL.so.1: cannot open shared object file: No such file or directory Traceback (most recent call last):
The error occurs because OpenCV tries to load shared libraries needed for its GUI features, which are not needed in a Lambda.
Proposed solution: boxdetect should depend on opencv-python-headless instead of opencv-python. That way the unneeded GUI artifacts are not included, allowing boxdetect to be used in a Lambda.
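Until the dependency is switched upstream, one possible workaround is to swap the OpenCV build yourself in the deployment image. A sketch of a Lambda container build (the base image and Python version are assumptions, not from the original report; the key step is the headless swap at the end):

```dockerfile
# Sketch: build a Lambda image that never loads libGL.
FROM public.ecr.aws/lambda/python:3.9
COPY requirements.txt .
# Install boxdetect as usual, then replace the GUI-enabled OpenCV build
# with the headless one so the libGL.so.1 import error cannot occur.
RUN pip install -r requirements.txt \
 && pip uninstall -y opencv-python \
 && pip install opencv-python-headless
```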
I'm currently trying to get started with the package.
When I use the configs and follow the steps in the readme, I just get WARNING: No rectangles were found in the input image., which is quite frustrating.
It's not very clear to me what exactly the different configs do, so it would be great to have some more in-depth documentation about that.
It would also be great to have some recommended settings that work for a wide variety of projects (although I guess that's not so easy).
Need to make sure docstrings for all the functions and classes are in place
Hi,
I am actually using this as part of an OCR pipeline I am building for a commercial product. First I want to say thank you for building something that really works and is open source. AWS Textract does checkboxes at 6 cents a page, which is too expensive at the load the project is going for. So thank you for this amazing library!
I was wondering if you had any insight about how to get the right parameters for an image of grainy quality. I have written an OpenCV pipeline that takes PDFs and splits them into images, then rotates them using a deskewing library, applies a crop, and then produces a cropped correctly rotated color image and a correctly rotated BW thresholded image. I am trying to run boxdetect on both color and thresholded images, and facing a few challenges.
I was wondering if you had some general tips on how to ascertain checkboxes only - I am picking up zeros, lowercase N's (especially with Serif fonts), and other things. I rarely get 4 checkboxes which is what there is in the sample image, I sometimes get 3, or 5, or 8, etc. I also confess I don't use the True/False too often, but I love the percentage feature and the cropped matrix of the box as I personally find it very accurate (I notice checked boxes typically are 55% as opposed to 25% black)
Since I have to tally up the number of boxes I find, match them to an unfilled reference document, and find the checkbox based on the percentage of the region, it is most important to avoid missing true positives, but it would also be nice to not have as many false positives.
Here are some parameters I am using, and here is a reference image (3 checkboxes are found; the one that says "en rampant de toitures" is missed). In other images, n's and E's are picked up:
rotated = {
    "w": (25, 50),
    "h": (25, 30),
    "wh": (0.85, 1.15),
    "scale": [1.2, 1.0, 0.9, 0.7],
    "group": (2, 100),
    "iterations": 5,
}
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 129, 27)
The post-rotation images come in a variety of sizes of 11 by 8.5 inch sheets with about +/-200 pixels of white padding, so it is difficult for me to set a width or height range, but the wh ratio is generally easy, and I could stretch them to a defined width and height. I have actually found px_threshold (I sometimes use 0.1 or 0.3) to be very helpful. How is that used exactly? Also, what do you think the recommended kernel is for a 3600 by 2600 image? Any help would be appreciated!
I am creating this issue to help anybody having the same issue with vertically aligned checkboxes not being detected well.
The group_size_range config option gets overwritten to a hardcoded value of (1, 1) at the start of the get_checkboxes pipeline, so setting that option does nothing when using this function.
By default the vertical_max_distance config option is set to 10, which means that if you are trying to detect vertically aligned checkboxes (like in a form), you get really bad results, as the whole column is seen as a single group. I don't know if this is intended or what the use case is; I don't quite understand the grouping logic in the library.
One way to fix it is to set this option to 0 and then find and filter out unwanted close detections with your own logic. Another is to copy the get_checkboxes function without that first hardcoding line (but this might group horizontal checkboxes). I don't understand the difference between vertical and horizontal grouping, but vertical grouping for checkboxes seems a bit faulty.
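The first workaround described above (set vertical_max_distance to 0, then deduplicate yourself) could be sketched like this; the dedupe helper is hypothetical and not part of boxdetect:

```python
def dedupe_rects(rects, min_dist=5):
    """After setting cfg.vertical_max_distance = 0 so boxdetect stops
    merging a whole column into one group, drop near-duplicate
    detections whose top-left corners are within min_dist pixels of an
    already kept box. Hypothetical helper, not part of boxdetect."""
    kept = []
    for (x, y, w, h) in rects:
        if all(abs(x - kx) > min_dist or abs(y - ky) > min_dist
               for (kx, ky, _, _) in kept):
            kept.append((x, y, w, h))
    return kept
```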
Hello maintainers,
I saw there was a commit recently to replace sklearn with scikit-learn in the repository's requirements, as the former is now deprecated. Will there be a new release (1.0.1?) that includes this change?
Thanks,
Tomi
Need to add a check and raise a graceful exception if no rects are found:
TypeError                                 Traceback (most recent call last)
      2
      3 rects, grouping_rects, img, output_image = get_boxes(
----> 4     os.path.join(DATA_PATH, file_name), config=config, plot=False)

/media/shane/HD/anaconda3/envs/nlp/lib/python3.7/site-packages/boxdetect/pipelines.py in get_boxes(img, config, plot)
    107     # merge rectangles into group if overlapping
    108     rects = group_countours(cnts_list)
--> 109     mean_width = np.mean(rects[:, 2])
    110     # mean_height = np.mean(rects[:, 3])
    111     # group rectangles vertically (line by line)

TypeError: tuple indices must be integers or slices, not tuple
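A guard for the empty-result case could look like the sketch below. The function name is mine; in the actual pipeline, rects ends up as an empty tuple when nothing is detected, so `rects[:, 2]` raises the TypeError shown above:

```python
import warnings
import numpy as np

def safe_mean_width(rects):
    """Return the mean box width, or None (with a warning) when no
    rectangles were found, instead of crashing on tuple indexing."""
    if len(rects) == 0:
        warnings.warn("No rectangles were found in the input image.")
        return None
    return float(np.mean(np.asarray(rects)[:, 2]))
```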
I am trying to run a basic demo of the get_boxes method and there seems to be an internal error or maybe I'm missing a new parameter:
from boxdetect import config
from boxdetect.pipelines import get_boxes
import matplotlib.pyplot as plt
config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1,100)
config.horizontal_max_distance_multiplier = 2
image_path = 'input\large.png'
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
print("Indv boxes (green):", rects)
print("Grouped boxes (red): ",grouped_rects)
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()
==========================================
Processing file: input\large.png
Traceback (most recent call last):
File "c:\repos\wwex\boxes.py", line 15, in
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\repos\wwex.venv\Lib\site-packages\boxdetect\pipelines.py", line 147, in get_boxes
cfg.update_num_iterations()
^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'boxdetect.config' has no attribute 'update_num_iterations'. Did you mean: 'dilation_iterations'?
Hi,
Thanks for the amazing work! I am trying to use the config below to detect boxes on my image, but it's not detecting all the boxes. What should I change?
Not sure why it's missing some of the boxes. Any ideas?
from boxdetect import config
cfg = config.PipelinesConfig()
cfg.width_range = (30,500)
cfg.height_range = (40,500)
cfg.scaling_factors = [1.0]
cfg.wh_ratio_range = (0.5,4.0)
cfg.group_size_range = (0,0)
cfg.dilation_iterations = 1
It seems like it's scaling the height (image.shape[0]) with scaling_factor and then passing this value as the desired width into imutils.resize():
boxdetect/boxdetect/pipelines.py, lines 48 to 49 in 37147f0
Proposed fix:
image = imutils.resize(
image, width=int(image.shape[1] * scaling_factor))
I have 2 documents which are almost identical, except one is shrunk a bit and a bit more grainy.
Boxdetect can get all the checkboxes on DocA but only some of the checkboxes on DocB, with this configuration:
# important to adjust these values to match the size of boxes on your image
cfg.width_range = [(30, 70)]
cfg.height_range = [(30, 70)]
# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = [(0.8, 1.2)]
# num of iterations when running dilation transformation (to enhance the image)
cfg.dilation_iterations = [1]
cfg.dilation_kernel = [(1,1)]
When I attempt to capture more checkboxes in DocB with this configuration:
# important to adjust these values to match the size of boxes on your image
cfg.width_range = [(30, 70),(40, 70),(40, 70)]
cfg.height_range = [(30, 70),(40, 70),(40, 70)]
# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = [(0.8, 1.2),(0.8, 1.2),(0.8, 1.2)]
# num of iterations when running dilation transformation (to enhance the image)
cfg.dilation_iterations = [1,2,1]
cfg.dilation_kernel = [(1,1),(2,2),(1,4)]
I'm finding that the checkboxes boxdetect originally captured in DocA are no longer captured, even though the original configuration is at index 0 of the configuration lists. Why is that?
Need unit tests for all the functions
This is not a bug but a request for help.
I am trying to identify the checkboxes in the attached image (clip2.png). The top 4 are identified but the bottom 2 are not. I've tried various dilation and kernel sizes but I haven't been able to successfully get the box. At the same time, I would like to get rid of the peppering to avoid false positives, as there are other docs that have checkmarks that are much smaller.
I've attached the configuration being used as well (boxdetect_cfg.yaml.txt).
Any suggestion will be appreciated.
Hi, amazing module! Very useful. I use it to detect student answers in a research project.
Some of the students make their crosses over the border of the checkbox. Most of the time those square checkboxes are not detected, while many others are detected reliably. I tried many different values in the config. Any suggestions on how to proceed a bit smarter than just guessing values?
I attached some images of undetected and detected ones (the blue marks are made by the program for a manual check; if a checkbox is detected as "checked", the blue square is filled out).
cfg.width_range = (25,42)
cfg.height_range = (25,42)
cfg.scaling_factors = [0.5]
cfg.wh_ratio_range = (0.3, 1.6)
cfg.dilation_iterations = 0
Hi,
I am working with PDF files and I came across the boxdetect library. Thanks for creating this amazing library. I am using the following PDF file:
Form_49A.PDF
I am trying to annotate over the PDF file; to achieve this, I was looking at ways to extract these boxes and then annotate them. Is there a way to extract the coordinates of the boxes present in the PDF file?
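boxdetect works on raster images, so one common approach (not part of boxdetect itself) is to rasterize each PDF page first and convert the detected pixel coordinates back to PDF points. A sketch assuming pdf2image (with poppler) is available and that get_boxes accepts a NumPy image array as input; the function names here are mine:

```python
def px_to_pt(value, dpi):
    """Convert a pixel measurement at the given raster DPI to PDF points
    (1 point = 1/72 inch)."""
    return value * 72.0 / dpi

def extract_box_coordinates(pdf_path, cfg, dpi=200):
    """Rasterize each PDF page, run boxdetect on it, and yield
    (page_index, x, y, w, h) in PDF points. Imports are local because
    pdf2image (plus poppler) is an assumed, optional dependency."""
    import numpy as np
    from pdf2image import convert_from_path
    from boxdetect.pipelines import get_boxes

    for page_idx, page in enumerate(convert_from_path(pdf_path, dpi=dpi)):
        rects, grouped_rects, _, _ = get_boxes(np.array(page), cfg, plot=False)
        for (x, y, w, h) in rects:
            yield (page_idx, px_to_pt(x, dpi), px_to_pt(y, dpi),
                   px_to_pt(w, dpi), px_to_pt(h, dpi))
```

The resulting point coordinates could then be fed to a PDF annotation library to draw over the original file.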
Make it possible to save/load configs to .yaml or .json files