paperspace / dataaugmentationforobjectdetection
Data Augmentation For Object Detection
Home Page: https://blog.paperspace.com/data-augmentation-for-bounding-boxes/
License: MIT License
I am using a sequence of transforms (RandomHSV, RandomScale, RandomShear, RandomRotate), and at the end I get a strange error.
I am not sure what the problem is:
corners = np.hstack((corners, np.ones((corners.shape[0],1), dtype = type(corners[0][0]))))
IndexError: index 0 is out of bounds for axis 0 with size 0
I came across this useful library; however, I noticed some unexpected behaviour when I applied the examples to my images. In the example below, the image is not only rotated but also scaled down, and the bounding box does not seem to be scaled down along with the image.
The problem seems to be particularly prevalent with rotations.
Here is the code for generating both images:
bboxes = np.array([list(details[image_name]['bbox'][0].values())[1:]], dtype=float)
print(bboxes)
img = cv2.imread(data_dir + "images/" + image_name + ".jpg")
plotted_img = draw_rect(img, bboxes)
plt.imshow(plotted_img)
plt.show()
img1, bboxes1 = RandomRotate(180)(img.copy(), bboxes.copy())
plotted_img = draw_rect(img1, bboxes1)
plt.imshow(plotted_img)
plt.show()
Here are the images (hosted on Imgur; not reproduced here).
Could one of the authors explain whether this is the intended effect or a bug (and why), and/or point to a function I could modify to fix it?
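One possible explanation, offered as an assumption rather than a confirmed reading of the library: if RandomRotate rotates the image onto an enlarged canvas so that no corner is cut off, and then resizes that canvas back to the original width and height, the visible content has to shrink for most angles. The scale_factor_x / scale_factor_y division quoted in another report on this page is consistent with that. The helper names below are hypothetical; the sketch only works out the geometry.

```python
import math

def rotated_canvas_size(w, h, angle_deg):
    """Smallest axis-aligned canvas that fully contains a w x h image rotated by angle_deg."""
    theta = math.radians(angle_deg)
    cos, sin = abs(math.cos(theta)), abs(math.sin(theta))
    new_w = h * sin + w * cos
    new_h = h * cos + w * sin
    return new_w, new_h

def shrink_factors(w, h, angle_deg):
    """Factors by which content shrinks when the enlarged canvas is resized back to w x h."""
    new_w, new_h = rotated_canvas_size(w, h, angle_deg)
    return new_w / w, new_h / h

fx, fy = shrink_factors(640, 480, 45)
print(f"x shrinks by a factor of {fx:.3f}, y by a factor of {fy:.3f}")
```

Under this assumption the box coordinates would need to be divided by the same two factors after rotation; if the drawn box looks too large, that division is the first place to check.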
According to the examples in the readme, rotation and shearing are not correctly implemented. Is an update in progress?
Since this project is for rectangular bounding boxes, will there be a new version for polygon bounding boxes?
I am not sure whether I should write it myself or just wait for it.
Some transforms can cut boxes or even delete them, so it is hard to know the label information for the output boxes, because we don't know which boxes were deleted.
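One way to recover which boxes survived is to append each row's original index as an extra column before augmenting and read it back afterwards. This is a sketch under the assumption that the transforms carry any columns beyond the four coordinates along with each row; that has not been verified here for every transform. The drop_small function is a stand-in for a real transform and is not part of the library.

```python
import numpy as np

def tag_with_index(bboxes):
    """Append each row's original index as a trailing column."""
    idx = np.arange(bboxes.shape[0], dtype=bboxes.dtype).reshape(-1, 1)
    return np.hstack((bboxes, idx))

def split_index(tagged):
    """Return (boxes without the tag column, surviving original indices)."""
    return tagged[:, :-1], tagged[:, -1].astype(int)

def drop_small(bboxes, min_side=5.0):
    """Stand-in transform: removes boxes narrower or shorter than min_side."""
    w = bboxes[:, 2] - bboxes[:, 0]
    h = bboxes[:, 3] - bboxes[:, 1]
    return bboxes[(w >= min_side) & (h >= min_side)]

# Columns: x1, y1, x2, y2, class
bboxes = np.array([[10, 10, 50, 60, 0],
                   [20, 20, 22, 80, 1],
                   [5, 5, 40, 40, 2]], dtype=float)
labels = ["cat", "dog", "bird"]

out, kept = split_index(drop_small(tag_with_index(bboxes)))
kept_labels = [labels[i] for i in kept]
print(kept, kept_labels)
```

The same index array can be used to filter any per-box metadata kept outside the array, such as difficulty flags or instance masks.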
Hi, thanks for your nice work! I have a small problem: I have noticed that the rotation, translation and similar methods may leave black regions in the image after augmentation. Won't these black regions harm the performance of our model?
What is the difference between resizing and scaling? Seems like scaling is just a special case of resizing.
Hi, I can't easily see the licence for the project. Is it released under Apache 2.0, perhaps?
IndexError Traceback (most recent call last)
in
34 transforms = Sequence([RandomHorizontalFlip(1), RandomScale((0.8,0.9), diff = True), RandomRotate(10)])
35
---> 36 img, bboxes = transforms(img, bboxes)
37 cv2.imwrite(os.path.join('JPEGImages2',file), img)
38 plt.imshow(draw_rect(img, bboxes))
~/card/DataAugmentationForObjectDetection/data_aug/data_aug.py in __call__(self, images, bboxes)
853
854 if random.random() < prob:
--> 855 images, bboxes = augmentation(images, bboxes)
856 return images, bboxes
~/card/DataAugmentationForObjectDetection/data_aug/data_aug.py in __call__(self, img, bboxes)
456
457
--> 458 corners[:,:8] = rotate_box(corners[:,:8], angle, cx, cy, h, w)
459
460 new_bbox = get_enclosing_box(corners)
~/card/DataAugmentationForObjectDetection/data_aug/bbox_util.py in rotate_box(corners, angle, cx, cy, h, w)
214
215 corners = corners.reshape(-1,2)
--> 216 corners = np.hstack((corners, np.ones((corners.shape[0],1), dtype = type(corners[0][0]))))
217
218 M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
IndexError: index 0 is out of bounds for axis 0 with size 0
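The failing expression is corners[0][0], which raises exactly this IndexError when corners has zero rows. That suggests the box array was already empty when it reached the rotation step, possibly because an earlier transform in the Sequence removed every box; this is a guess, not something the traceback proves. A minimal workaround sketch is to skip any transform once no boxes remain. The names apply_if_boxes and run_sequence, and both stand-in transforms, are hypothetical.

```python
import numpy as np

def apply_if_boxes(transform, img, bboxes):
    """Call transform only when at least one box remains; otherwise pass through unchanged."""
    if bboxes.shape[0] == 0:
        return img, bboxes
    return transform(img, bboxes)

def run_sequence(transforms, img, bboxes):
    """Apply each transform in order, guarding against an empty box array."""
    for t in transforms:
        img, bboxes = apply_if_boxes(t, img, bboxes)
    return img, bboxes

# Stand-in transforms for illustration only (not library code).
def drop_all(img, bboxes):
    return img, bboxes[:0]      # simulates every box being removed

def needs_boxes(img, bboxes):
    _ = bboxes[0][0]            # raises IndexError on an empty array
    return img, bboxes

img = np.zeros((4, 4, 3), dtype=np.uint8)
boxes = np.array([[0.0, 0.0, 1.0, 1.0, 0.0]])
out_img, out_boxes = run_sequence([drop_all, needs_boxes], img, boxes)
print(out_boxes.shape)
```

Whether an image with no remaining boxes should be kept as a negative sample or discarded is a separate training decision.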
In the clip_box function, if x1 < x2 or y1 < y2 does not hold, an area will still be returned even though there is no box.
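To illustrate the concern: an area computed as (x2 - x1) * (y2 - y1) is positive when both differences are negative, so an area check alone cannot detect an inverted box. The sketch below adds an explicit ordering check. It assumes the column layout x1, y1, x2, y2 and uses hypothetical helper names; whether clip_box can actually produce such rows has not been checked here.

```python
import numpy as np

def signed_area(bboxes):
    """Product of width and height, without checking coordinate order."""
    return (bboxes[:, 2] - bboxes[:, 0]) * (bboxes[:, 3] - bboxes[:, 1])

def valid_box_mask(bboxes):
    """True for rows where x1 < x2 and y1 < y2."""
    return (bboxes[:, 0] < bboxes[:, 2]) & (bboxes[:, 1] < bboxes[:, 3])

boxes = np.array([[10.0, 10.0, 40.0, 30.0],    # well-formed
                  [50.0, 50.0, 40.0, 30.0]])   # inverted on both axes

areas = signed_area(boxes)
mask = valid_box_mask(boxes)
clean = boxes[mask]
print(areas, mask)
```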
I am not familiar with Python; however, this approach seems well suited to the type of augmentations I need.
Since it is necessary to use the pickle format, how can I convert my txt file, whose column order is
classes x1 x2 y1 y2
to the pickle format, which needs the following order?
x1 y1 x2 y2 classes
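A minimal conversion sketch, assuming the txt file is whitespace-separated with one box per line and that the classes are numeric IDs (string class names would need a lookup table first). The function name load_txt_boxes is hypothetical. Since the code snippet earlier on this page passes a NumPy array straight to a transform, pickling may be optional, but it is shown here for completeness.

```python
import io
import pickle
import numpy as np

def load_txt_boxes(fileobj):
    """Parse 'class x1 x2 y1 y2' rows into an N x 5 float array ordered x1 y1 x2 y2 class."""
    rows = [[float(v) for v in line.split()] for line in fileobj if line.strip()]
    raw = np.array(rows, dtype=float).reshape(-1, 5)
    # Source columns: 0=class, 1=x1, 2=x2, 3=y1, 4=y2
    return raw[:, [1, 3, 2, 4, 0]]

# An in-memory file stands in for open("boxes.txt") here.
sample = io.StringIO("0 10 50 20 80\n1 5 40 15 60\n")
boxes = load_txt_boxes(sample)

payload = pickle.dumps(boxes)      # use pickle.dump(boxes, f) to write to a file opened in "wb" mode
restored = pickle.loads(payload)
print(restored)
```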
Using RandomRotate, I get the following error:
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter
(typecode 'l') according to the casting rule ''same_kind''
I was able to fix the line:
new_bbox[:,:4] /= [scale_factor_x, scale_factor_y, scale_factor_x, scale_factor_y]
to
np.true_divide(new_bbox[:,:4], [scale_factor_x, scale_factor_y, scale_factor_x, scale_factor_y], out=new_bbox[:,:4], casting='unsafe')
The same problem occurs when using RandomShear, for both ufunc true_divide and ufunc add. However, I am unable to fix it by using np.add and np.true_divide; the resulting bounding boxes are incorrect.
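The error message indicates that the box array has an integer dtype, so NumPy refuses to write a floating-point result back into it in place. Forcing the write with casting='unsafe' truncates the fractional part, which may be why the boxes come out wrong. A less invasive option, sketched here with made-up values, is to convert the boxes to float once before passing them to any transform.

```python
import numpy as np

int_boxes = np.array([[10, 20, 110, 220, 0]])   # integer dtype inferred from the literals
scale = [1.5, 1.5, 1.5, 1.5]

# In-place true division on an integer array is rejected by NumPy.
try:
    int_boxes[:, :4] /= scale
    raised = False
except TypeError:
    raised = True

# Forcing the cast works but silently truncates toward zero.
unsafe = int_boxes.copy()
np.true_divide(unsafe[:, :4], scale, out=unsafe[:, :4], casting="unsafe")

# Converting to float up front keeps full precision.
float_boxes = int_boxes.astype(np.float64)
float_boxes[:, :4] /= scale

print(raised, unsafe[0, :4], float_boxes[0, :4])
```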
Will I get a bounding box for each augmentation separately?
For example, https://github.com/Paperspace/DataAugmentationForObjectDetection/blob/master/data_aug/data_aug.py#L30
This should be corrected to Transformed
Hi, firstly, thanks for this fantastic repo!
In your code, when you use OpenCV to load an image, you change its color channel order with [:,:,::-1], right?
When I run your code and use OpenCV's imwrite to save the result, the output looks weird. I think it is a channel problem. Do I need to convert RGB back to BGR before I save my image?
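cv2.imwrite interprets a three-channel array as BGR, so an image that was flipped to RGB after loading needs to be flipped back before saving. A minimal sketch of the flip using only NumPy (swap_rb is a hypothetical helper name, and the cv2.imwrite call is left as a comment so the example has no OpenCV dependency):

```python
import numpy as np

def swap_rb(img):
    """Reverse the channel axis (RGB <-> BGR) and return a contiguous copy."""
    return np.ascontiguousarray(img[:, :, ::-1])

rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[..., 0] = 255          # pure red in RGB order

bgr = swap_rb(rgb)         # red now sits in the last channel, as OpenCV expects
# cv2.imwrite("out.jpg", bgr)

print(bgr[0, 0])
```

The contiguous copy is a precaution: some OpenCV functions reject or mishandle the negative-stride view that [:, :, ::-1] produces.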
I ran the Jupyter demo of the project and tried to change the brightness channel, but I can't get the correct result.
The RandomHSV() function uses img += hsv (RGB channels += hsv) to get the result, but is that correct? Shouldn't the image be converted from RGB to HSV first, with img += hsv applied afterwards?
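To make the difference concrete, here is a single-pixel sketch using the standard-library colorsys module rather than OpenCV. It assumes, as the question describes, that the offsets are otherwise added while the image is still in RGB; that has not been verified against the library source. Both function names are hypothetical.

```python
import colorsys

def brighten_in_hsv(rgb, delta_v):
    """Raise V by delta_v (0..1 scale) in HSV space; rgb is a tuple of 0..255 ints."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v = min(1.0, max(0.0, v + delta_v))
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))

def add_in_rgb(rgb, delta):
    """Add the same offset to every RGB channel, clamped to 0..255."""
    return tuple(min(255, max(0, c + delta)) for c in rgb)

pixel = (200, 100, 50)
hsv_result = brighten_in_hsv(pixel, 0.2)   # hue and saturation are preserved
rgb_result = add_in_rgb(pixel, 26)         # channel ratios change, so saturation drops

print(hsv_result, rgb_result)
```

For whole images, the equivalent approach would be to convert with cv2.cvtColor to HSV, adjust the channels, clip to the valid range, and convert back.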
This is a great initiative for object detection use cases. I am also struggling with augmentation issues for my problem.
I would love to contribute to this repository. How do I get started?
Is it possible to add random copy-paste support in the future?