Coder Social home page Coder Social logo

resize_dataset_pascalvoc's Introduction

Resize-pascal-voc

It's a package for resizing ALL your dataset images (into local folder) and changing the pascal_VOC coodinates in your .xml files.

Index

How it works?

  • Walk by paths and searching by .jpg and .xml files
  • Resize the image and change the XY coordinates of each object in xml file.
  • Save the new files into output path. Folder structure example for more details
  • If save_box_images = 1, draw the boundies box in the resized image and save it in an other file(output_path/boxes_images/))

Warnings

1º: Don't worry if you have a big folder structure with many nested folders, the package will walk recursively in your dataset folder and recreate the same structure into output_path.

2º: The .jpg and .xml files must be in the same folder

Install requirements

pip install -r requirements.txt

Usage

python3 main.py -p <IMAGES_&_XML_PATH> --output <IMAGES_&_XML> --new_x <NEW_X_SIZE> --new_y <NEW_X_SIZE> --save_box_images <FLAG>"

Example

python3 main.py -p /home/italojs/Pictures/dataset --output ./output --new_x 150 --new_y 150 --save_box_images 1

Folder structure example

Look this randon folder structure with ALL my dataset

dataset
├── IMG_20181109_165212.jpg
├── IMG_20181109_165212.xml
├── IMG_20181109_165213.jpg
├── IMG_20181109_165213.xml
├── test
│   ├── IMG_20181109_163524.jpg
│   ├── IMG_20181109_163524.xml
│   ├── IMG_20181109_163525.jpg
│   └── IMG_20181109_163525.xml
├── train
│   ├── class1
│   │   ├── IMG_20181109_162519.jpg
│   │   ├── IMG_20181109_162519.xml
│   │   ├── IMG_20181109_162523.jpg
│   │   └── IMG_20181109_162523.xml
│   ├── class2
│   │   ├── IMG_20181109_162814.jpg
│   │   ├── IMG_20181109_162814.xml
│   │   ├── IMG_20181109_162818.jpg
│   │   └── IMG_20181109_162818.xml
│   ├── IMG_20181109_163315.jpg
│   ├── IMG_20181109_163315.xml
│   ├── IMG_20181109_163316.jpg
│   └── IMG_20181109_163316.xml
└── validation
    ├── IMG_20181109_164824.jpg
    ├── IMG_20181109_164824.xml
    ├── IMG_20181109_164825.jpg
    └── IMG_20181109_164825.xml

After executed the package with:

python3 main.py -p /home/italojs/Pictures/dataset --output ./output --new_x 150 --new_y 150 --save_box_images 1

The package will resize the images, rewrite the xml files and create the same folder structure into output path with new images and xml files.

output
├── boxes_images
│   ├── boxed_IMG_20181109_165212.jpg
│   └── boxed_IMG_20181109_165213.jpg
├── IMG_20181109_165212_new.jpg
├── IMG_20181109_165212_new.xml
├── IMG_20181109_165213_new.jpg
├── IMG_20181109_165213_new.xml
├── test
│   ├── IMG_20181109_163524_new.jpg
│   ├── IMG_20181109_163524_new.xml
│   ├── IMG_20181109_163525_new.jpg
│   └── IMG_20181109_163525_new.xml
├── train
│   ├── class1
│   │   ├── IMG_20181109_162519_new.jpg
│   │   ├── IMG_20181109_162519_new.xml
│   │   ├── IMG_20181109_162523_new.jpg
│   │   └── IMG_20181109_162523_new.xml
│   ├── class2
│   │   ├── IMG_20181109_162814_new.jpg
│   │   ├── IMG_20181109_162814_new.xml
│   │   ├── IMG_20181109_162818_new.jpg
│   │   └── IMG_20181109_162818_new.xml
│   ├── IMG_20181109_163315_new.jpg
│   ├── IMG_20181109_163315_new.xml
│   ├── IMG_20181109_163316_new.jpg
│   └── IMG_20181109_163316_new.xml
└── validation
    ├── IMG_20181109_164824_new.jpg
    ├── IMG_20181109_164824_new.xml
    ├── IMG_20181109_164825_new.jpg
    └── IMG_20181109_164825_new.xml

Results

Original Image

Here wee have the original image (650x590) an your xml:

original_image

<annotation>
  <folder>imagens</folder>
  <filename>IMG_20181109_165212</filename>
  <path>[IMAGE_PATH]\IMG_20181109_165212.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>650</width>
    <height>590</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>74</xmin>
      <ymin>192</ymin>
      <xmax>267</xmax>
      <ymax>499</ymax>
    </bndbox>
  </object>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>297</xmin>
      <ymin>168</ymin>
      <xmax>483</xmax>
      <ymax>478</ymax>
    </bndbox>
  </object>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>310</xmin>
      <ymin>1</ymin>
      <xmax>499</xmax>
      <ymax>103</ymax>
    </bndbox>
  </object>
</annotation>

Resized image

Here is the resized image (150x150) and new xml:

resized_image

<annotation>
  <folder>imagens</folder>
  <filename>IMG_20181109_165212</filename>
  <path>C:\Users\pc-casa\Music\imagens\IMG_20181109_165212.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>650</width>
    <height>590</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>17</xmin>
      <ymin>49</ymin>
      <xmax>62</xmax>
      <ymax>127</ymax>
    </bndbox>
  </object>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>69</xmin>
      <ymin>43</ymin>
      <xmax>111</xmax>
      <ymax>122</ymax>
    </bndbox>
  </object>
  <object>
    <name>CNH</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>72</xmin>
      <ymin>0</ymin>
      <xmax>115</xmax>
      <ymax>26</ymax>
    </bndbox>
  </object>
</annotation>

Boundies box image

Here is a resized image (150x150), but with the boundies box drawed:

box_image

The Calculus Behind the Scenes:

For an original image with 650x590 to be resized by 150x150

  • scaleX = 150/650 = 3/13

  • scaleY = 150/590 = 15/59

  • Original:

$$x_{min} = 74$$

$$y_{min} = 192$$

$$x_{max} = 267$$

$$y_{max} = 499$$

  • Scalling:

$$x_{min} = 74\times 3/13 = 17.08 \approx 17$$

$$y_{min} = 192\times 15/59 = 48.81\approx 49$$

$$x_{max} = 267\times 3/13 = 61.62 \approx 62$$

$$y_{max} = 499\times 15/59 = 126.86\approx 127$$

The same calculation is used to calculate the coordinates of the bounding boxes.

Parameters

To know more about parameters use python main.py -h:

usage: main.py [-h] -p DATASET_PATH -o OUTPUT_PATH -x X -y Y [-m MODE] [-s SAVE_BOX_IMAGES]
Parameter Description
-h, --help Show this help message and exit
-p --path DATASET_PATH Path to dataset data ?(image and annotations).
-o --output OUTPUT_PATH Path that will be saved the resized dataset
-x --new_x X The new x images size / scale / percentage / target
-y --new_y Y The new y images size / scale / percentage / target
-m --mode MODE The resize mode: size, scale, percentage or target
-s --save_box_images SAVE_BOX_IMAGES If True, it will save the resized image and a drawn image with the boxes in the images

resize_dataset_pascalvoc's People

Contributors

brenoav avatar darth-vader-lg avatar haofanwang avatar hoonkai avatar italojs avatar jakecowton avatar sarmientoj24 avatar zhelyo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

resize_dataset_pascalvoc's Issues

Image.py resize function contains bug.

I resized images and annotations using the code, but seems like there is a problem with the code.
The bounding box goes out of the box or sometimes the bounding box looks flipped horizontally.

something is wrong here:

 **scale_x = newSize[0] / image.shape[1]
 scale_y = newSize[1] / image.shape[0]**

image = cv2.resize(image, (newSize[0], newSize[1]))

newBoxes = []
xmlRoot = ET.parse(xml_path).getroot()
xmlRoot.find('filename').text = image_path.split('/')[-1]
size_node = xmlRoot.find('size')
size_node.find('width').text = str(newSize[0])
size_node.find('height').text = str(newSize[1])

for member in xmlRoot.findall('object'):
    bndbox = member.find('bndbox')

    xmin = bndbox.find('xmin')
    ymin = bndbox.find('ymin')
    xmax = bndbox.find('xmax')
    ymax = bndbox.find('ymax')
    
    xmin.text = str(np.round(int(float(xmin.text)) * scale_x))
    ymin.text = str(np.round(int(float(ymin.text)) * scale_y))
    xmax.text = str(np.round(int(float(xmax.text)) * scale_x))
    ymax.text = str(np.round(int(float(ymax.text)) * scale_y))

    newBoxes.append([
        1,
        0,
        int(float(xmin.text)),
        int(float(ymin.text)),
        int(float(xmax.text)),
        int(float(ymax.text))
        ])

boxed_10018_00004277

Is the aspect ratio preserved?

I am interested in using this code to resize a dataset into dimensions where both width and height are multiples of 16. For example 320x480 or 400x160 (both width and height are evenly divisible by 16). Is it possible to do this using this package? Will the aspect ratio of the images be preserved, and if so how is padding/border handled?

Thanks in advance for your response and any suggestions or comments.

Combine two images in one and two xml into one

Hi,
I have converted two images and two xmls to 416 x 208 size.
Now I want to combine two images into one , so it will be one image with size 416 x 416 and I also want to merge their bounding box values from two xml files into one new xml file. How I can do this task?

Waiting for your reply. Thank you very much.

save_box_images does not seem to work recursively

python3 main.py -p /home/<path_to_dataset>/ --output ./output --new_x 400 --new_y 300 --save_box_images 1 produces boxes_images only in the upper most directory (output folder in this case) and does not produce boxed images for other directories recursively in dataset folder.
Adding this line:
if args.save_box_images:
create_path(''.join([output_path, '/boxes_images']))
right after the line - create_path(output_path) in the for loop in main.py fixed the issue.

Let me know if i am doing something wrong. Thanks for making this useful tool :-)

Example in the README.md

Hi, @italojs.

On the README.md there's confusion with the example. Check the resized image, it's 150x150 and not 200x200.

For 150x150:

  • ScaleX = 150/650 = 3/13
  • ScaleY = 150/590 = 15/59

Original BBOX ORIGINAL:

$x_{min} = 74$
$y_{min} = 192$
$x_{max} = 267$
$y_{max} = 499$

Scaling:

$x_{min} = 74 \times 3/13 = 17.08$
$y_{min} = 192 \times 15/59 = 48.81$
$x_{max} = 267 \times 3/13 = 61.62$
$y_{max} = 499 \times 15/59 = 126.86$

As you can see in your example, the values matched.

Resize number is Greater than the original Shape

What if the size of the image is less than the new size? Then we are gonna get bounding box coordinate outside of our image frame. What could be the solution in that case? Do we need to do the reverse of it like scale_x=image_size[1]/target_size?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.