Data-Preparation

All the scripts related to data preparation.

                                        ********** rename.py *************

This script enforces uniform annotation labels in the XML files of a dataset annotated with the LabelImg tool.

For example: if person X annotated a car as "CAR" and person Y annotated a car as "cAR", this script converts all such annotations to "car", regardless of who created them.

To run this script: place rename.py in the folder that contains the images and annotations together, set the path variable "arr" to that directory, and run it.
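
The label normalization described above can be sketched roughly as follows. This is not the code in rename.py; it is a minimal illustration that assumes the annotations are Pascal VOC style XML files (the format LabelImg writes by default), with each label stored in an `<object><name>` element, and that "uniform" means lowercase. The function name `normalize_labels` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def normalize_labels(directory):
    """Lowercase every <object><name> label in the XML files of `directory`.

    Assumes Pascal VOC style annotations. Returns the number of files
    that were rewritten.
    """
    changed = 0
    for xml_path in sorted(Path(directory).glob("*.xml")):
        tree = ET.parse(str(xml_path))
        modified = False
        for name in tree.getroot().findall("object/name"):
            if name.text is None:
                continue
            normalized = name.text.strip().lower()
            if normalized != name.text:
                name.text = normalized
                modified = True
        # Only rewrite files whose labels actually changed.
        if modified:
            tree.write(str(xml_path))
            changed += 1
    return changed
```

Calling `normalize_labels("path/to/folder")` rewrites only the files that contained a non-lowercase label and leaves the rest untouched.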

                                     ********** duplicate_remover.py ***********

This script removes unannotated images from a dataset annotated with the LabelImg or LabelMe tool. Some images contain no objects to annotate, so they have no corresponding .xml file, and such images may adversely affect training. To prevent that, this script removes all unannotated images from the directory.

For example: a folder contains [ image1.jpg , image1.xml , image2.jpg , image3.jpg , image3.xml ]. Here image2 is unannotated, so it has no corresponding .xml file. After running this script the directory contains [ image1.jpg , image1.xml , image3.jpg , image3.xml ]; image2.jpg has been removed.

To run this script: image_directory stores the current directory of this file's location. To run the script on a specific folder, change 'day more' to the folder name and run the script.
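
As a rough illustration of the idea (not the actual contents of duplicate_remover.py), the sketch below deletes every image that has no .xml file with the same base name. The function name `remove_unannotated`, the `dry_run` flag, and the set of image extensions are assumptions made for this example.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def remove_unannotated(directory, dry_run=True):
    """Find images in `directory` that have no matching .xml annotation.

    With dry_run=True nothing is deleted; the function only reports
    which files would be removed. Returns the list of those file names.
    """
    removed = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # An image counts as annotated when <stem>.xml sits next to it.
        if not path.with_suffix(".xml").exists():
            removed.append(path.name)
            if not dry_run:
                path.unlink()
    return removed
```

Running it first with `dry_run=True` lists the candidates without touching the folder, which is a cheap safeguard because deletion cannot be undone.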

                                      ************* file_renamer.py **************

This script renames the files and numbers them sequentially so that the dataset is clean and organized for training.

For example: a folder contains [ image1.jpg , image1.xml , image3.jpg , image3.xml , image11.jpg , image11.xml ]. Here image2 and image4 through image10 are missing or were removed by the duplicate_remover.py script, so the numbering has gaps. After running file_renamer.py on this folder, its content is [ image_1.jpg , image_1.xml , image_2.jpg , image_2.xml , image_3.jpg , image_3.xml ]. All files are now renamed in ascending order: image3.jpg becomes image_2.jpg, image11.jpg becomes image_3.jpg, and so on.

To run this script: image_directory stores the current directory of this file's location. Change 'day more' to the name of the folder the script should be applied to, set the count_n variable to the number from which counting should start, and run the script.
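
The renumbering step can be sketched along these lines. This is an illustrative version under stated assumptions, not the code in file_renamer.py: the function names `natural_key` and `renumber`, the `prefix` and `suffixes` parameters, and the two-pass rename through temporary names are all choices made for this example. The `start` parameter plays the role that count_n plays in the script.

```python
import re
from pathlib import Path


def natural_key(stem):
    """Sort key that orders 'image3' before 'image11'."""
    return [int(p) if p.isdigit() else p.lower() for p in re.split(r"(\d+)", stem)]


def renumber(directory, prefix="image_", start=1,
             suffixes=(".jpg", ".jpeg", ".png", ".xml")):
    """Rename image/annotation pairs to <prefix><n><suffix>, n = start, start+1, ...

    Files sharing a base name (e.g. image3.jpg and image3.xml) get the
    same number, so each image stays paired with its annotation.
    """
    directory = Path(directory)
    groups = {}
    for path in directory.iterdir():
        if path.is_file() and path.suffix.lower() in suffixes:
            groups.setdefault(path.stem, []).append(path)

    # First pass: move everything to temporary names so that a new name
    # can never collide with a file that has not been renamed yet.
    staged = []
    for offset, stem in enumerate(sorted(groups, key=natural_key)):
        for path in groups[stem]:
            tmp = path.with_name(f".tmp_{offset}{path.suffix}")
            path.rename(tmp)
            staged.append((tmp, directory / f"{prefix}{start + offset}{path.suffix}"))

    # Second pass: move the temporary files to their final names.
    for tmp, final in staged:
        tmp.rename(final)
    return sorted(final.name for _, final in staged)
```

The natural sort key matters here: a plain string sort would place image11 before image3 and scramble the original order.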

                                    ************* Order to Run the Scripts **************
                                    
                                       1. Run rename.py on the directory
                                       2. Then run duplicate_remover.py on the directory
                                       3. Then run file_renamer.py on the directory
