
ImageFlow

Welcome to the ImageFlow wiki!

Overview

ImageFlow is a simple program written in Python to compute the optical flow between two images. We are using it to facilitate our research.

The images are obtained from the Unreal Engine and the AirSim package. Each image should be associated with a camera pose and a depth map. The user specifies the input information through a JSON file. The outputs will be written to folders which are arranged in a pattern similar to the filenames of the input images.

Example usage

The input data consists of several files and folders. The layout was designed to fit our research needs, so it may look a little complex at first.

The user needs:

  • A JSON file as the top-level input (File A)
  • A JSON file that holds the "names" of the images as a list (File B)
  • A camera pose file in NumPy binary format (.npy) (File C)
  • A folder containing the images (Folder A)
  • A folder containing the depth information (Folder B)

Input JSON (File A)

Inputs are fed into ImageFlow through a JSON file. A sample input file named "IFInput.json" can be found at the root of this package. The content of IFInput.json is straightforward; the fields are defined as follows:

  • dataDir: A string specifying the folder that holds the data.
  • startingIdx: An integer giving the starting index into the list recorded in File B.
  • idxStep: An integer giving the step size used to move through the list recorded in File B.
  • idxNumberRequest: An integer giving the requested number of steps to perform. The user can specify an unreasonably large number to make the program process all the images starting from "startingIdx" with a step size of "idxStep".
  • poseFilename: A string giving the filename of File B.
  • poseName: A string giving the variable name of the list recorded in File B. File B contains only one list, and poseName is the name of that list.
  • poseData: A string giving the filename of File C.
  • outDir: A string giving the output directory.
  • imageDir: A string giving the image directory, Folder A.
  • imageSuffix: A string giving the filename suffix of an input image file.
  • imageExt: A string giving the filename extension of an image file, including the leading period.
  • depthDir: A string giving the directory that contains all the depth files, Folder B.
  • depthSuffix: A string giving the suffix in the filenames of the depth files. The depth files are generated and named after each image, so every depth file carries the same suffix.
  • depthExt: A string giving the filename extension of a depth file, including the leading period.
  • camera: A dictionary-like parameter containing the focal length and the image size. All values are in pixels.
  • imageMagnitudeFactor: A floating-point magnification factor for the output optical flow image. Sometimes the camera poses are so similar to each other that the optical flow image looks very dark; this parameter boosts the brightness of that image. The raw result data written to the file system is NOT scaled. When the user passes --mf on the command line, imageMagnitudeFactor is overwritten.
  • iamgeWaitTimeMS: An integer specifying how long, in milliseconds, the program waits while showing the resulting optical flow image on screen.
  • distanceRange: An integer giving a threshold distance from the camera. It is used when the user wants ImageFlow to generate 3D point clouds for detailed inspection. The point clouds are written in the PLY format, which can be imported by software such as MeshLab. Point clouds are only generated when --debug is specified on the command line.
  • flagDegree: True or False. The pixel movement is computed as a moving direction and a magnitude. For the direction, the user can choose either degrees or radians as the unit for the saved result and the optical flow image. Specify True for degrees.
  • warpErrorThreshold: The threshold for evaluating the warp error, i.e., the RGB difference between the pixels of the second image and the warped first image, averaged over all valid warped pixels. A difference above this threshold is considered 'over threshold' and is reported at the end of execution.
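A minimal File A might look like the following sketch. The key names are taken verbatim from the list above; every value, as well as the key names inside "camera", is a hypothetical placeholder rather than something prescribed by ImageFlow:

```json
{
    "dataDir": "/home/user/airsim_data",
    "startingIdx": 0,
    "idxStep": 1,
    "idxNumberRequest": 100000,
    "poseFilename": "ImageNames.json",
    "poseName": "imgNames",
    "poseData": "CameraPoses.npy",
    "outDir": "Output",
    "imageDir": "Images",
    "imageSuffix": "_rgb",
    "imageExt": ".png",
    "depthDir": "Depths",
    "depthSuffix": "_depth",
    "depthExt": ".npy",
    "camera": { "focal": 320, "imageSize": [480, 640] },
    "imageMagnitudeFactor": 1.0,
    "iamgeWaitTimeMS": 1000,
    "distanceRange": 100,
    "flagDegree": true,
    "warpErrorThreshold": 10.0
}
```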

Image name file (File B)

The user needs a JSON file that holds the "names" of the images as a list. This JSON file contains only one parameter, whose identifier is specified by "poseName" in File A. The value of the parameter must be a list of strings: the actual names of the input images. A sample File B can be found in the sample folder.
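For example, assuming "poseName" in File A is set to the hypothetical name "imgNames", File B might look like this:

```json
{
    "imgNames": ["000000", "000001", "000002"]
}
```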

Camera pose file (File C)

The camera pose file is a NumPy binary file that contains one camera pose per input image. Loading it with numpy.load() yields a 2D NumPy array in which every row is a pose entry. A pose entry consists of two components: the 3D position vector of the camera and the orientation of the camera. Each entry has 7 columns: the first three are the position vector, and the remaining four represent a quaternion which in turn represents a rotation. The last element of the quaternion is the scalar part.

The user is responsible for keeping the correspondence between the rows in the NumPy array and the list defined in File B. To be specific, the order of the names listed in File B must match the row order in File C.
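The snippet below sketches how such a pose file could be produced with NumPy. The filename and pose values are hypothetical; only the 7-column layout (position first, quaternion scalar last) comes from the description above:

```python
import os
import tempfile

import numpy as np

# Hypothetical sketch: a pose file for two images. Each row is
# [x, y, z, qx, qy, qz, qw]: position first, quaternion scalar last.
poses = np.array([
    [0.0, 0.0, 0.0,  0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0,  0.0, 0.0, 0.0, 1.0],
])

path = os.path.join(tempfile.gettempdir(), "CameraPoses.npy")
np.save(path, poses)

loaded = np.load(path)
print(loaded.shape)  # (2, 7): one 7-column pose row per image name in File B
```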

Depth files

Each depth file is a binary NumPy file generated by numpy.save(). Once loaded with numpy.load(), the array has the same size as the associated image but only one channel. The plane depth produced by the Unreal Engine should be used here.
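A minimal round trip for one depth file might look like the following. The filename, resolution, and constant depth value are illustrative assumptions:

```python
import os
import tempfile

import numpy as np

# Hypothetical sketch: the array must match the image resolution
# (here 480x640) with a single channel.
depth = np.full((480, 640), 10.0, dtype=np.float32)  # plane depth, assumed meters

path = os.path.join(tempfile.gettempdir(), "000000_depth.npy")  # name is illustrative
np.save(path, depth)

loaded = np.load(path)
print(loaded.shape, loaded.dtype)  # (480, 640) float32
```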

Folders

All the folders specified in File A, except "dataDir", are relative paths. They are relative to "dataDir".

Command line arguments

The user can invoke ImageFlow.py with command line arguments. There are four command line arguments.

  • --input <input_filename>: The filename of File A, with its full or relative path. If this argument is not specified, a default input file named "IFInput.json" is used.
  • --mf <magnitude_factor>: A runtime magnification factor that overwrites "imageMagnitudeFactor" in File A.
  • --debug: Makes ImageFlow output all debugging data, including the 3D point clouds, into "outDir".
  • --np: The number of threads deployed to carry out the whole process.

Outputs

There are some text outputs on the terminal, mainly for user reference. PLEASE NOTE: The t0 vector and the R0 matrix, as well as t1 and R1, are NOT the values originally stored in File C. These output values are the converted translation vector and rotation matrix for the coordinate transformation from the world coordinate system to the camera reference frame. For example, let {xw} be a point in the world coordinate system and {xc} its coordinate with respect to the camera frame. Then {xc} = [R] {xw} + {t}.
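The conversion from a File C pose row to the printed (R, t) pair can be sketched as follows. This is a hedged reconstruction, not ImageFlow's actual code: it assumes the stored quaternion is the camera-to-world rotation, and the function names are hypothetical:

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion [x, y, z, w] (scalar last) to a 3x3 rotation matrix."""
    x, y, z, w = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def world_to_camera(pose_row):
    """Turn one 7-column pose row into a world-to-camera (R, t) pair.

    pose_row[:3] is the camera position in the world frame; pose_row[3:]
    is the camera orientation, assumed to be the camera-to-world rotation.
    The returned pair satisfies {xc} = R @ {xw} + t.
    """
    p = pose_row[:3]
    R_cw = quat_to_rotmat(pose_row[3:])
    R = R_cw.T            # world-to-camera rotation
    t = -R_cw.T @ p       # translation expressed in the camera frame
    return R, t

# Identity orientation: the camera frame is the world frame shifted by -p.
R, t = world_to_camera(np.array([1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 1.0]))
print(t)  # [-1. -2. -3.]
```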

The other output text on the terminal is for progress tracking and final summary.

After a normal run of ImageFlow without the --debug command line argument, some result files are written to the file system. Optical flow is calculated between two images, taking the first image as the reference. The pixel movement is described by its moving angle and moving distance (measured in pixels). All of the above data are written into different files in the folder <dataDir>/<outDir>/ImageFlow:

  • a.dat: The moving angle of each pixel, in degrees or radians as specified by "flagDegree" in File A.
  • d.dat: The moving magnitude of each pixel.
  • u.dat: The x-axis pixel location of a pixel in the first image observed in the second image.
  • v.dat: The y-axis (downwards) pixel location of a pixel in the first image observed in the second image.
  • du.dat: The x-axis change from the first image with respect to u.dat.
  • dv.dat: The y-axis change from the first image with respect to v.dat.
  • ad.npy: The data from a.dat and d.dat combined into a 2-channel matrix.
  • dudy.dat: The data from du.dat and dv.dat combined into a 2-channel matrix.
  • bgr.jpg: A color image visualizing the optical flow.
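Reading the combined result back might look like the sketch below. The (H, W, 2) layout with angle in the first channel is an assumption based on the descriptions above, and the array here is a stand-in written only so the example is self-contained:

```python
import os
import tempfile

import numpy as np

# Stand-in for a real ad.npy, assumed to be an (H, W, 2) array
# holding (angle, magnitude) per pixel.
path = os.path.join(tempfile.gettempdir(), "ad.npy")
np.save(path, np.zeros((480, 640, 2), dtype=np.float32))

ad = np.load(path)
angle, magnitude = ad[..., 0], ad[..., 1]
print(angle.shape, magnitude.shape)  # (480, 640) (480, 640)
```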

The warped images are written back to <dataDir>/<imageDir> for easy comparison with the original input images. Along with the warped images, there are additional files containing the warp error evaluation. Since these output files go to the image directory, this may lead to problems if the user wants to run ImageFlow.py multiple times with the same input images. Please use RemoveOutputs.py to delete all results from the input image directory. Refer to the code or issue python RemoveOutputs.py --help for usage of the script.

Some 3D point clouds are written if the --debug command line argument is specified. These point clouds are intended for debugging purposes; however, the user is welcome to use them anyway. The files are defined as follows:

  • XInCam_0.ply: What the object looks like in the reference frame of camera 0 (the first camera).
  • XInCam_1.ply: What the object looks like in the reference frame of camera 1 (the second camera).
  • XInWorld_0.ply: What the object looks like in the world frame from the perspective of camera 0 (the first camera).
  • XInWorld_1.ply: What the same object looks like in the world frame from the perspective of camera 1 (the second camera).

For all the point clouds, the vertex color is defined by the distance from the camera center: the redder the farther, the bluer the nearer. "distanceRange" in File A is a threshold (in meters); any point beyond this distance is omitted from the PLY file.

Contributors

yyhulive, huyaoyu, yaoyuh-cmu
