
NYUd2 Toolkit

Here we provide simple pre-processing tools for the NYUd v2 dataset, since the dataset's authors only provide the raw data dumped from the Kinect. To apply monocular depth estimation to NYUd v2, we must generate the RGB images and dense depth maps ourselves; the procedure is as follows.

Requirements

This code has been tested on Ubuntu 16.04 LTS with MATLAB 2015b and Python 2.7.

Dataset preparation

  1. Download the raw data of the NYUd v2 dataset from the official project page (https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html). It is more than 400 GB, so make sure you have enough disk space available. Extract it into the directory nyud_raw_data. From the same page, also download the Toolbox and extract it.

  2. The dataset is divided into 590 folders, one per recorded scene, such as living_room_0012. The files are structured as follows:

/
../bedroom_0001/
../bedroom_0001/a-1294886363.011060-3164794231.dump
../bedroom_0001/a-1294886363.016801-3164794231.dump
                  ...
../bedroom_0001/d-1294886362.665769-3143255701.pgm
../bedroom_0001/d-1294886362.793814-3151264321.pgm
                  ...
../bedroom_0001/r-1294886362.238178-3118787619.ppm
../bedroom_0001/r-1294886362.814111-3152792506.ppm
  • Files that begin with the prefix a- are accelerometer dumps. Files that begin with the prefixes r- and d- are frames from the RGB and depth cameras, respectively. You can use the get_synched_frames.m function in the Toolbox to match each RGB image with its corresponding depth map; a minimal sketch follows.
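
A minimal sketch of frame matching, assuming the Toolbox has been extracted next to the data (the struct field names below are recalled from get_synched_frames.m and may differ slightly across Toolbox versions):

addpath('toolbox');                        % path to the extracted Toolbox
sceneDir = 'nyud_raw_data/bedroom_0001';   % any scene folder
frameList = get_synched_frames(sceneDir);  % pairs each RGB frame with its nearest depth frame
fprintf('%s <-> %s\n', frameList(1).rawRgbFilename, frameList(1).rawDepthFilename);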

Generate the RGB images and dense depth maps

  1. Put the script process_raw.m and the Toolbox into the directory nyud_raw_data mentioned above.

  2. Modify savePath and stride: savePath is the output directory, and stride controls how many frames are saved. The default stride of 1 saves every image (see the sketch after this list).

  3. Because processing takes a long time, open MATLAB inside tmux and run the script process_raw.m.
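
For reference, the configuration at the top of process_raw.m might look like the following sketch (the variable names come from step 2 above; the exact lines in the script may differ):

savePath = '/path/to/output';   % directory where RGB images and depth maps are written
stride   = 1;                   % save every frame; e.g. stride = 10 keeps every 10th frame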

Sample results are an RGB image with a resolution of 480 × 640 and a dense depth map at the same resolution (sample images omitted here).

  • Tip: For better training, I save the dense depth maps in 16-bit format, so depth values (assumed to be at most 10 m) map to the range 0–65535:
imgDepth = imgDepth / 10.0 * 65535.0;                        % scale metric depth (max 10 m) to [0, 65535]
imgDepth = uint16(imgDepth);                                 % cast to 16-bit unsigned integers
imwrite(imgDepth, outDepthFilename, 'png', 'bitdepth', 16);  % write a 16-bit PNG
  • You can also save them in 8-bit format, which limits depth values to the range 0–255; just change the scaling, as sketched below.
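
An 8-bit variant might look like this sketch, mirroring the 16-bit snippet above (not taken verbatim from the repository):

imgDepth = imgDepth / 10.0 * 255.0;          % scale metric depth (max 10 m) to [0, 255]
imgDepth = uint8(imgDepth);                  % cast to 8-bit unsigned integers
imwrite(imgDepth, outDepthFilename, 'png');  % write an 8-bit PNG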

Generate the NYUd v2 thin dataset

We can also generate a thin dataset of 1449 labeled images in total: 795 for training and 654 for testing.

  1. First, download the thin dataset, integrated into a single .mat file, from the same page as above (listed as Labeled dataset, ~2.8 GB). Extract it to get nyu_depth_v2_labeled.mat, whose contents are:
accelData:         [1449×4 single]
depths:            [480×640×1449 single]
images:            [480×640×3×1449 uint8]
instances:         [480×640×1449 uint8]
labels:            [480×640×1449 uint16]
names:             {894×1 cell}
namesToIds:        [894×1 containers.Map]
rawDepthFilenames: {1449×1 cell}
rawDepths:         [480×640×1449 single]
rawRgbFilenames:   {1449×1 cell}
sceneTypes:        {1449×1 cell}
scenes:            {1449×1 cell}
  2. Run save16bitdepth.m to save the 16-bit dense depth maps of the 1449 images; the RGB images can be obtained directly from the images attribute of nyu_depth_v2_labeled.mat, as sketched after this list.

  3. Run nyud_split.py to split the 1449 images into training and test subsets for practical use. Change the PATH-related variables to match your setup; a MATLAB sketch of this step also follows.
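
A sketch of step 2, assuming the variable names listed above and the same 16-bit depth scaling as before (save16bitdepth.m itself may differ):

load('nyu_depth_v2_labeled.mat', 'images', 'depths');  % load RGB frames and dense depth
mkdir('rgb'); mkdir('depth');
for i = 1:size(images, 4)
    imwrite(images(:, :, :, i), sprintf('rgb/%04d.png', i));          % save RGB frame i
    d = uint16(depths(:, :, i) / 10.0 * 65535.0);                     % 16-bit scaling as above
    imwrite(d, sprintf('depth/%04d.png', i), 'png', 'bitdepth', 16);  % 16-bit PNG
end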
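
And a sketch of step 3 in MATLAB (the repository does this in Python with nyud_split.py; this sketch instead assumes the official splits.mat from the dataset page, which holds the index vectors trainNdxs and testNdxs):

load('splits.mat', 'trainNdxs', 'testNdxs');  % official 795/654 split indices
mkdir('train/rgb'); mkdir('train/depth');
mkdir('test/rgb');  mkdir('test/depth');
for i = trainNdxs'                            % copy training pairs
    copyfile(sprintf('rgb/%04d.png', i), 'train/rgb/');
    copyfile(sprintf('depth/%04d.png', i), 'train/depth/');
end
for i = testNdxs'                             % copy test pairs
    copyfile(sprintf('rgb/%04d.png', i), 'test/rgb/');
    copyfile(sprintf('depth/%04d.png', i), 'test/depth/');
end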
