Here we provide simple pre-processing tools for the NYU Depth v2 dataset, since the dataset's authors only provide the original dumped data collected by the Kinect. To apply monocular depth estimation on NYU Depth v2, we have to generate the RGB images and dense depth maps ourselves; the procedure is as follows.
This code has been tested on Ubuntu 16.04 LTS with MATLAB 2015b and Python 2.7.
- Download the raw data of the NYU Depth v2 dataset, which is more than 400 GB, so please make sure you have enough disk space available. Then extract it into the directory `nyud_raw_data`. At the same time, download the `Toolbox` from the same URL above and extract it.
- The dataset is divided into 590 folders, one per filmed `scene`, such as `living_room_0012`. The files are structured as follows:
```
/
../bedroom_0001/
../bedroom_0001/a-1294886363.011060-3164794231.dump
../bedroom_0001/a-1294886363.016801-3164794231.dump
...
../bedroom_0001/d-1294886362.665769-3143255701.pgm
../bedroom_0001/d-1294886362.793814-3151264321.pgm
...
../bedroom_0001/r-1294886362.238178-3118787619.ppm
../bedroom_0001/r-1294886362.814111-3152792506.ppm
```
- Files that begin with the prefix `a-` are the accelerometer dumps. Files that begin with the prefixes `r-` and `d-` are the frames from the RGB and depth cameras, respectively. You can use the `get_synched_frames.m` function in the Toolbox to find the matching between RGB images and depth maps.
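The Toolbox's `get_synched_frames.m` pairs each depth frame with the RGB frame whose timestamp is closest. A minimal Python sketch of the same idea (the `frame_time` and `synch_frames` helpers are illustrative, not part of the Toolbox):

```python
import re

# Extract the timestamp encoded in a frame filename,
# e.g. "r-1294886362.238178-3118787619.ppm" -> 1294886362.238178
def frame_time(name):
    return float(re.match(r'[rda]-(\d+\.\d+)-', name).group(1))

# For every depth frame, pick the RGB frame with the nearest timestamp
# (the same greedy matching get_synched_frames.m performs in MATLAB).
def synch_frames(depth_names, rgb_names):
    rgb_times = [frame_time(n) for n in rgb_names]
    pairs = []
    for d in depth_names:
        t = frame_time(d)
        best = min(range(len(rgb_names)), key=lambda i: abs(rgb_times[i] - t))
        pairs.append((d, rgb_names[best]))
    return pairs

pairs = synch_frames(
    ['d-1294886362.665769-3143255701.pgm'],
    ['r-1294886362.238178-3118787619.ppm',
     'r-1294886362.814111-3152792506.ppm'])
```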
- Put the script `process_raw.m` and the `Toolbox` into the directory `nyud_raw_data` mentioned above.
- Modify `savePath` and `stride`, where `savePath` is the output path and `stride` controls the number of output files. The default value of `stride` is `1`, which saves all images.
- Open MATLAB under `tmux` (the processing takes a long time) and run the script `process_raw.m`.
Sample results are as follows:
- RGB image with a resolution of 480 × 640:
- dense depth map with the same resolution as the RGB image:
- Tips: for better training, I save the dense depth map in 16-bit format, so the depth values range between `0` and `65535`, as defined by:

```matlab
imgDepth = imgDepth / 10.0 * 65535.0;
imgDepth = uint16(imgDepth);
imwrite(imgDepth, outDepthFilename, 'png', 'bitdepth', 16);
```
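To recover metric depth from the saved 16-bit maps at training time, invert the scaling above. A sketch assuming depths fit in the 0–10 m range the MATLAB snippet implies; `png_to_meters` is an illustrative helper, not one of the provided scripts:

```python
import numpy as np

# Invert imgDepth = imgDepth / 10.0 * 65535.0:
# map uint16 codes (0..65535) back to meters (0..10).
def png_to_meters(img_uint16):
    return img_uint16.astype(np.float32) / 65535.0 * 10.0

# Round-trip a depth of 2.5 m through the 16-bit encoding.
code = np.uint16(round(2.5 / 10.0 * 65535.0))
depth = png_to_meters(np.array([[code]], dtype=np.uint16))
```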
- You can also save them in 8-bit format, which limits the depth values to between `0` and `255`; just change the script above accordingly.
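If you already have 16-bit maps, you can requantize them to 8 bit instead of re-running the MATLAB script. A minimal sketch (the helper name is illustrative):

```python
import numpy as np

# Requantize a 16-bit depth map (codes 0..65535) to 8 bit (codes 0..255),
# keeping the same relative scaling but with coarser precision.
def depth16_to_depth8(img_uint16):
    return (img_uint16.astype(np.float32) / 65535.0 * 255.0).round().astype(np.uint8)

img8 = depth16_to_depth8(np.array([[0, 65535]], dtype=np.uint16))
```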
In general, we can also generate a thin dataset containing 1449 images in total, with 795 for training and 654 for testing.
- First, download the thin dataset, integrated into one `.mat` file named `Labeled dataset (~2.8 GB)`, from the same URL above, then extract it to get `nyu_depth_v2_labeled.mat`, which contains:
```
accelData: [1449×4 single]
depths: [480×640×1449 single]
images: [480×640×3×1449 uint8]
instances: [480×640×1449 uint8]
labels: [480×640×1449 uint16]
names: {894×1 cell}
namesToIds: [894×1 containers.Map]
rawDepthFilenames: {1449×1 cell}
rawDepths: [480×640×1449 single]
rawRgbFilenames: {1449×1 cell}
sceneTypes: {1449×1 cell}
scenes: {1449×1 cell}
```
- Run `save16bitdepth.m` to save the 16-bit dense depth maps of the 1449 images; the RGB images can be obtained by saving them directly from the `images` attribute of `nyu_depth_v2_labeled.mat`.
Run
nyud_split.py
to split1449
images totest
andtrain
subset for practical application. You should change the variables refer toPATH
as you wish.
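The core of such a split is just copying each saved image into a `train` or `test` folder according to two index lists. A minimal sketch of that logic, assuming images are named by their 1-based index (the function name, folder layout, and filename pattern are illustrative, not necessarily what `nyud_split.py` does):

```python
import os
import shutil
import tempfile

def split_dataset(src_dir, dst_dir, train_idx, test_idx, fmt='%04d.png'):
    """Copy image i into dst_dir/train or dst_dir/test according to
    the 1-based index lists (e.g. official train/test indices)."""
    for name, idxs in (('train', train_idx), ('test', test_idx)):
        out = os.path.join(dst_dir, name)
        os.makedirs(out, exist_ok=True)
        for i in idxs:
            f = fmt % i
            shutil.copy(os.path.join(src_dir, f), os.path.join(out, f))

# Tiny demo with placeholder files standing in for real depth maps.
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()
for i in (1, 2, 3):
    open(os.path.join(src, '%04d.png' % i), 'w').close()
split_dataset(src, dst, train_idx=[1, 3], test_idx=[2])
train_files = sorted(os.listdir(os.path.join(dst, 'train')))
```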