ialhashim / densedepth
High Quality Monocular Depth Estimation via Transfer Learning
Home Page: https://arxiv.org/abs/1812.11941
License: GNU General Public License v3.0
When I try to run the demo, I get the error:
AttributeError: 'GLWidget' object has no attribute 'UNIFORM_LOCATIONS'
I think I have everything installed correctly, as the demo still pops up.
PySide2 installation: Version: 5.13.0
PyOpenGL installation: Version: 3.1.0
Hello @ialhashim @cclauss, I want to perform SLAM with the predicted depth. Since you show a 3D image, could we perform SLAM from single images using your demo.py?
Secondly, is that 3D map generated from a single image or from a sequence of images?
Thank you.
Do you use a single learning rate (0.0001) in all experiments? If I train the network for too many iterations, e.g., 100 epochs on NYUDepthv2 and then 100 epochs on KITTI, will the model forget its pretraining on ImageNet?
Hello, thank you for your implementation! I have a few questions regarding PyTorch:
Hi,
Thank you for sharing your code. I think there might be a small bug in your TensorFlow 2.0 dataloader, where the shuffle happens in the wrong place. The way you shuffle gives every epoch the same permutation. I think you need a shuffle before or after repeat.
PS. The shuffle argument in model.fit won't have any effect, as explained in the Keras documentation here.
Suggested code:
def get_batched_dataset(self, batch_size):
    # Build the dataset from file names and labels.
    self.dataset = tf.data.Dataset.from_tensor_slices((self.filenames, self.labels))
    # Shuffle before repeat; the keyword argument is buffer_size.
    self.dataset = self.dataset.shuffle(buffer_size=len(self.filenames))
    self.dataset = self.dataset.repeat()
    self.dataset = self.dataset.map(map_func=self._parse_function, num_parallel_calls=tf.data.experimental.AUTOTUNE)
    self.dataset = self.dataset.batch(batch_size=batch_size)
    return self.dataset
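The concern raised above can be illustrated without TensorFlow. The sketch below is a plain-Python analogy, not the behaviour of tf.data itself: `epochs_shuffled_once` and `epochs_reshuffled` are hypothetical helpers that contrast shuffling once before repeating with drawing a new permutation on every pass.

```python
import random

def epochs_shuffled_once(items, num_epochs, seed=0):
    """Shuffle a single time, then repeat the same order every epoch."""
    rng = random.Random(seed)
    order = list(items)
    rng.shuffle(order)
    return [list(order) for _ in range(num_epochs)]

def epochs_reshuffled(items, num_epochs, seed=0):
    """Draw a fresh permutation at the start of every epoch."""
    rng = random.Random(seed)
    epochs = []
    for _ in range(num_epochs):
        order = list(items)
        rng.shuffle(order)
        epochs.append(order)
    return epochs

# Hypothetical file names, used only for the demonstration.
filenames = [f"img_{i:02d}.png" for i in range(20)]
fixed = epochs_shuffled_once(filenames, num_epochs=3)
fresh = epochs_reshuffled(filenames, num_epochs=3)
```

In tf.data, whether a shuffle is redone on each pass is controlled by the `reshuffle_each_iteration` argument of `Dataset.shuffle`, so the actual behaviour is worth checking against the TensorFlow version in use.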
I was also wondering whether the PyTorch and TensorFlow 2.0 code give the same results as the Keras code. Do you have any evaluation of those two implementations compared with the original Keras one?
Thank you,
Ali
Hello @ialhashim @cclauss, I hope you are doing well. I appreciate the good work you have done.
I am running your code in PyCharm and getting the error
AttributeError: module 'glm' has no attribute 'vec3'
All the code that uses glm. gives an error. I am not sure what the reason is, because I have installed glm version 0.3.4 with pip.
Could you help me with this?
Thanks in advance.
Thank you for the code. I am planning to finetune the NYUv2 pretrained model on my own kinect dataset. I have a question about the pre-processing done to the depth ground truth data.
From my understanding, your NYUv2 Keras model takes 480x640 images (each pixel a 0-255 uint8 value) as input depth ground truth data (from the ~4 GB nyu-data.zip file). However, your Python re-implementation of the depth infilling method outputs depth values as floats between 0 and 1.
It would be great if you could let me know exactly how you converted the 0-1 floats to 0-255 uint8. Did you directly multiply by 255 and convert to uint8? Or did you do something like:
255*((depth_infilling_output - min(depth_infilling_output))/max(depth_infilling_output))
Thank you
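The two candidate conversions in this question give different results whenever the image does not span the full 0-1 range. The sketch below only illustrates that difference and is not the author's confirmed preprocessing. The function names are hypothetical, and the min-max variant uses the usual (max - min) denominator, which differs from the expression written in the question.

```python
import numpy as np

def to_uint8_direct(depth01):
    """Scale values assumed to lie in [0, 1] straight to [0, 255]."""
    return np.clip(np.round(depth01 * 255.0), 0, 255).astype(np.uint8)

def to_uint8_minmax(depth01):
    """Stretch the per-image range to [0, 255]; absolute scale is lost."""
    lo, hi = float(depth01.min()), float(depth01.max())
    if hi == lo:  # flat image: avoid division by zero
        return np.zeros_like(depth01, dtype=np.uint8)
    return np.round(255.0 * (depth01 - lo) / (hi - lo)).astype(np.uint8)

# Made-up depth values for the demonstration.
depth = np.array([[0.2, 0.4], [0.6, 0.8]], dtype=np.float32)
direct = to_uint8_direct(depth)      # keeps the absolute scale
stretched = to_uint8_minmax(depth)   # always spans 0..255
```

Direct scaling keeps depth values comparable across images, while per-image stretching does not, which matters if the training code later maps 0-255 back to a fixed maximum depth.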
Hi, I have a few questions, mostly regarding maxdepth:
The maxdepth value seems to be set to 1000 throughout the code, however I noticed that the depth values in the NYU dataset range all the way up to 10000. Should the maxdepth value be changed to 10000 when training NYU?
When making predictions, the predicted depth values can be higher than any values seen in the training set. Do you know why this is so? Do I need to normalize it to the range of the training data as a post processing step?
Lastly, when training NYU I am noticing that the validation loss does not decrease at all, but the training loss does steadily. Any ideas why this might be happening?
Thanks!
Hi, I was hoping you could provide some guidance on training with a custom dataset. I am attempting to train on only a small dataset (20 images) with the goal of overfitting to those 20 images, to confirm that the model has a high enough capacity.
Using the NYU model as the starting checkpoint, I have been able to train with the loss steadily decreasing. However, running test.py on the training images gives incorrect results (even after 500 epochs of training).
Something else I was wondering about is why loss and val_loss have such different values when the training and validation sets are identical. For example, I am seeing a training loss of about 0.05 compared to a val_loss of 24. Could you provide any insight?
Hello,
I want to train on my own data.
if args.data == 'nyu': train_generator, test_generator = get_nyu_train_test_data( args.bs )
How should I modify the code?
Sincerely,
HuBoni
I have used it together with YOLO to measure distance, but the results are wrong. Any help would be appreciated.
last_time = time.time()
ret, frame = cap.read()
detections = darknet_video.detect_from_image(frame)
original_height, original_width , _ = np.shape(frame)
detections = darknet_video.resize_detections(detections,rgb_height, rgb_width)
frame = cv2.resize(frame, (rgb_width, rgb_height ), interpolation=cv2.INTER_CUBIC)
# Compute results
outputs = predict(model, np.expand_dims(frame/256,0)) * 80
disp_resized = cv2.resize(outputs[0], (rgb_width ,rgb_height ), interpolation=cv2.INTER_CUBIC)
#pos = posFromDepth(disp_resized)
#print(np.shape(pos))
print(np.shape(disp_resized), np.shape(frame))
detections = get_min_disp_dets(disp_resized, detections,rgb_height , rgb_width )
colormapped_im = darknet_video.cvDrawBoxes_with_distance(detections, frame)
#print(1./(time.time() - last_time))
cv2.imshow('image', colormapped_im)
cv2.imshow('image1', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
Have you tested smaller input image sizes? I wonder whether a smaller input would make the results worse (for example, a 224x224 input giving a 112x112 output).
I am currently running DenseDepth through a Docker image (linked below) and receive an error when running test.py after downloading the training set. Any idea what could be going on?
ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running
OS:Ubuntu 16.04
docker image:
https://hub.docker.com/r/aragorn1025/dense_depth
Hi,
Thanks for sharing your code.
I'm trying to train on KITTI and after reading all the preceding questions and answers I still have two questions:
The original depth data in KITTI is 16 bit. After division by 256 you get values in meters.
After running the fill_depth_colorization.py script, the range of the output depth is [0, 85].
What did you do with the zeros? Should I just change them to 0? (This is what I currently do.)
Should I first scale the data to the range [0, 255]? As far as I understand, this is the range assumed in NYU_BasicAugmentRGBSequence:
y = np.clip(np.asarray(Image.open( BytesIO(self.data[sample[1]]) )).reshape(480,640,1)/255*self.maxDepth,0,self.maxDepth)
In NYU_BasicRGBSequence - why do you divide by 10.0 in this line:
y = np.asarray(Image.open(BytesIO(self.data[sample[1]])), dtype=np.float32).reshape(480,640,1).copy().astype(float) / 10.0
Should I change it to 80.0 for KITTI?
Thanks,
Ofer
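The conversion described in this question (16-bit values divided by 256 give meters) can be sketched as follows. Treating zero as "no measurement" follows the KITTI depth development kit convention. The function name and the sample array are hypothetical, and this does not answer how zeros were handled when the pretrained model was trained.

```python
import numpy as np

def kitti_png_to_meters(depth_png):
    """Convert raw uint16 KITTI depth values to meters plus a validity mask."""
    raw = np.asarray(depth_png, dtype=np.uint16)
    valid = raw > 0                          # zero marks pixels without a measurement
    meters = raw.astype(np.float32) / 256.0  # 256 units per meter
    return meters, valid

# Made-up raw values; 21733 corresponds to roughly 84.9 m.
sample = np.array([[0, 256], [5120, 21733]], dtype=np.uint16)
meters, valid = kitti_png_to_meters(sample)
```

Keeping the mask alongside the metric depth makes it possible to exclude unmeasured pixels from a loss or an evaluation instead of treating them as zero distance.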
When I train the PyTorch version with all the settings from your original Python implementation, the loss decreases at the beginning, but the validation loss then fluctuates around 0.19x after about 3-4 epochs of training. Is this normal?
Or does the PyTorch implementation depend heavily on initialization? Thank you.
Hello @ialhashim, I wanted to ask about the unit of the measured depth. Is it in meters or mm?
You are clipping it between 10 and 1000. What are these two values?
In short, I want the output depth in meters (float). How can I get that?
Thanks
Hi, thank you for your effort!
I have a question: how can I read out the depth value for each pixel?
Hi, Ibraheem.
First, I want to say this is an amazing algorithm that has helped me a lot, thank you!
I know that some depth values of the raw NYU depth maps (in meters) are missing and need to be filled in by the MATLAB code provided with the NYU Depth V2 toolbox.
Now I have a question: after filling, how do you convert the depth in meters to a gray image (as extracted from the 4.1G zip that you provided)?
depth_gray = int(depth_in_meter / max(depth_in_meter) * 255)? Or some other conversion?
Hi, I want to know how to calculate the real-world distance from the camera to an object. I am using my own images, which means the pretrained model was trained with a different camera configuration. How can I calibrate this pretrained model for my own camera's images?
Thanks in advance.
Hello, I used test.py on indoor photos I took, but the output depth maps were poor, and in some cases the depth even appeared reversed. How can I improve the accuracy? Thank you.
Hi,
Nice work! We find it works consistently well on indoor and outdoor scenes.
I wonder how you computed the surface normals from the depth map in Fig. 4 of the paper?
Thanks!
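One common way to approximate surface normals from a depth map is through image-space finite differences. The sketch below shows that generic technique only; it is not necessarily how the figure in the paper was produced. It assumes unit pixel spacing and ignores camera intrinsics, and the function name is hypothetical.

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate unit surface normals from a 2-D depth map via finite differences."""
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    # The surface z = f(x, y) has the unnormalized normal (-dz/dx, -dz/dy, 1).
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(dz_dx)))
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals

# A plane tilted along x: depth increases by 1 per pixel column.
plane = np.tile(np.arange(5.0), (4, 1))
n = normals_from_depth(plane)
```

For visualization, the components in [-1, 1] are often remapped to [0, 1] and shown as an RGB image.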
Can you tell me how to get real depth from the output of the model? Is the output equal to a disparity map? For example, if I use the trained model on a Cityscapes picture of size 2048x1024, should I first resize the output (1024x512) to 2048x1024 and then use the formula:
depth = 0.209313 * 2262.52/(2048 * disparity)
to get the depth map?
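The formula in this question can be wrapped in a small function for experimentation. This is a sketch of the arithmetic exactly as written by the asker, with the constants copied from the question. It assumes the value passed in is a disparity normalized by image width, and whether the model output is a disparity at all is the open question here. The function name is hypothetical.

```python
def depth_from_normalized_disparity(disparity, baseline_m=0.209313,
                                    focal_px=2262.52, width_px=2048):
    """depth = baseline * focal / (width * disparity), as written in the question."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return baseline_m * focal_px / (width_px * disparity)

# Example with a made-up normalized disparity value.
example_depth = depth_from_normalized_disparity(0.01)
half_disparity_depth = depth_from_normalized_disparity(0.005)
```

Because depth is inversely proportional to disparity, halving the disparity doubles the computed depth.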
Thanks for your brilliant work.
First of all thank you very much for your contribution. One question: Have you used the KITTI raw or the KITTI depth dataset as your basis for inpainting?
While the README.md states "KITTI: copy the raw data", this issue indicates that KITTI depth was used...
Hi, Ibraheem, question again : )
I tried to convert a sparse depth map to a dense depth map using your Python version of fill_depth_colorization.m from the NYU v2 dataset toolkit. However, I found something strange.
My test code is as follows:
import numpy as np
from PIL import Image
from fill_depth_colorization import *

if __name__ == '__main__':
    # RGB image and the matching sparse ground-truth depth map
    p1 = 'kitti_raw_data/2011_09_26/2011_09_26_drive_0001_sync/image_02/data/0000000005.png'
    p2 = 'train/2011_09_26_drive_0001_sync/proj_depth/groundtruth/image_02/0000000005.png'
    img1 = np.array(Image.open(p1))
    img2 = np.array(Image.open(p2))
    dense = fill_depth_colorization(img1, img2)
    # Compare value ranges before and after inpainting
    print((img2.min(), img2.max()), (dense.min(), dense.max()))
output: (0, 21733) (1365.0, 21733.0)
It seems the minimal value is changed to non-zero (0 to 1365.0) after the inpainting method. I am not sure whether this is normal.
I noticed that you use the reciprocal of the depth, so must the values be non-zero?
Thank you.
When I run test.py, I get this error. I have been looking into it for a long time and still don't know how to solve it. Can you help me with this? Thank you very much. : )
File "/home/hpc/cx/DenseDepth-master/test.py", line 38, in <module>
viz = display_images(outputs.copy(), inputs.copy())
File "/home/hpc/cx/DenseDepth-master/utils.py", line 80, in display_images
return skimage.util.montage(all_images, multichannel=True, fill=(0,0,0))
AttributeError: module 'skimage.util' has no attribute 'montage'
Hi, thank you for releasing your work - the results look excellent!
I am looking at trying to process videos and getting quite a lot of flickering across different frames. I was wondering if you have any suggestion on what might help with getting more consistency across frames.
The literature (mainly in style transfer) seems to point towards ConvLSTM as the choice for achieving better consistency.
Hi
I wanted to ask if there's any way to make the final datasets (especially the inpainted KITTI dataset) available?
Thanks for the paper & code.
Hello ~ Could you please list the version of all the required packages in order to run with PyTorch. I think the version of PyTorch is the most important.
Thanks a lot!
def DepthNorm(x, maxDepth):
    return maxDepth / x
Shouldn't it be x/maxDepth?
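For reference, the two candidates behave quite differently: maxDepth / x is a reciprocal mapping (large depths become small values) and is its own inverse, while x / maxDepth is a linear rescale. The sketch below only illustrates that arithmetic with made-up values; it does not settle which one the training code intends. The lowercase function names are hypothetical.

```python
def depth_norm(x, max_depth):
    """Reciprocal mapping as in the snippet above: max_depth / x."""
    return max_depth / x

def linear_norm(x, max_depth):
    """The alternative proposed in the question: x / max_depth."""
    return x / max_depth

max_depth = 1000.0
near, far = 10.0, 1000.0

recip_near = depth_norm(near, max_depth)        # near depth maps to a large value
recip_far = depth_norm(far, max_depth)          # far depth maps to a small value
roundtrip = depth_norm(recip_near, max_depth)   # applying it twice restores the input
lin_near = linear_norm(near, max_depth)
lin_far = linear_norm(far, max_depth)
```

Because the reciprocal mapping undoes itself, the same function can convert a network output back to depth, which is one plausible reason for writing it this way.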
Hi,
Thanks for your great contribution; I am very interested in it. I have two questions.
First, regarding the network architecture: why not continue upsampling to get a depth map at 640x480, the same resolution as the input images?
Second, why do you define the target depth map y as y = m / y(orig)?
Thank you for your reply.
Hi, thanks for your nice paper and code. However, I get an error when training on the NYU dataset on Windows, without any other changes.
---- error logs ----
Epoch 1/20
191/6336 [..............................] - ETA: 2:07:04 - loss: 0.2413Traceback (most recent call last):
File "train.py", line 87, in <module>
model.fit_generator(train_generator, callbacks=callbacks, validation_data=test_generator, epochs=args.epochs, shuffle=True)
File "C:\Python\Python36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Python\Python36\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "C:\Python\Python36\lib\site-packages\keras\engine\training_generator.py", line 181, in fit_generator
generator_output = next(output_generator)
File "C:\Python\Python36\lib\site-packages\keras\utils\data_utils.py", line 601, in get
six.reraise(*sys.exc_info())
File "C:\Python\Python36\lib\site-packages\six.py", line 693, in reraise
raise value
File "C:\Python\Python36\lib\site-packages\keras\utils\data_utils.py", line 595, in get
inputs = self.queue.get(block=True).get()
File "C:\Python\Python36\lib\multiprocessing\pool.py", line 644, in get
raise self._value
File "C:\Python\Python36\lib\multiprocessing\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "C:\Python\Python36\lib\site-packages\keras\utils\data_utils.py", line 401, in get_index
return _SHARED_SEQUENCES[uid][i]
File "E:\Development\code\DenseDepth\data.py", line 70, in __getitem__
y = np.clip(np.asarray(Image.open( BytesIO(self.data[sample[1]]) )).reshape(480,640,1)/255*self.maxDepth,0,self.maxDepth)
ValueError: cannot reshape array of size 1 into shape (480,640,1)
----system information----
os: windows 10
tensorflow: 1.13.1
keras: 2.2.4
Hello,
Is there any explanation for the lack of an activation function between the two successive convolutions in the upproject function? This seems a bit unusual to me. Thanks!
In the evaluate function of the utils module, you multiply your predictions by 10.0 after scaling the depth map. Why are you doing this? I can't find any information about this in the code or in the paper.
I forked this repo, downloaded the pretrained weights, installed the dependencies and ran python3 test.py with the following error message. I am on MacOS, Python 3.7 with Anaconda, and CPU only (no GPU).
/anaconda3/envs/python37_tf/bin/python /Users/glennjocher/PycharmProjects/DenseDepth/test.py
Using TensorFlow backend.
Loading model...
WARNING:tensorflow:From /anaconda3/envs/python37_tf/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Model loaded (nyu.h5).
Loaded (12) images of size (480, 640, 3).
2019-05-08 19:40:38.960 python[8618:2061327] -[NSApplication _setup:]: unrecognized selector sent to instance 0x7fbaafaecbf0
2019-05-08 19:40:38.962 python[8618:2061327] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSApplication _setup:]: unrecognized selector sent to instance 0x7fbaafaecbf0'
*** First throw call stack:
(
0 CoreFoundation 0x00007fff38e57cf9 __exceptionPreprocess + 256
1 libobjc.A.dylib 0x00007fff639eca17 objc_exception_throw + 48
2 CoreFoundation 0x00007fff38ed1b06 -[NSObject(NSObject) __retain_OA] + 0
3 CoreFoundation 0x00007fff38dfa0ef ___forwarding___ + 1485
4 CoreFoundation 0x00007fff38df9a98 _CF_forwarding_prep_0 + 120
5 libtk8.6.dylib 0x0000000b35fdc31d TkpInit + 413
6 libtk8.6.dylib 0x0000000b35f3417e Initialize + 2622
7 _tkinter.cpython-37m-darwin.so 0x0000000b31feda0f _tkinter_create + 1183
8 python 0x0000000100de2116 _PyMethodDef_RawFastCallKeywords + 230
9 python 0x0000000100f1ee42 call_function + 306
10 python 0x0000000100f1caec _PyEval_EvalFrameDefault + 46092
11 python 0x0000000100f1049e _PyEval_EvalCodeWithName + 414
12 python 0x0000000100de0de7 _PyFunction_FastCallDict + 231
13 python 0x0000000100e63381 slot_tp_init + 193
14 python 0x0000000100e6d361 type_call + 241
15 python 0x0000000100de1ae3 _PyObject_FastCallKeywords + 179
16 python 0x0000000100f1eed5 call_function + 453
17 python 0x0000000100f1cbe0 _PyEval_EvalFrameDefault + 46336
18 python 0x0000000100de18d5 function_code_fastcall + 117
19 python 0x0000000100f1edc7 call_function + 183
20 python 0x0000000100f1caec _PyEval_EvalFrameDefault + 46092
21 python 0x0000000100f1049e _PyEval_EvalCodeWithName + 414
22 python 0x0000000100de0de7 _PyFunction_FastCallDict + 231
23 python 0x0000000100de4ce2 method_call + 130
24 python 0x0000000100de2752 PyObject_Call + 130
25 python 0x0000000100f1cd58 _PyEval_EvalFrameDefault + 46712
26 python 0x0000000100f1049e _PyEval_EvalCodeWithName + 414
27 python 0x0000000100de1fe3 _PyFunction_FastCallKeywords + 195
28 python 0x0000000100f1edc7 call_function + 183
29 python 0x0000000100f1cbe0 _PyEval_EvalFrameDefault + 46336
30 python 0x0000000100f1049e _PyEval_EvalCodeWithName + 414
31 python 0x0000000100f739a0 PyRun_FileExFlags + 256
32 python 0x0000000100f72e17 PyRun_SimpleFileExFlags + 391
33 python 0x0000000100fa0d3f pymain_main + 9663
34 python 0x0000000100db466d main + 125
35 libdyld.dylib 0x00007fff6521a3d5 start + 1
36 ??? 0x0000000000000002 0x0 + 2
)
libc++abi.dylib: terminating with uncaught exception of type NSException
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
Can you provide the modified data.py code for the KITTI dataset?
Hi, when training on a custom dataset I get an error when it tries to save the model checkpoint with keras.callbacks.ModelCheckpoint. The error is raise TypeError('Not JSON Serializable: %s' % (obj,)) TypeError: Not JSON Serializable: ?
Any thoughts?
I verified that test.py works with the images in the examples folder. After this I created a new folder called 'images', placed one image inside and ran python3 test.py --input 'images/*.jpg'
Using TensorFlow backend.
Loading model...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Model loaded (nyu.h5).
Loaded (1) images of size (4032, 3024, 3).
Traceback (most recent call last):
File "test.py", line 34, in <module>
outputs = predict(model, inputs)
File "/content/DenseDepth/utils.py", line 12, in predict
predictions = model.predict(images, batch_size=batch_size)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1169, in predict
steps=steps)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
batch_outs = f(ins_batch)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,1664,252,188] vs. shape[1] = [1,256,252,189]
[[{{node up1_concat/concat}}]]
[[{{node conv3/BiasAdd}}]]
Done (44.03s)
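The mismatch in this traceback (188 versus 189) is consistent with an input side that is not divisible by the network's total downsampling factor: 3024 / 16 = 189, while 3024 / 32 = 94.5 is truncated to 94 and then upsampled to 188. One possible workaround is to resize inputs so both sides are multiples of 32 before prediction. The factor 32 is an assumption inferred from the shapes above, and the helper names are hypothetical.

```python
def round_down_to_multiple(value, multiple=32):
    """Largest multiple of `multiple` not exceeding value (at least one multiple)."""
    return max(multiple, (value // multiple) * multiple)

def compatible_size(height, width, multiple=32):
    """Return (height, width) adjusted so both are divisible by `multiple`."""
    return (round_down_to_multiple(height, multiple),
            round_down_to_multiple(width, multiple))

# The image size reported in the log above.
new_h, new_w = compatible_size(4032, 3024)
# The 480x640 size of the bundled examples is already compatible.
same_h, same_w = compatible_size(480, 640)
```

Since the examples that work are 480x640, resizing new photos to that size may be the simplest option, although that is a guess rather than a documented requirement.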
The depth maps generated by test.py are currently in RGB format. How do I get them in "I" mode (intensity format)? https://pillow.readthedocs.io/en/5.1.x/handbook/concepts.html#concept-modes
Hi,
Great work, really impressive!
I would like to use the pre-trained model (on the KITTI dataset) on images from the Drive360 dataset and I was wondering if you had any suggestions on how to best prepare those images to get the best possible prediction using the DenseDepth model.
Here's an example image:
its shape is: (1080, 1920, 3) whereas the ones in the KITTI dataset are (if I'm not mistaken): (370, 1224, 3).
My initial thought was to crop the 1080x1920 images to extract the central 370x1224 rectangle. What do you think?
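The central crop described in this question can be sketched as follows. This only shows the cropping step; whether cropping or resizing yields better predictions on Drive360 images is not answered here. The function name and the blank test frame are hypothetical.

```python
import numpy as np

def center_crop(image, crop_h, crop_w):
    """Return the central crop_h x crop_w window of an H x W x C image."""
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop is larger than the image")
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return image[top:top + crop_h, left:left + crop_w]

# A blank stand-in for a 1080x1920 frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
cropped = center_crop(frame, 370, 1224)
```

Note that cropping keeps the pixel scale but narrows the field of view, whereas resizing keeps the field of view but changes the apparent focal length. Either choice may affect the predicted depth.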
In the NYU_BasicAugmentRGBSequence class in data.py, the depth image pixel values are divided by 255 and multiplied by maxDepth. Why are the depth values divided by 255? I haven't used the NYU Depth dataset. Is that how depth is stored in it?
I have trained the network on my KITTI and NYU datasets, and my saved model is 344 MB. But the downloaded pretrained model is only half that size, i.e., 172 MB. Why the difference?
Hi, I want to know how to test on KITTI. I downloaded the KITTI model and changed the test code, but it has a bug.
Hi,
What crop values did you use for the evaluation on KITTI dataset (in Table 2 in your paper)?
Thanks,
Ofer
I get this error while running
python demo.py
Traceback (most recent call last):
File "demo.py", line 414, in <module>
window = Window()
File "demo.py", line 80, in __init__
self.glWidget = GLWidget()
File "demo.py", line 238, in __init__
self.updateRGBD()
File "demo.py", line 367, in updateRGBD
self.pos = self.pos + glm.vec3(0, -0.06, -0.3)
AttributeError: module 'glm' has no attribute 'vec3'
I tried glm version 0.1.0 and the latest one as well.
Thanks for your great contribution.
I have been doing experiments on my own data recently. The model works well. One problem is that there is a large gap (meters) between the "ground truth" data (dense depth generated by "fill_depth_colorization.py") and the real depth. (I see this after I project pixels with depth into the 3D world and compare the point cloud with the lidar data.)
So I decided to try some other depth completion methods. But before that, I wonder why you chose "Colorization using optimization" as the method in your model. Is there any important reason?