dwofk / fast-depth
ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems"
License: MIT License
Hi, @dwofk @fangchangma
I noticed that the pruning of fast-depth used NetAdapt, whose framework is in Chainer. Did you first convert the model into a Chainer model and then prune it? Or was MobileNet pruned first and then combined with the decoder as a pre-trained model, with pruning applied again at the end?
I may not be describing it very clearly; the two pruning workflows I mean are:
fast-depth (pytorch) --> fast-depth (chainer) --> pruning
OR
mobilenet (chainer) --> pruning --> mobilenet-pruned (pytorch) --> + decoder --> training --> fast-depth (chainer) --> pruning
The second one is probably more complex. Could you give me some advice? Thanks a lot!
Hello, I have been studying your paper recently. I have reproduced the results, but I want to train on a new dataset. How do I do that? The NYU Depth v2 dataset here is provided as .h5 files, but VRD contains .jpg files. Waiting for your reply.
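(For what it's worth, here is a minimal PyTorch Dataset sketch for a folder of paired RGB .jpg / depth .png files; the directory layout, file extensions, and 224x224 size are assumptions, and this is not the repo's own .h5 loader.)
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class FolderDepthDataset(Dataset):
    """Expects <root>/rgb/xxx.jpg paired with <root>/depth/xxx.png."""
    def __init__(self, root, size=(224, 224)):
        self.root, self.size = root, size
        self.names = sorted(os.listdir(os.path.join(root, 'rgb')))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        rgb = Image.open(os.path.join(self.root, 'rgb', name)).convert('RGB').resize(self.size)
        depth = Image.open(os.path.join(self.root, 'depth', name.replace('.jpg', '.png'))).resize(self.size)
        rgb = torch.from_numpy(np.asarray(rgb, np.float32) / 255.0).permute(2, 0, 1)   # (3, H, W)
        depth = torch.from_numpy(np.asarray(depth, np.float32)).unsqueeze(0)           # (1, H, W)
        return rgb, depth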
Is it possible to train the model with the KITTI odometry dataset, which has sparse ground-truth depth images?
Hi, thank you for sharing this wonderful work.
On the project page, I noticed there is a smartphone demo video. I wonder whether the network used in that demo was trained only on NYU Depth v2, or whether more data was used. If the demo's network was trained on more data, which datasets did you use?
Thank you.
How can I train the model on custom data?
I used the pretrained model directly, but there is a bug: the dict of the pretrained MobileNet has no "model" or "best_result" key.
Hello, I am new to PyTorch and am using fast-depth for my research.
I planned to use the pruned network, but on loading the model I get an error about the current model and the checkpoint model mismatching in size. This mismatch is due to a difference in the dimensions of the conv layers between the pruned checkpoint and model.py. Please let me know how I can solve this issue.
Hi, thanks for your work,
but when I download the pre-trained model, the tar file is broken and cannot be extracted.
How can I fix it?
After I downloaded the trained models, I found the tar files cannot be extracted. Could you upload the files again?
Hi,
Congratulations for your nice work.
I'm trying to train the depth estimation model on NYU Depth V2.
Are there more preprocessing steps needed after downloading the dataset from http://datasets.lids.mit.edu/fastdepth/data/nyudepthv2.tar.gz?
Or should I simply load the data from the 'train' directory and begin training?
Thank you.
Hi Diana,
After loading mobilenet-nnconv5-skipadd-pruned,
the last layer "decode_conv6" outputs numbers between 1 and 5, whether or not I normalize my input image.
It is clearly not a normalized image, and the values do not make sense for an image in [0, 255].
I am not sure where I went wrong, or whether the output is not meant to be a grayscale image?
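(Values between 1 and 5 are plausible if the last layer is regressing metric depth in meters rather than producing an image: indoor NYU scenes mostly span roughly 0.5-10 m. To view the prediction as a grayscale image you would rescale it yourself; a minimal sketch, with hypothetical variable names:)
import numpy as np

depth = pred.squeeze().cpu().numpy()                         # predicted depth map, presumably in meters
gray = (255 * (depth - depth.min()) / (depth.ptp() + 1e-8)).astype(np.uint8)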
Hi, I really appreciate this work. But how does it perform on the KITTI dataset?
Do you have a pre-trained model for KITTI, or can you tell me how to apply it to KITTI?
I look forward to your reply!
Good luck!
Hi,
The pretrained models given in this repo (http://datasets.lids.mit.edu/fastdepth/results/) are corrupted. I am not able to uncompress them after downloading. Please provide a fresh link to download the pretrained models.
Thanks in advance.
Dear sir,
I trained the model on the KITTI dataset, but the results are not good. Is there any way to improve the accuracy?
Thank you.
Thanks for this great work. I am currently trying to train fast-depth with my own dataset. I have noticed there are no training scripts, so I would like to ask: which depth losses are used in training?
It would be very nice if anyone could suggest which losses I should pick.
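(The paper reports training with an L1 loss on depth. A masked variant that ignores pixels without valid ground truth is a common starting point; this is a sketch, not the authors' training code.)
import torch.nn as nn

class MaskedL1Loss(nn.Module):
    """L1 loss computed only over pixels with valid (nonzero) ground-truth depth."""
    def forward(self, pred, target):
        valid = (target > 0).detach()
        return (pred - target)[valid].abs().mean()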
Hi, when I run the evaluation script with CUDA 9.2 and Python 3, I get the following error:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
I tried to solve this by changing the following line in the main code:
checkpoint = torch.load(args.evaluate)
to:
checkpoint = torch.load('modelpath', map_location=torch.device('cpu'))
After that, the current error is:
best_result = checkpoint['best_result']
KeyError: 'best_result'
Can anybody help me to solve this?
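(A defensive loading sketch that works whether the file stores a plain serialized model or a dict with extra bookkeeping fields; the key names below just mirror the errors above, so treat them as assumptions.)
import torch

checkpoint = torch.load('modelpath', map_location=torch.device('cpu'))
if isinstance(checkpoint, dict):
    model = checkpoint.get('model', checkpoint)         # fall back if there is no 'model' entry
    best_result = checkpoint.get('best_result', None)   # avoid the KeyError when the field is absent
else:
    model = checkpoint                                   # the file may hold the model object directly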
Hi!,
I get this error when I run the example from the Github description.
jetson@jetson:~/fast-depth/deploy$ python3 tx2_run_tvm.py --input-fp data/rgb.npy --output-fp data/pred.npy --model-dir ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/ --cuda True
=> [TVM on TX2] using model files in ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/
=> [TVM on TX2] loading model lib and ptx
=> [TVM on TX2] loading model graph and params
=> [TVM on TX2] creating TVM runtime module
=> [TVM on TX2] feeding inputs and params into TVM module
=> [TVM on TX2] running TVM module, saving output
Traceback (most recent call last):
File "tx2_run_tvm.py", line 91, in <module>
main()
File "tx2_run_tvm.py", line 88, in main
run_model(args.model_dir, args.input_fp, args.output_fp, args.warmup, args.run, args.cuda, try_randin=args.randin)
File "tx2_run_tvm.py", line 36, in run_model
run() # not gmodule.run()
File "/home/jetson/tvm/python/tvm/_ffi/_ctypes/function.py", line 207, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (3) /home/jetson/tvm/build/libtvm.so(TVMFuncCall+0x70) [0x7fad7ccec0]
[bt] (2) /home/jetson/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::detail::PackFuncVoidAddr_<4, tvm::runtime::CUDAWrappedFunc>(tvm::runtime::CUDAWrappedFunc, std::vector<tvm::runtime::detail::ArgConvertCode, std::allocator<tvm::runtime::detail::ArgConvertCode> > const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0xe8) [0x7fad850b08]
[bt] (1) /home/jetson/tvm/build/libtvm.so(tvm::runtime::CUDAWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void**) const+0x6cc) [0x7fad85093c]
[bt] (0) /home/jetson/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4c) [0x7facfdebac]
File "/home/jetson/tvm/src/runtime/cuda/cuda_module.cc", line 110
File "/home/jetson/tvm/src/runtime/library_module.cc", line 91
CUDAError: Check failed: ret == 0 (-1 vs. 0) : cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX
I can't seem to find a solution to this. Could you please help?
Hi
Thanks for this nice work. I am trying to reproduce this work on my machine.
What I observed is that the model's output is not consistent if I run inference multiple times in a while loop with the same input.
while True:
    image_cuda = torch.from_numpy(img).float().cuda()
    pred = 0
    print(pred)
    with torch.no_grad():
        pred = model(image_cuda)
    # np.save('pred.npy', pred.cpu())
    print(pred)
The output from the first iteration looks good. But after that, each iteration's output differs from the others even with the same input image (see the attached pictures).
If I kill the process and rerun the code, the first iteration always gives the same output.
I printed the pred values and found that they do differ from the previous iteration even with the same input image and the same model.
I did more tests and found that this inconsistency starts at layer #6 of MobileNet in class MobileNetSkipAdd(nn.Module).
Layers #0-5 always produce the same output for the same input.
Is there anything I missed when using the model, or is this a bug?
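(Not an official answer, but two things worth ruling out are a model left in training mode and cuDNN's non-deterministic/autotuned kernels. A minimal check, reusing the names from the snippet above:)
model.eval()                               # freeze BatchNorm behavior, disable any dropout
torch.backends.cudnn.benchmark = False     # stop cuDNN from re-selecting algorithms between runs
torch.backends.cudnn.deterministic = True  # force deterministic kernels where available
with torch.no_grad():
    a = model(image_cuda)
    b = model(image_cuda)
print(torch.allclose(a, b))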
@fangchangma
Hi,
In your models.py file:
if pretrained:
    pretrained_path = os.path.join('imagenet', 'results', 'imagenet.arch=mobilenet.lr=0.1.bs=256', 'model_best.pth.tar')
    checkpoint = torch.load(pretrained_path)
Is model_best.pth.tar the pretrained weight of MobileNet?
Could you please provide a link to 'model_best.pth.tar'? Thank you very much.
Hi,
I'm wondering if you plan on releasing this file used to initialize training (I think):
imagenet/results/imagenet.arch=mobilenet.lr=0.1.bs=256/model_best.pth.tar
Or is this available somewhere else?
Thanks,
Daeyun
Hello
May I ask for a general direction on how I can run this on a PC? I vaguely understand that the PyTorch model has to be compiled again... briefly, how can I go about doing this?
Also, is there a way to run the PyTorch model on my PC without compiling it?
Thank you!!!
Hi,
I'm trying to run FastDepth on Ubuntu 18.04 with CUDA 9.1 and PyTorch 1.2, but without success when running "python3 main.py --evaluate [path_to_trained_model]". I'm getting the following error in transforms.py, line 337, in __call__:
return misc.imresize(img, self.size, self.interpolation)
AttributeError: module 'scipy.misc' has no attribute 'imresize'
So first, do you know which version of scipy you use?
Also, I tried all the pretrained models in "http://datasets.lids.mit.edu/fastdepth/results/" except for the tvm folder, but I don't know which one I can use without an embedded system.
Hi Diana,
Thank you for this interesting work.
Regarding the input image (input_fp) that you feed to the network in the tx2_run_tvm.py, I would like to know how you properly generate the .npy file (rgb.npy), please.
Thank you.
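(One plausible way to produce such a file; the 224x224 size, RGB channel order, float32 values in [0, 1], and NCHW layout are assumptions about what the compiled model expects, so compare against the provided rgb.npy first.)
import numpy as np
from PIL import Image

img = Image.open('input.jpg').convert('RGB').resize((224, 224))
arr = np.asarray(img, dtype=np.float32) / 255.0   # HWC, float32 in [0, 1]
arr = np.transpose(arr, (2, 0, 1))[None]          # NCHW: (1, 3, 224, 224)
np.save('rgb.npy', arr)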
I downloaded your pruned model and verified that it achieves delta1 >= 0.77.
I implemented the pruned model and trained it from the ImageNet-pretrained model,
but its accuracy is only around 0.6.
lr = 0.01
weight decay = 0.0001
SGD with momentum 0.9
That is, I didn't change these parameters at all.
Did you change those parameters while training with NetAdapt?
The weights [M] of MobileNet-NNConv5 with depthwise & skip-add reported in the paper "FastDepth: Fast Monocular Depth Estimation on Embedded Systems" is 3.93. But when I count the parameters of the class MobileNetSkipAdd(nn.Module) in your code, I get 3.96093 M.
I want to know: is this a mistake in the paper?
When I set the kernel size of NNConv5 to 3 instead of 5, the parameter count is 3.929186 ≈ 3.93 M.
The following code was used to count parameters:
print('# generator parameters:', 1.0 * sum(param.numel() for param in model.parameters())/1000000)
Hi, the problem is that when I run the evaluation code, I get:
Traceback (most recent call last):
File "main.py", line 15, in
args = utils.parse_command()
File "/[My_direction_path]/utils.py", line 15, in parse_command
from dataloaders.dataloader import MyDataloader
ImportError: No module named dataloaders.dataloader
I think the problem is with how the modules are linked/imported. Can anybody help me?
Hi, thanks for your work,
but when I download the models, the tar file turns out to be broken.
How to get the actual distance in meters from the depth image?
I want to train a model on kitti dataset, could you provide a training script? Thanks!
wget -r -np -nH --cut-dirs=2 --reject "index.html*" http://datasets.lids.mit.edu/fastdepth/results/
only returns robots.txt.
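(The server's robots.txt can stop recursive wget; telling wget to ignore it is a standard workaround and may help here:)
wget -e robots=off -r -np -nH --cut-dirs=2 --reject "index.html*" http://datasets.lids.mit.edu/fastdepth/results/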
I'm trying to test with my own inputs, but I'm not quite sure how to do it.
I thought it was handled in dataloader.py, but when I tried debugging it, apparently that class is only for the NYU dataset, right?
If you could explain how to properly do it, that would be very helpful.
Thanks!
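(A minimal sketch for feeding a single custom image through the released PyTorch model; the checkpoint layout and preprocessing details are assumptions that only roughly mirror the NYU loader.)
import numpy as np
import torch
from PIL import Image

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

img = Image.open('my_image.jpg').convert('RGB').resize((224, 224))
x = torch.from_numpy(np.asarray(img, np.float32) / 255.0).permute(2, 0, 1).unsqueeze(0)
with torch.no_grad():
    depth = model(x)    # shape (1, 1, 224, 224): the depth prediction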
Hi, I'm working on a TensorFlow implementation of Wofk et al. (2019), and I ran into something that I think may be a bug.
In train_transform(), the colors of the RGB image get jittered as follows:
rgb_np = self.color_jitter(rgb_np) # random color jittering
This calls the color_jitter() method, defined in NYUDataset's parent MyDataLoader:
color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4)
Note that it passes three parameters. ColorJitter is then defined as follows:
class ColorJitter(object):
    """Randomly change the brightness, contrast and saturation of an image.

    Args:
        brightness (float): How much to jitter brightness. brightness_factor
            is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].
        contrast (float): How much to jitter contrast. contrast_factor
            is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].
        saturation (float): How much to jitter saturation. saturation_factor
            is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].
        hue (float): How much to jitter hue. hue_factor is chosen uniformly from
            [-hue, hue]. Should be >= 0 and <= 0.5.
    """

    def __init__(self, brightness=0, contrast=0, saturation=0, hue=0):
        self.brightness = brightness
        self.contrast = contrast
        self.saturation = saturation
        self.hue = hue
The fourth parameter, hue, is not passed, so it defaults to zero. This seems odd, so I was wondering: is this intended behavior or a bug?
Hi, nice work.
I'm trying to run the evaluation code with CUDA 9.1 and PyTorch 1.2, but I'm getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: '../data/nyudepthv2/val'
Hi. I am currently trying to run the FastDepth algorithm in an OpenCV pipeline! Pardon me, as I am quite new to computer vision.
import cv2
import tx2_run_tvm_realtime_test
import tx2_run_tvm_realtime
import visualize_test
import visualize
import matplotlib as mp
import time
import numpy as np

cap = cv2.VideoCapture("/dev/video1")  # check this

while True:
    start = time.time()

    # read an RGB input
    # ret, img = cap.read()
    img = cv2.imread('rgb.png', 1)
    original_img = img
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    np.save('rgb2', img)
    # print('Original Dimensions : ', img.shape)

    # make the dimensions 224 by 224 for FastDepth to work
    dim = (224, 224)
    resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
    # resized = img

    # run the modified FD library function to get the output npy array
    output = tx2_run_tvm_realtime.run_model("rgb.npy", "resize2.npy", 0)    # this default npy file works
    # output = tx2_run_tvm_realtime.run_model("rgb2.npy", "resize2.npy", 0)  # this does not work

    # convert the npy array to a proper image array format using the modified visualize library
    visualize.save_pred_image('resize2.npy', 'resizedf2.png')
    # print(type(output_img))

    done = time.time()
    elapsed = done - start
    print("FPS: " + str(1 / elapsed))

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Don't mind the loop and the rest; they are there so I can run this on a video feed in the future. The main issue is that when I run the FastDepth algorithm using the npy file provided by the developers (rgb.npy), I get a nice output.
However, when I use my own npy file converted from rgb.png using OpenCV, the output is far from ideal. I have attached the two different outputs.
Am I doing something wrong with the conversion? I thought np.save would convert the png file to npy nicely and this should work...
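(A likely culprit is a dtype/layout/scale mismatch: cv2.imread gives a uint8 HWC array, while the shipped rgb.npy is presumably float32 and possibly channels-first with a batch dimension. Inspecting the working file and matching it is the safest route; a sketch, reusing resized from the code above:)
import numpy as np

ref = np.load('rgb.npy')                   # the file that works
print(ref.shape, ref.dtype, ref.min(), ref.max())

my = resized.astype(np.float32) / 255.0    # match the scale first (assuming ref lies in [0, 1])
if ref.ndim == 4:                          # e.g. (1, 3, 224, 224): channels-first plus batch dim
    my = np.transpose(my, (2, 0, 1))[None]
np.save('rgb2.npy', my.astype(ref.dtype))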
=> [TVM on TX2] using model files in ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/
=> [TVM on TX2] loading model lib and ptx
Traceback (most recent call last):
File "tx2_run_tvm.py", line 91, in
main()
File "tx2_run_tvm.py", line 88, in main
run_model(args.model_dir, args.input_fp, args.output_fp, args.warmup, args.run, args.cuda, try_randin=args.randin)
File "tx2_run_tvm.py", line 13, in run_model
loaded_lib = tvm.module.load(os.path.join(model_dir, "deploy_lib.o"))
File "/home/lvhao/tvm/python/tvm/module.py", line 216, in load
_cc.create_shared(path + ".so", path)
File "/home/lvhao/tvm/python/tvm/contrib/cc.py", line 33, in create_shared
_linux_shared(output, objects, options, cc)
File "/home/lvhao/tvm/python/tvm/contrib/cc.py", line 58, in _linux_shared
raise RuntimeError(msg)
RuntimeError: Compilation error:
/usr/bin/ld: ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: Relocations in generic ELF (EM: 183)
../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status
I can't open the following data and model links:
http://datasets.lids.mit.edu/fastdepth/data/nyudepthv2.tar.gz
http://datasets.lids.mit.edu/fastdepth/results/
Hi guys
How long does it take to pretrain the model on ImageNet?
Hope to receive your answer.
Thank you.
Hello,
I used TVM auto-tuning and ran inference, but it only runs at 60 FPS, whereas the paper mentions it being much faster.
Can I get the TVM tuning code?
Sorry, I'm new to PyTorch and TVM, so maybe I am misunderstanding things...
As far as I know, all models in results/tvm_compiled/ are compiled for the Jetson TX2.
What do I need to do to run a compiled model on a Raspberry Pi, as in tx2_run_tvm.py?
I guess I have to compile the model with TVM as explained in this tutorial:
https://docs.tvm.ai/tutorials/frontend/from_pytorch.html
Can anyone confirm?
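(Roughly, yes. With a recent TVM the flow would look like the sketch below; note that the released artifacts and tx2_run_tvm.py were built against an older TVM API, so the file names and loader code on the Pi side will differ. The target triple, input name, shape, and checkpoint layout are all assumptions.)
import torch
import tvm
from tvm import relay
from tvm.contrib import cc

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

scripted = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
mod, params = relay.frontend.from_pytorch(scripted, [('input0', (1, 3, 224, 224))])

target = tvm.target.Target('llvm -mtriple=aarch64-linux-gnu')   # 64-bit Pi OS; use an armv7 triple for 32-bit
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

lib.export_library('deploy_lib.so', cc.cross_compiler('aarch64-linux-gnu-g++'))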
#2
I am also interested: will you release the training code?
I tried converting the .pt (PyTorch) model to both .onnx and tfjs formats,
in order to deploy them in the browser as well as on a Node server (on CPU).
The inference speeds average around 1500-1700 ms.
At the same time, I found an iOS example on fastdepth.github.io that averages an excellent 40 fps.
Am I missing anything on my browser/cpu implementations? Any additional processing to be done?
Thanks
Hi, imresize has been deprecated and removed from scipy.
You may consider replacing it with Pillow instead: numpy.array(Image.fromarray(arr).resize()).
The error happens in transforms.py, line 338.
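(In transforms.py the drop-in might look like the sketch below; it assumes self.size is an (h, w) tuple — if it is a scale factor, compute the target size from img.shape first — and remember that PIL's resize takes (width, height).)
import numpy as np
from PIL import Image

def imresize(img, size, interp='bilinear'):
    resample = {'nearest': Image.NEAREST, 'bilinear': Image.BILINEAR, 'bicubic': Image.BICUBIC}[interp]
    h, w = size
    return np.array(Image.fromarray(img).resize((w, h), resample))

# was: misc.imresize(img, self.size, self.interpolation)
# now: imresize(img, self.size, self.interpolation)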
Has mobilenet-nnconv5dw-skipadd-pruned.pth.tar been optimized or compiled with a specific version of TVM?
For deployment, when I use TVM 0.6 there is an error with the function "get_nums_output" (I have checked that TVM 0.6 has this extra function). But when I use TVM 0.4, this error occurs:
"Operator {} not implemented.".format(op_name)) NotImplementedError: Operator Upsample not implemented.
These are probably all due to the TVM version.
In addition, I use the following pipeline to deploy:
mobilenet-nnconv5dw-skipadd-pruned.pth.tar --> ONNX model --> TVM deploy model
Is there any advice on the model conversion?
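(For the .pth.tar --> ONNX step, the standard export call is roughly the sketch below; the opset version and tensor names are assumptions, and whether TVM's ONNX importer handles the Upsample/interpolate layers depends on the opset and TVM version, which is consistent with the errors above.)
import torch

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

torch.onnx.export(model, torch.randn(1, 3, 224, 224), 'fastdepth.onnx',
                  input_names=['input0'], output_names=['depth0'], opset_version=11)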
Hi, thanks for sharing such an interesting work.
When I train and validate FastDepth according to your method, the RMSE metric is always about 600, far from the ~0.60 in your paper. I wonder if there is anything wrong in the way RMSE is calculated, but I can't find it.
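(Just a guess, but 600 versus ~0.60 is exactly a factor of 1000, which usually means depth is being handled in millimeters somewhere while the paper's numbers are in meters. A quick sanity check on one ground-truth sample, with a hypothetical variable name:)
print(depth_np.min(), depth_np.max())   # NYU indoor depth in meters should lie roughly in 0.5-10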