dwofk / fast-depth
ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems"
License: MIT License
Hi, @dwofk @fangchangma
I noticed that the pruning of fast-depth used NetAdapt, whose framework is in Chainer. Did you first convert the model into a Chainer model and then prune it? Or was MobileNet pruned first and then combined with the decoder as a pre-trained model, with pruning applied again at the end?
I may not be describing it very clearly; the two pruning workflows I mean are:
fast-depth (pytorch) --> fast-depth (chainer) --> pruning
OR
mobilenet (chainer) --> pruning --> mobilenet-pruned (pytorch) --> + decoder --> training --> fast-depth (chainer) --> pruning
The second one is probably more complex. Could you give me some advice? Thanks a lot!
Hello, I have been studying your paper recently. I have reproduced the results, but I want to train on a new dataset. How do I do that? The NYU Depth v2 dataset here is provided as .h5 files, but VRD contains .jpg files. Waiting for your reply.
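(For what it's worth, here is a minimal PyTorch Dataset sketch for a folder of paired RGB .jpg / depth .png files; the directory layout, file extensions, and 224x224 size are assumptions, and this is not the repo's own .h5 loader.)
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class FolderDepthDataset(Dataset):
    """Expects <root>/rgb/xxx.jpg paired with <root>/depth/xxx.png."""
    def __init__(self, root, size=(224, 224)):
        self.root, self.size = root, size
        self.names = sorted(os.listdir(os.path.join(root, 'rgb')))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        rgb = Image.open(os.path.join(self.root, 'rgb', name)).convert('RGB').resize(self.size)
        depth = Image.open(os.path.join(self.root, 'depth', name.replace('.jpg', '.png'))).resize(self.size)
        rgb = torch.from_numpy(np.asarray(rgb, np.float32) / 255.0).permute(2, 0, 1)   # (3, H, W)
        depth = torch.from_numpy(np.asarray(depth, np.float32)).unsqueeze(0)           # (1, H, W)
        return rgb, depth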
Is it possible to train the model with the KITTI odometry dataset, which has sparse ground-truth depth images?
Hi, thank you for sharing this wonderful work.
On the project page, I noticed there is a smartphone demo video. I wonder whether the network used in that demo was trained only on NYU Depth v2, or whether more data was used. If the demo's network was trained on more data, which datasets did you use?
Thank you.
How can I train the model on custom data?
I used the pretrained model directly, but there is a bug: the dict of the pretrained MobileNet has no "model" or "best_result" key.
Hello, I am new to PyTorch and am using fast-depth for my research.
I planned to use the pruned network, but on loading the model I get an error about the current model and the checkpoint model mismatching in size. This mismatch is due to a difference in the dimensions of the conv layers between the pruned checkpoint and model.py. Please let me know how I can solve this issue.
Hi, thanks for your work,
but when I download the pre-trained model, the tar file is broken and cannot be extracted.
How can I fix it?
After I downloaded the trained models, I found the tar files cannot be extracted. Could you upload the files again?
Hi,
Congratulations for your nice work.
I'm trying to train the depth estimation model on NYU Depth V2.
Are there more preprocessing steps needed after downloading the dataset from http://datasets.lids.mit.edu/fastdepth/data/nyudepthv2.tar.gz?
Or should I simply load the data from the 'train' directory and begin training?
Thank you.
Hi Diana,
After loading mobilenet-nnconv5-skipadd-pruned,
the last layer "decode_conv6" outputs numbers between 1 and 5, whether or not I normalize my input image.
It is clearly not a normalized image, and the values do not make sense for an image in [0, 255].
I am not sure where I went wrong, or whether the output is not meant to be a grayscale image?
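(Values between 1 and 5 are plausible if the last layer is regressing metric depth in meters rather than producing an image: indoor NYU scenes mostly span roughly 0.5-10 m. To view the prediction as a grayscale image you would rescale it yourself; a minimal sketch, with hypothetical variable names:)
import numpy as np

depth = pred.squeeze().cpu().numpy()                         # predicted depth map, presumably in meters
gray = (255 * (depth - depth.min()) / (depth.ptp() + 1e-8)).astype(np.uint8)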
Hi, I really appreciate this work. But how does it perform on the KITTI dataset?
Do you have a pre-trained model for KITTI, or can you tell me how to apply it to KITTI?
I look forward to your reply!
Good luck!
Hi,
The pretrained models given in this repo (http://datasets.lids.mit.edu/fastdepth/results/) are corrupted. I am not able to uncompress them after downloading. Please provide a fresh link to download the pretrained models.
Thanks in advance.
Dear sir,
I trained the model on the KITTI dataset, but the results are not good. Is there any way to improve the accuracy?
Thank you.
Thanks for this great work. I am currently trying to train fast-depth with my own dataset. I have noticed there are no training scripts, so I would like to ask: which depth losses are used in training?
It would be very nice if anyone could suggest which losses I should pick.
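(The paper reports training with an L1 loss on depth. A masked variant that ignores pixels without valid ground truth is a common starting point; this is a sketch, not the authors' training code.)
import torch.nn as nn

class MaskedL1Loss(nn.Module):
    """L1 loss computed only over pixels with valid (nonzero) ground-truth depth."""
    def forward(self, pred, target):
        valid = (target > 0).detach()
        return (pred - target)[valid].abs().mean()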
Hi, when I run the evaluation script with CUDA 9.2 and Python 3, I get the following error:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
I tried to solve this by changing the following line in the main code:
checkpoint = torch.load(args.evaluate)
to:
checkpoint = torch.load('modelpath', map_location=torch.device('cpu'))
After that, the current error is:
best_result = checkpoint['best_result']
KeyError: 'best_result'
Can anybody help me to solve this?
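(A defensive loading sketch that works whether the file stores a plain serialized model or a dict with extra bookkeeping fields; the key names below just mirror the errors above, so treat them as assumptions.)
import torch

checkpoint = torch.load('modelpath', map_location=torch.device('cpu'))
if isinstance(checkpoint, dict):
    model = checkpoint.get('model', checkpoint)         # fall back if there is no 'model' entry
    best_result = checkpoint.get('best_result', None)   # avoid the KeyError when the field is absent
else:
    model = checkpoint                                   # the file may hold the model object directly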
Hi!,
I get this error when I run the example from the Github description.
jetson@jetson:~/fast-depth/deploy$ python3 tx2_run_tvm.py --input-fp data/rgb.npy --output-fp data/pred.npy --model-dir ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/ --cuda True
=> [TVM on TX2] using model files in ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/
=> [TVM on TX2] loading model lib and ptx
=> [TVM on TX2] loading model graph and params
=> [TVM on TX2] creating TVM runtime module
=> [TVM on TX2] feeding inputs and params into TVM module
=> [TVM on TX2] running TVM module, saving output
Traceback (most recent call last):
File "tx2_run_tvm.py", line 91, in <module>
main()
File "tx2_run_tvm.py", line 88, in main
run_model(args.model_dir, args.input_fp, args.output_fp, args.warmup, args.run, args.cuda, try_randin=args.randin)
File "tx2_run_tvm.py", line 36, in run_model
run() # not gmodule.run()
File "/home/jetson/tvm/python/tvm/_ffi/_ctypes/function.py", line 207, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (3) /home/jetson/tvm/build/libtvm.so(TVMFuncCall+0x70) [0x7fad7ccec0]
[bt] (2) /home/jetson/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::detail::PackFuncVoidAddr_<4, tvm::runtime::CUDAWrappedFunc>(tvm::runtime::CUDAWrappedFunc, std::vector<tvm::runtime::detail::ArgConvertCode, std::allocator<tvm::runtime::detail::ArgConvertCode> > const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0xe8) [0x7fad850b08]
[bt] (1) /home/jetson/tvm/build/libtvm.so(tvm::runtime::CUDAWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void**) const+0x6cc) [0x7fad85093c]
[bt] (0) /home/jetson/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4c) [0x7facfdebac]
File "/home/jetson/tvm/src/runtime/cuda/cuda_module.cc", line 110
File "/home/jetson/tvm/src/runtime/library_module.cc", line 91
CUDAError: Check failed: ret == 0 (-1 vs. 0) : cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX
I can't seem to find a solution to this. Could you please help?
Hi
Thanks for this nice work. I am trying to reproduce this work on my machine.
What I observed is that the model's output is not consistent if I run inference multiple times in a while loop with the same input.
while True:
    image_cuda = torch.from_numpy(img).float().cuda()
    pred = 0
    print(pred)
    with torch.no_grad():
        pred = model(image_cuda)
    # np.save('pred.npy', pred.cpu())
    print(pred)
The output from the first iteration looks good. But after that, each iteration's output differs from the others even with the same input image (see the attached pictures).
If I kill the process and rerun the code, the first iteration always gives the same output.
I printed the pred values and found that they do differ from the previous iteration even with the same input image and the same model.
I did more tests and found that this inconsistency starts at layer #6 of MobileNet in class MobileNetSkipAdd(nn.Module).
Layers #0-5 always produce the same output for the same input.
Is there anything I missed when using the model, or is this a bug?
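(Not an official answer, but two things worth ruling out are a model left in training mode and cuDNN's non-deterministic/autotuned kernels. A minimal check, reusing the names from the snippet above:)
model.eval()                               # freeze BatchNorm behavior, disable any dropout
torch.backends.cudnn.benchmark = False     # stop cuDNN from re-selecting algorithms between runs
torch.backends.cudnn.deterministic = True  # force deterministic kernels where available
with torch.no_grad():
    a = model(image_cuda)
    b = model(image_cuda)
print(torch.allclose(a, b))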
@fangchangma
Hi,
In your models.py file:
if pretrained:
    pretrained_path = os.path.join('imagenet', 'results', 'imagenet.arch=mobilenet.lr=0.1.bs=256', 'model_best.pth.tar')
    checkpoint = torch.load(pretrained_path)
Is model_best.pth.tar the pretrained weight of MobileNet?
Could you please provide a link to 'model_best.pth.tar'? Thank you very much.
Hi,
I'm wondering if you plan on releasing this file used to initialize training (I think):
imagenet/results/imagenet.arch=mobilenet.lr=0.1.bs=256/model_best.pth.tar
Or is this available somewhere else?
Thanks,
Daeyun
Hello
May I ask for a general direction on how I can run this on a PC? I vaguely understand that the PyTorch model has to be compiled again... briefly, how can I go about doing this?
Also, is there a way to run the PyTorch model on my PC without compiling it?
Thank you!!!
Hi,
I'm trying to run FastDepth on Ubuntu 18.04 with CUDA 9.1 and PyTorch 1.2, but without success when running "python3 main.py --evaluate [path_to_trained_model]". I'm getting the following error in transforms.py, line 337, in __call__:
return misc.imresize(img, self.size, self.interpolation)
AttributeError: module 'scipy.misc' has no attribute 'imresize'
So first, do you know which version of scipy you use?
Also, I tried all the pretrained models in "http://datasets.lids.mit.edu/fastdepth/results/" except for the tvm folder, but I don't know which one I can use without an embedded system.
Hi Diana,
Thank you for this interesting work.
Regarding the input image (input_fp) that you feed to the network in the tx2_run_tvm.py, I would like to know how you properly generate the .npy file (rgb.npy), please.
Thank you.
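(One plausible way to produce such a file; the 224x224 size, RGB channel order, float32 values in [0, 1], and NCHW layout are assumptions about what the compiled model expects, so compare against the provided rgb.npy first.)
import numpy as np
from PIL import Image

img = Image.open('input.jpg').convert('RGB').resize((224, 224))
arr = np.asarray(img, dtype=np.float32) / 255.0   # HWC, float32 in [0, 1]
arr = np.transpose(arr, (2, 0, 1))[None]          # NCHW: (1, 3, 224, 224)
np.save('rgb.npy', arr)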
I downloaded your pruned model and verified that it achieves delta1 >= 0.77.
I implemented the pruned model and trained it from the ImageNet-pretrained model,
but its accuracy is only around 0.6.
lr = 0.01
weight decay = 0.0001
SGD with momentum 0.9
That is, I didn't change these parameters at all.
Did you change those parameters while training with NetAdapt?
The weights [M] of MobileNet-NNConv5 with depthwise & skip-add reported in the paper "FastDepth: Fast Monocular Depth Estimation on Embedded Systems" is 3.93. But when I count the parameters of the class MobileNetSkipAdd(nn.Module) in your code, I get 3.96093 M.
I want to know: is this a mistake in the paper?
When I set the kernel size of NNConv5 to 3 instead of 5, the parameter count is 3.929186 ≈ 3.93 M.
The following code was used to count parameters:
print('# generator parameters:', 1.0 * sum(param.numel() for param in model.parameters())/1000000)
Hi, the problem is that when I run the evaluation code, I get:
Traceback (most recent call last):
File "main.py", line 15, in
args = utils.parse_command()
File "/[My_direction_path]/utils.py", line 15, in parse_command
from dataloaders.dataloader import MyDataloader
ImportError: No module named dataloaders.dataloader
I think the problem is with how the modules are linked/imported. Can anybody help me?
Hi, thanks for your work,
but when I download the models, the tar file turns out to be broken.
How to get the actual distance in meters from the depth image?
I want to train a model on kitti dataset, could you provide a training script? Thanks!
wget -r -np -nH --cut-dirs=2 --reject "index.html*" http://datasets.lids.mit.edu/fastdepth/results/
only returns robots.txt.
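(The server's robots.txt can stop recursive wget; telling wget to ignore it is a standard workaround and may help here:)
wget -e robots=off -r -np -nH --cut-dirs=2 --reject "index.html*" http://datasets.lids.mit.edu/fastdepth/results/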
I'm trying to test with my own inputs, but I'm not quite sure how to do it.
I thought it was handled in dataloader.py, but when I tried debugging it, apparently that class is only for the NYU dataset, right?
If you could explain how to properly do it, that would be very helpful.
Thanks!
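(A minimal sketch for feeding a single custom image through the released PyTorch model; the checkpoint layout and preprocessing details are assumptions that only roughly mirror the NYU loader.)
import numpy as np
import torch
from PIL import Image

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

img = Image.open('my_image.jpg').convert('RGB').resize((224, 224))
x = torch.from_numpy(np.asarray(img, np.float32) / 255.0).permute(2, 0, 1).unsqueeze(0)
with torch.no_grad():
    depth = model(x)    # shape (1, 1, 224, 224): the depth prediction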
Hi, I'm working on a TensorFlow implementation of Wofk et al. (2019), and I ran into something that I think may be a bug.
In train_transform(), the colors of the RGB image get jittered as follows:
rgb_np = self.color_jitter(rgb_np) # random color jittering
This calls the color_jitter() method, defined in NYUDataset's parent MyDataLoader:
color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4)
Note that it passes three parameters. ColorJitter is then defined as follows:
class ColorJitter(object):
    """Randomly change the brightness, contrast and saturation of an image.

    Args:
        brightness (float): How much to jitter brightness. brightness_factor
            is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].
        contrast (float): How much to jitter contrast. contrast_factor
            is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].
        saturation (float): How much to jitter saturation. saturation_factor
            is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].
        hue (float): How much to jitter hue. hue_factor is chosen uniformly from
            [-hue, hue]. Should be >= 0 and <= 0.5.
    """

    def __init__(self, brightness=0, contrast=0, saturation=0, hue=0):
        self.brightness = brightness
        self.contrast = contrast
        self.saturation = saturation
        self.hue = hue
The fourth parameter, hue, is not passed, so it defaults to zero. This seems odd, so I was wondering: is this intended behavior or a bug?
Hi, nice work.
I'm trying to run the evaluation code with CUDA 9.1 and PyTorch 1.2, but I'm getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: '../data/nyudepthv2/val'
Hi. I am currently trying to run the FastDepth algorithm in an OpenCV pipeline! Pardon me, as I am quite new to computer vision.
import cv2
import tx2_run_tvm_realtime_test
import tx2_run_tvm_realtime
import visualize_test
import visualize
import matplotlib as mp
import time
import numpy as np

cap = cv2.VideoCapture("/dev/video1")  # check this

while True:
    start = time.time()

    # read an RGB input
    # ret, img = cap.read()
    img = cv2.imread('rgb.png', 1)
    original_img = img
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    np.save('rgb2', img)
    # print('Original Dimensions : ', img.shape)

    # make the dimensions 224 by 224 for FastDepth to work
    dim = (224, 224)
    resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
    # resized = img

    # run the modified FD library function to get the output npy array
    output = tx2_run_tvm_realtime.run_model("rgb.npy", "resize2.npy", 0)    # this default npy file works
    # output = tx2_run_tvm_realtime.run_model("rgb2.npy", "resize2.npy", 0)  # this does not work

    # convert the npy array to a proper image array format using the modified visualize library
    visualize.save_pred_image('resize2.npy', 'resizedf2.png')
    # print(type(output_img))

    done = time.time()
    elapsed = done - start
    print("FPS: " + str(1 / elapsed))

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Don't mind the loop and the rest; they are there so I can run this on a video feed in the future. The main issue is that when I run the FastDepth algorithm using the npy file provided by the developers (rgb.npy), I get a nice output.
However, when I use my own npy file converted from rgb.png using OpenCV, the output is far from ideal. I have attached the two different outputs.
Am I doing something wrong with the conversion? I thought np.save would convert the png file to npy nicely and this should work...
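(A likely culprit is a dtype/layout/scale mismatch: cv2.imread gives a uint8 HWC array, while the shipped rgb.npy is presumably float32 and possibly channels-first with a batch dimension. Inspecting the working file and matching it is the safest route; a sketch, reusing resized from the code above:)
import numpy as np

ref = np.load('rgb.npy')                   # the file that works
print(ref.shape, ref.dtype, ref.min(), ref.max())

my = resized.astype(np.float32) / 255.0    # match the scale first (assuming ref lies in [0, 1])
if ref.ndim == 4:                          # e.g. (1, 3, 224, 224): channels-first plus batch dim
    my = np.transpose(my, (2, 0, 1))[None]
np.save('rgb2.npy', my.astype(ref.dtype))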
=> [TVM on TX2] using model files in ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/
=> [TVM on TX2] loading model lib and ptx
Traceback (most recent call last):
File "tx2_run_tvm.py", line 91, in
main()
File "tx2_run_tvm.py", line 88, in main
run_model(args.model_dir, args.input_fp, args.output_fp, args.warmup, args.run, args.cuda, try_randin=args.randin)
File "tx2_run_tvm.py", line 13, in run_model
loaded_lib = tvm.module.load(os.path.join(model_dir, "deploy_lib.o"))
File "/home/lvhao/tvm/python/tvm/module.py", line 216, in load
_cc.create_shared(path + ".so", path)
File "/home/lvhao/tvm/python/tvm/contrib/cc.py", line 33, in create_shared
_linux_shared(output, objects, options, cc)
File "/home/lvhao/tvm/python/tvm/contrib/cc.py", line 58, in _linux_shared
raise RuntimeError(msg)
RuntimeError: Compilation error:
/usr/bin/ld: ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: ../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: Relocations in generic ELF (EM: 183)
../tvm_compile/tx2_cpu_mobilenet_nnconv5dw_skipadd_pruned/deploy_lib.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status
I can't open the following data and model links:
http://datasets.lids.mit.edu/fastdepth/data/nyudepthv2.tar.gz
http://datasets.lids.mit.edu/fastdepth/results/
Hi guys
How long does it take to pretrain the model on ImageNet?
Hope to receive your answer.
Thank you.
Hello,
I used TVM auto-tuning and ran inference, but it only runs at 60 FPS, whereas the paper mentions it being much faster.
Can I get the TVM tuning code?
Sorry, I'm new to PyTorch and TVM, so maybe I am misunderstanding things...
As far as I know, all models in results/tvm_compiled/ are compiled for the Jetson TX2.
What do I need to do to run a compiled model on a Raspberry Pi, as in tx2_run_tvm.py?
I guess I have to compile the model with TVM as explained in this tutorial:
https://docs.tvm.ai/tutorials/frontend/from_pytorch.html
Can anyone confirm?
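(Roughly, yes. With a recent TVM the flow would look like the sketch below; note that the released artifacts and tx2_run_tvm.py were built against an older TVM API, so the file names and loader code on the Pi side will differ. The target triple, input name, shape, and checkpoint layout are all assumptions.)
import torch
import tvm
from tvm import relay
from tvm.contrib import cc

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

scripted = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
mod, params = relay.frontend.from_pytorch(scripted, [('input0', (1, 3, 224, 224))])

target = tvm.target.Target('llvm -mtriple=aarch64-linux-gnu')   # 64-bit Pi OS; use an armv7 triple for 32-bit
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

lib.export_library('deploy_lib.so', cc.cross_compiler('aarch64-linux-gnu-g++'))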
#2
I am also interested: will you release the training code?
I tried converting the .pt (PyTorch) model to both .onnx and tfjs formats,
in order to deploy them in the browser as well as on a Node server (on CPU).
The inference speeds average around 1500-1700 ms.
At the same time, I found an iOS example on fastdepth.github.io that averages an excellent 40 fps.
Am I missing anything on my browser/cpu implementations? Any additional processing to be done?
Thanks
Hi, imresize has been deprecated and removed from scipy.
You may consider replacing it with Pillow instead: numpy.array(Image.fromarray(arr).resize()).
The error happens in transforms.py, line 338.
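(In transforms.py the drop-in might look like the sketch below; it assumes self.size is an (h, w) tuple — if it is a scale factor, compute the target size from img.shape first — and remember that PIL's resize takes (width, height).)
import numpy as np
from PIL import Image

def imresize(img, size, interp='bilinear'):
    resample = {'nearest': Image.NEAREST, 'bilinear': Image.BILINEAR, 'bicubic': Image.BICUBIC}[interp]
    h, w = size
    return np.array(Image.fromarray(img).resize((w, h), resample))

# was: misc.imresize(img, self.size, self.interpolation)
# now: imresize(img, self.size, self.interpolation)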
Has mobilenet-nnconv5dw-skipadd-pruned.pth.tar been optimized or compiled with a specific version of TVM?
For deployment, when I use TVM 0.6 there is an error with the function "get_nums_output" (I have checked that TVM 0.6 has this extra function). But when I use TVM 0.4, this error occurs:
"Operator {} not implemented.".format(op_name)) NotImplementedError: Operator Upsample not implemented.
These are probably all due to the TVM version.
In addition, I use the following pipeline to deploy:
mobilenet-nnconv5dw-skipadd-pruned.pth.tar --> ONNX model --> TVM deploy model
Is there any advice on the model conversion?
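(For the .pth.tar --> ONNX step, the standard export call is roughly the sketch below; the opset version and tensor names are assumptions, and whether TVM's ONNX importer handles the Upsample/interpolate layers depends on the opset and TVM version, which is consistent with the errors above.)
import torch

ckpt = torch.load('mobilenet-nnconv5dw-skipadd-pruned.pth.tar', map_location='cpu')
model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
model.eval()

torch.onnx.export(model, torch.randn(1, 3, 224, 224), 'fastdepth.onnx',
                  input_names=['input0'], output_names=['depth0'], opset_version=11)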
Hi, thanks for sharing such an interesting work.
When I train and validate FastDepth according to your method, the RMSE metric is always about 600, far from the ~0.60 in your paper. I wonder if there is anything wrong in the way RMSE is calculated, but I can't find it.
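(Just a guess, but 600 versus ~0.60 is exactly a factor of 1000, which usually means depth is being handled in millimeters somewhere while the paper's numbers are in meters. A quick sanity check on one ground-truth sample, with a hypothetical variable name:)
print(depth_np.min(), depth_np.max())   # NYU indoor depth in meters should lie roughly in 0.5-10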