kakaxi314 / guidenet
Implementation for our paper 'Learning Guided Convolutional Network for Depth Completion'
License: MIT License
Hello! I notice that the released code doesn't include the method for training and evaluating on the NYU Depth v2 dataset. How can I get those results? Can anyone tell me? Thank you very much!
Thank you for your nice work.
Since the code is not yet open, I am writing down my question about your kernel design.
In the paper, given the image feature information, you use this feature as the convolution weight.
My question is: when the batch size of the input images is larger than 1, how did you compute the convolution operation?
In PyTorch, the kernel shape of a CNN is [out_channels, in_channels, kernel_size, kernel_size], which means the same kernels are applied to every sample in a multi-batch tensor (e.g. img.shape = [batch, channel, height, width]).
So I guess that you may split the multi-batch tensor into single-batch tensors for computation... or did you re-implement the convolution operation yourself?
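For reference, one way to handle a batch larger than 1 without splitting it is `F.unfold`: extract the K x K neighborhoods once, then do an element-wise multiply-and-sum against the per-pixel kernels. This is only a sketch of how it could be done, not the authors' actual (CUDA) implementation; the function name and kernel layout are my own assumptions.

```python
import torch
import torch.nn.functional as F

def spatially_variant_conv(x, kernels, K=3):
    """Apply per-pixel, per-channel K x K kernels to a batched tensor.

    x:       [B, C, H, W] input features
    kernels: [B, C*K*K, H, W] predicted filter weights (one K x K kernel
             per channel per spatial location) -- layout is an assumption
    """
    B, C, H, W = x.shape
    # unfold extracts the K x K patches around every pixel: [B, C*K*K, H*W]
    patches = F.unfold(x, kernel_size=K, padding=K // 2)
    patches = patches.view(B, C, K * K, H, W)
    w = kernels.view(B, C, K * K, H, W)
    # weighted sum over the K*K taps, independently per channel and per pixel
    return (patches * w).sum(dim=2)  # [B, C, H, W]
```

The whole batch goes through a single vectorized op, so no per-sample loop or custom convolution is strictly required.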
Hi, thank you for the great work. I'm wondering when the code will be uploaded.
As the title said, since you creat the repo, 7 mon have passed, plz share the code, thks
Are there some private issue not releasing the code? The last sentence in the paper is just for passing the review?
Thank you for your great project.
When I compile this project, it shows:
unable to execute ':/usr/local/cuda/bin/nvcc': No such file or directory
error: command ':/usr/local/cuda/bin/nvcc' failed with exit status 1
my environment is: Ubuntu 18.04, Python 3.6.9, PyTorch 1.4.0, CUDA 10.0. When I run other projects on the same platform, it works fine.
I don't know why this issue happens; could you share your environment settings please? Thank you very much!
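One guess (not confirmed by the authors): the leading colon in `':/usr/local/cuda/bin/nvcc'` suggests the build is concatenating an empty or malformed CUDA path variable before `/usr/local/cuda/bin/nvcc`. If so, explicitly exporting the CUDA environment before compiling may help; the paths below assume a standard `/usr/local/cuda` install and should be adjusted to yours.

```shell
# Assumes CUDA is installed under /usr/local/cuda (adjust if not)
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# Verify nvcc is actually found before re-running the build
which nvcc
nvcc --version
```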
Dear author,
I tried to re-implement your GuidedConv based on your paper, but I found the training process is very unstable.
Here is my PyTorch implementation:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class _GuideConv(nn.Module):
    def __init__(self, img_in, sparse_in, sparse_out, K=3):
        super(_GuideConv, self).__init__()
        # KGL: kernel generating layer
        self.conv1 = nn.Conv2d(img_in, K * K * sparse_in, kernel_size=3,
                               stride=1, padding=1, groups=1, bias=False)
        self.bn1 = nn.BatchNorm2d(K * K * sparse_in)
        self.fc = nn.Linear(img_in, sparse_in * sparse_out, bias=False)
        self.bn2 = nn.BatchNorm2d(sparse_out)
        self.img_in = img_in
        self.sparse_in = sparse_in
        self.sparse_out = sparse_out

    def forward(self, G, S):
        '''
        G: input guidance
        S: input source feature
        '''
        # stage 1: spatially-variant weighting
        W1 = self.conv1(G)  # [B, Cin*K*K, H, W]
        W1 = self.bn1(W1)
        depths = torch.chunk(S, self.sparse_in, 1)    # Cin * [B, 1, H, W]
        kernels = torch.chunk(W1, self.sparse_in, 1)  # Cin * [B, K*K, H, W]
        S1 = []
        for i in range(self.sparse_in):
            S1.append(torch.sum(depths[i] * kernels[i], 1, keepdim=True))
        S1 = torch.cat(S1, 1)  # [B, Cin, H, W]
        # stage 2: cross-channel convolution with per-sample 1x1 kernels
        W2 = F.adaptive_avg_pool2d(G, (1, 1))
        B = W2.size(0)
        W2 = W2.reshape(B, -1)  # [B, img_in]
        W2 = self.fc(W2)
        W2 = W2.view([B, self.sparse_out, self.sparse_in])
        depths = torch.chunk(S1, B, 0)   # B * [1, Cin, H, W]
        kernels = torch.chunk(W2, B, 0)  # B * [1, Cout, Cin]
        S2 = []
        for i in range(B):
            weight = kernels[i][0].unsqueeze(-1).unsqueeze(-1)  # [Cout, Cin, 1, 1]
            S2.append(F.conv2d(depths[i], weight, bias=None, stride=1, padding=0))
        S2 = torch.cat(S2, 0)  # [B, Cout, H, W]
        S2 = F.relu(self.bn2(S2))
        return S2
```
Could you give me some advice, or just open-source this part of the code? 🎉
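Incidentally, the per-sample loop over the batch in the cross-channel stage above can be collapsed into one batched `einsum`; a sketch of that equivalence (my own suggestion, not from the paper):

```python
import torch

def cross_channel_conv(s1, w2):
    """Batched 1x1 convolution with a different kernel per sample.

    s1: [B, C_in, H, W]  features after the spatially-variant stage
    w2: [B, C_out, C_in] per-sample 1x1 kernels from the guidance branch

    Equivalent to looping over the batch and calling F.conv2d per sample,
    but done in a single vectorized op.
    """
    return torch.einsum('boi,bihw->bohw', w2, s1)
```

Whether this changes the training instability I don't know, but it at least removes the Python-level loop.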
How effective is your code on indoor datasets? Can it be used to fill holes left by depth cameras? When will your code be released?
I guess the source code for this repo will never be released, and since people are confused by the "Guided Conv Module" in the paper, I'm going to share my naive PyTorch implementation, based on the CSPN code.
Note: this implementation is based on my own understanding of the paper; I'm not sure if it's correct. Besides, I got worse results with this module compared to the ordinary concat/add fusion methods.
Let me know if you have any questions or suggestions.
```python
import torch
import torch.nn as nn

class _GuidedConv(nn.Module):
    def __init__(self):
        super(_GuidedConv, self).__init__()
        self.pad_left_top = nn.ZeroPad2d((1, 0, 1, 0))
        self.pad_center_top = nn.ZeroPad2d((0, 0, 1, 0))
        self.pad_right_top = nn.ZeroPad2d((0, 1, 1, 0))
        self.pad_left_middle = nn.ZeroPad2d((1, 0, 0, 0))
        self.pad_right_middle = nn.ZeroPad2d((0, 1, 0, 0))
        self.pad_left_bottom = nn.ZeroPad2d((1, 0, 0, 1))
        self.pad_center_bottom = nn.ZeroPad2d((0, 0, 0, 1))
        self.pad_right_bottom = nn.ZeroPad2d((0, 1, 0, 1))

    def forward(self, x, cw: list, cc: list):
        """
        `x`: input feature maps with size `[B, C_in, H, W]`
        `cw`: `C_in` channel-wise kernels, each with size [B, 3*3, H, W]
        `cc`: `C_out` cross-channel 1*1 kernels, each with size [B, C_in]
        """
        # stage 1: weight x with the spatially-variant kernels in `cw`
        tmp = []
        for i in range(len(cw)):
            # out-of-place unsqueeze: in-place ops on indexed views can break autograd
            feat = self._compose_feat(x[:, i, :, :].unsqueeze(1))
            feat = feat * cw[i]
            tmp.append(torch.sum(feat, dim=1, keepdim=True))
        tmp = torch.cat(tmp, dim=1)  # [B, C_in, H, W]
        # stage 2: weight tmp with the cross-channel kernels in `cc`
        out = []
        for i in range(len(cc)):
            weight = cc[i].unsqueeze(-1).unsqueeze(-1)  # [B, C_in, 1, 1]
            out.append(torch.sum(tmp * weight, dim=1, keepdim=True))
        return torch.cat(out, dim=1)  # [B, C_out, H, W]

    def _compose_feat(self, feat: torch.FloatTensor):
        # stack the 3x3 neighborhood of every pixel along the channel dim
        H, W = feat.shape[2:]
        output = [feat]  # center
        output.append(self.pad_left_top(feat)[:, :, :H, :W])      # left-top
        output.append(self.pad_center_top(feat)[:, :, :H, :])     # center-top
        output.append(self.pad_right_top(feat)[:, :, :H, 1:])     # right-top
        output.append(self.pad_left_middle(feat)[:, :, :, :W])    # left-middle
        output.append(self.pad_right_middle(feat)[:, :, :, 1:])   # right-middle
        output.append(self.pad_left_bottom(feat)[:, :, 1:, :W])   # left-bottom
        output.append(self.pad_center_bottom(feat)[:, :, 1:, :])  # center-bottom
        output.append(self.pad_right_bottom(feat)[:, :, 1:, 1:])  # right-bottom
        return torch.cat(output, dim=1)  # [B, 3*3, H, W]
```
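The module above takes the kernel lists `cw` and `cc` as inputs, so it needs companion kernel-generating layers. For completeness, here is a minimal sketch of how they could be produced from the guidance features; the class name and layer choices are my own guess, not from the paper.

```python
import torch
import torch.nn as nn

class KernelGenerator(nn.Module):
    """Produce the `cw`/`cc` kernel lists expected by `_GuidedConv`.

    guide_ch: channels of the guidance feature map
    c_in:     channels of the source (depth) feature map
    c_out:    output channels of the guided convolution
    """
    def __init__(self, guide_ch, c_in, c_out):
        super().__init__()
        self.c_in, self.c_out = c_in, c_out
        # one 3x3 kernel field per input channel (spatially-variant stage)
        self.spatial = nn.Conv2d(guide_ch, c_in * 9, 3, padding=1)
        # one C_in-vector per output channel, from globally pooled guidance
        self.channel = nn.Linear(guide_ch, c_out * c_in)

    def forward(self, guide):
        B, _, H, W = guide.shape
        cw = list(torch.chunk(self.spatial(guide), self.c_in, dim=1))  # C_in x [B, 9, H, W]
        pooled = guide.mean(dim=(2, 3))                                # [B, guide_ch]
        cc = torch.chunk(self.channel(pooled).view(B, self.c_out, self.c_in),
                         self.c_out, dim=1)
        cc = [k.squeeze(1) for k in cc]                                # C_out x [B, C_in]
        return cw, cc
```

A forward pass would then look like `cw, cc = gen(guide); out = guided_conv(x, cw, cc)`.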
Hi,
In your paper I noticed that you have done some ablation studies on the Virtual KITTI dataset.
You mentioned that you used the sparse depth masks corresponding to the frames in the original KITTI dataset to generate sparse depth maps for Virtual KITTI.
It would be great if you could share the code for this matching of frames between KITTI and Virtual KITTI.
Thanks,
Hardik
Thanks to the author for the good work and the open-source code!
The core code is written in CUDA, which is difficult to understand for people who are not familiar with CUDA programming. Could someone provide a PyTorch version of this code?
Thanks!
With the default settings, training on KITTI with 2 V100 GPUs, how long does it take?
Hi, recently I've been going through your great work. I would like to train the network, but with less KITTI raw data; is that possible? As we all know, the KITTI raw dataset is very large, and of its 5 recording dates I have 3: 2011_09_26, 2011_09_28, and 2011_09_30. If it is possible, please guide me on how to reduce the KITTI raw data for training. Thanks in advance!
Thank you for your interesting work and planning to release your code!
I understand that you plan to release the code after getting a review of the TIP submission.
Meanwhile, I'm trying to implement your method from the descriptions in the paper, but some of the important information needed to fully understand the method is missing. Could you help me by answering my questions below?
Thank you.
Hi,
Thanks for releasing the code.
I find that in the code you feed both the original maps and their flipped versions into the network to predict the dense depth map. I wonder whether the submitted results were obtained with the same strategy.
Best