
guidenet's People

Contributors

kakaxi314

guidenet's Issues

Hope to get your reply

Hello, thank you very much for your contribution. When I test your model, the script only reports the loss. How can I obtain the RMSE results reported in the paper?

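For reference, a minimal sketch of the standard KITTI depth-completion RMSE, assuming predictions and ground truth are dense depth maps in millimetres with zero marking invalid ground-truth pixels:

import torch

def rmse_mm(pred, gt):
    # Evaluate only where the ground truth is valid (non-zero), as in the KITTI benchmark.
    valid = gt > 0
    return torch.sqrt(torch.mean((pred[valid] - gt[valid]) ** 2))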

About the NYU training and evaluation

Hello! I notice that the released code doesn't include the training and evaluation code for the NYU Depth V2 dataset. How can those results be reproduced? Can anyone explain? Thank you very much!
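
Not an official answer, but most depth-completion works evaluate on NYU Depth V2 by sampling a fixed number of random valid points (commonly 500) from the dense depth map as the sparse input. A minimal sketch of that sampling step, with num_samples treated as an assumption:

import torch

def sample_sparse_depth(dense_depth, num_samples=500):
    # dense_depth: [H, W] ground-truth depth map; keep only num_samples random valid pixels.
    valid_idx = torch.nonzero(dense_depth > 0)
    choice = torch.randperm(valid_idx.shape[0])[:num_samples]
    rows, cols = valid_idx[choice, 0], valid_idx[choice, 1]
    sparse = torch.zeros_like(dense_depth)
    sparse[rows, cols] = dense_depth[rows, cols]
    return sparse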

Question about the kernel design with multi-batch input

Thank you for your nice work.
Since the code is not yet open-sourced, I would like to write down my question about your kernel design.

In the paper, given the image features, you use them to generate the convolution weights.
My question is: when the batch size of the input images is larger than 1, how do you compute the convolution?

In PyTorch, a convolution kernel has shape [out_channels, in_channels, kernel_size, kernel_size], which means the same kernel is applied to every sample in a batched tensor (e.g. img.shape = [batch, channel, height, width]).
So I guess that you either split the batched tensor into single-sample tensors for computation, or re-implemented the convolution operation yourself?
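
Not the authors' implementation, but one batch-friendly way to apply per-sample, per-pixel kernels in plain PyTorch is to unfold the input into its K*K neighbourhoods and weight them with the generated kernels. A minimal sketch under that assumption:

import torch
import torch.nn.functional as F

def spatially_variant_conv(x, kernels, K=3):
    # x:       [B, C, H, W] input features
    # kernels: [B, C*K*K, H, W] per-pixel weights predicted from the guidance features
    B, C, H, W = x.shape
    # Gather every pixel's K*K neighbourhood and weight it, independently per sample and pixel.
    patches = F.unfold(x, kernel_size=K, padding=K // 2).view(B, C, K * K, H, W)
    weights = kernels.view(B, C, K * K, H, W)
    return (patches * weights).sum(dim=2)  # [B, C, H, W]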

Compile error

Thank you for your great project.
When I compile this project, it shows:
unable to execute ':/usr/local/cuda/bin/nvcc': No such file or directory
error: command ':/usr/local/cuda/bin/nvcc' failed with exit status 1

My environment is: Ubuntu 18.04, Python 3.6.9, PyTorch 1.4.0, CUDA 10.0. Other projects run fine on the same platform.
I don't know why this happens; could you share your environment settings, please? Thank you very much!
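
Not a confirmed fix, but the stray leading colon in ':/usr/local/cuda/bin/nvcc' suggests the compiler path is being assembled from an environment variable that starts with an extra ':' (or contains an empty component). A quick way to check which CUDA root PyTorch's extension builder resolves:

# Diagnostic sketch: print the CUDA root used by torch.utils.cpp_extension when building the extension.
from torch.utils.cpp_extension import CUDA_HOME
print(CUDA_HOME)  # should be something like '/usr/local/cuda', without any leading ':'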

Help for my GuidedConv Implementation

Dear author,
I tried to reimplement your GuidedConv based on the paper,
but I found the training process to be very unstable.
Here is my PyTorch implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F


class _GuideConv(nn.Module):
    def __init__(self, img_in, sparse_in, sparse_out, K=3):
        super(_GuideConv, self).__init__()
        # KGL: kernel generating layer
        self.conv1 = nn.Conv2d(img_in, K*K*sparse_in, kernel_size=3,
                               stride=1, padding=1, groups=1, bias=False)
        self.bn1 = nn.BatchNorm2d(K*K*sparse_in)

        self.fc = nn.Linear(img_in, sparse_in * sparse_out, bias=False)
        self.bn2 = nn.BatchNorm2d(sparse_out)

        self.img_in = img_in
        self.sparse_in = sparse_in
        self.sparse_out = sparse_out

    def forward(self, G, S):
        '''
        G: input guidance
        S: input source feature
        '''
        # spatially-variant
        W1 = self.conv1(G) # [B,Cin*K*K,H,W]
        W1 = self.bn1(W1)
        depths = torch.chunk(S, self.sparse_in, 1) # Cin*[B,1,H,W]
        kernels = torch.chunk(W1, self.sparse_in, 1) # Cin*[B,K*K,H,W]
        S1 = []
        for i in range(self.sparse_in):
            S1.append(torch.sum(depths[i]*kernels[i], 1, keepdim=True))
        S1 = torch.cat(S1, 1) # [B,Cin,H,W]

        # cross-channel conv
        W2 = F.adaptive_avg_pool2d(G, (1, 1))
        B = W2.size(0)
        W2 = W2.reshape(B, -1) # (b,img_in)
        W2 = self.fc(W2)
        W2 = W2.view([B, self.sparse_out, self.sparse_in])

        depths = torch.chunk(S1, B, 0) # B*[1,Cin,H,W]
        kernels = torch.chunk(W2, B, 0) # B*[1,Cout,Cin]
        S2 = []
        for i in range(B):
            weight = kernels[i][0].unsqueeze(-1).unsqueeze(-1) # [Cout,Cin,1,1]
            S2.append(F.conv2d(depths[i], weight, bias=None, stride=1, padding=0))
        S2 = torch.cat(S2, 0) # [B,Cout,H,W]
        S2 = F.relu(self.bn2(S2))

        return S2

Could you give me some advice, or just open-source this part of the code? 🎉

Hi, I want to ask some questions

How effective is your method on indoor datasets? Can it be used to fill the holes left by depth cameras? When will your code be released?

A naive GuidedConv implementation

I guess the source code for this repo will never be released, and since people are confused by the "Guided Conv Module" in the paper, I'm going to share my naive PyTorch implementation, based on the CSPN code.

Note: this implementation is based on my own understanding of the paper, so I'm not sure it's correct. Also, I got worse results with this module than with the ordinary concat/add fusion method.

Let me know if you have any questions or suggestions.

import torch
import torch.nn as nn


class _GuidedConv(nn.Module):
    def __init__(self):
        super(_GuidedConv, self).__init__()

        self.pad_left_top = nn.ZeroPad2d((1, 0, 1, 0))
        self.pad_center_top = nn.ZeroPad2d((0, 0, 1, 0))
        self.pad_right_top = nn.ZeroPad2d((0, 1, 1, 0))
        self.pad_left_middle = nn.ZeroPad2d((1, 0, 0, 0))
        self.pad_right_middle = nn.ZeroPad2d((0, 1, 0, 0))
        self.pad_left_bottom = nn.ZeroPad2d((1, 0, 0, 1))
        self.pad_center_bottom = nn.ZeroPad2d((0, 0, 0, 1))
        self.pad_right_bottom = nn.ZeroPad2d((0, 1, 0, 1))

    def forward(self, x, cw: list, cc: list):
        """
        `x`: input feature maps with size `[B, C_in, H, W]`  
        `cw`: `C_in` channel-wise kernels, each with size [B, 3*3, H, W]  
        `cc`: `C_out` cross-channel 1*1 kernels, each with size [B, C_in]
        """

        # stage-1: weight x with kernels in `cw`
        tmp = []
        for i in range(len(cw)):
            feat = self._compose_feat(x[:, i:i+1, :, :])  # [B, 1, H, W] slice, no in-place ops on views
            feat *= cw[i]
            tmp.append(torch.sum(feat, dim=1, keepdim=True))
        tmp = torch.cat(tmp, dim=1)  # [B, C_in, H, W]

        # stage-2: weight tmp with kernels in `cc`
        out = []
        for i in range(len(cc)):
            weight = cc[i].unsqueeze(-1).unsqueeze(-1)  # [B, C_in, 1, 1]; avoid in-place ops that mutate the caller's kernels
            out.append(torch.sum(tmp * weight, dim=1, keepdim=True))

        return torch.cat(out, dim=1)  # [B, C_out, H, W]

    def _compose_feat(self, feat: torch.FloatTensor):
        [H, W] = feat.shape[2:]
        output = [feat]

        # left-top
        output.append(self.pad_left_top(feat)[:, :, :H, :W])
        # center-top
        output.append(self.pad_center_top(feat)[:, :, :H, :])
        # right-top
        output.append(self.pad_right_top(feat)[:, :, :H, 1:])
        # left-middle
        output.append(self.pad_left_middle(feat)[:, :, :, :W])
        # right-middle
        output.append(self.pad_right_middle(feat)[:, :, :, 1:])
        # left-bottom
        output.append(self.pad_left_bottom(feat)[:, :, 1:, :W])
        # center-bottom
        output.append(self.pad_center_bottom(feat)[:, :, 1:, :])
        # right-bottom
        output.append(self.pad_right_bottom(feat)[:, :, 1:, 1:])
        # concat
        output = torch.cat(output, dim=1)  # [B, 3*3, H, W]

        return output
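
A minimal usage sketch for the module above, with randomly generated stand-ins for the kernel tensors (in a real model, cw and cc would come from kernel-generating layers driven by the guidance image):

import torch

B, C_in, C_out, H, W = 2, 32, 64, 64, 304  # hypothetical sizes
x = torch.randn(B, C_in, H, W)
cw = [torch.randn(B, 9, H, W) for _ in range(C_in)]  # one 3x3 spatially-variant kernel map per input channel
cc = [torch.randn(B, C_in) for _ in range(C_out)]    # one cross-channel 1x1 kernel per output channel

module = _GuidedConv()
out = module(x, cw, cc)
print(out.shape)  # torch.Size([2, 64, 64, 304])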

Dataloader for Virtual KITTI

Hi,
In your paper I noticed that you have done some ablation studies on the Virtual KITTI dataset.
You mentioned that you used the sparse depth masks from the corresponding frames in the original KITTI dataset to generate sparse depth maps for Virtual KITTI.
It would be great if you could share the code for this matching of frames between KITTI and Virtual KITTI.

Thanks,
Hardik
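
For reference, a minimal sketch of the mask-transfer idea described above, once a real KITTI frame has been paired with a Virtual KITTI frame (the pairing itself is left as an input, and the Virtual KITTI depth encoding in centimetres is an assumption):

import numpy as np
from PIL import Image

def transfer_sparse_mask(kitti_sparse_png, vkitti_depth_png):
    # KITTI depth-completion sparse maps: uint16 PNG, metres = value / 256, 0 = no LiDAR return.
    kitti_sparse = np.array(Image.open(kitti_sparse_png), dtype=np.float32) / 256.0
    # Virtual KITTI depth (assumed encoding): uint16 PNG with depth in centimetres.
    vkitti_depth = np.array(Image.open(vkitti_depth_png), dtype=np.float32) / 100.0
    # Keep Virtual KITTI depth only at pixels where the matched real frame had a measurement
    # (the two maps are assumed to have the same resolution).
    return np.where(kitti_sparse > 0, vkitti_depth, 0.0)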

Can someone convert the CUDA code into PyTorch?

Thanks to the author for the good work and open-source code!
Since the core code is written in CUDA, it is difficult to understand for people who are not familiar with CUDA programming. Can someone provide a PyTorch version of that code?
Thanks!

Can I reduce the KITTI raw data for training?

Hi, I recently went through your great work and would like to train the network with less KITTI raw data. Is that possible? As everyone knows, the KITTI raw dataset is very large, and out of the 5 recording dates I only have 3: 2011_09_26, 2011_09_28, and 2011_09_30. If so, please guide me on how to reduce the KITTI raw data for training. Thanks in advance!
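
Not an official recommendation, but one simple way is to filter the training file list down to the drives whose date prefix you actually have. A minimal sketch, with the directory layout treated as an assumption:

import glob
import os

AVAILABLE_DATES = ("2011_09_26", "2011_09_28", "2011_09_30")
# Hypothetical layout: one folder per drive under the depth-completion train split.
all_drives = sorted(glob.glob(os.path.join("kitti_depth", "train", "*_drive_*_sync")))
kept_drives = [d for d in all_drives if os.path.basename(d).startswith(AVAILABLE_DATES)]
print(f"keeping {len(kept_drives)} of {len(all_drives)} drives")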

Implementation details (layer and channel numbers)

Thank you for your interesting work and for planning to release your code!

I understand that you plan to release the code after the review of the TIP submission.
Meanwhile, I'm trying to implement your method from the descriptions in the paper, but some important information needed to fully understand the method is missing. Could you help me by answering the questions below?

  • Layer numbers: I suppose the overall network architecture is exactly the one illustrated in Figure 3 in terms of layer numbers, i.e., single conv (+BN+ReLU) layers at the input and output, and 5 successive blocks of ResBlock-ResBlock (with down-sampling in the first ResBlock) in the encoders; see the sketch below. Is that right?
  • Places of down-sampling: Where does down-sampling happen? I suppose there is always down-sampling at the first internal layer of each block of two successive ResBlocks. Do the first convolution layers (gray layers) also down-sample?
  • Channel numbers: Could you provide the channel number of each layer in Figure 3? It would be very helpful if you could write the numbers directly beside the layers in the figure and upload it here as an image (you can upload an image by copying it to the clipboard and pasting it into the text box).

Thank you.
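
For context, a minimal sketch of the encoder-stage structure the first question supposes (two standard ResBlocks, with the stride-2 down-sampling in the first); this only illustrates the supposition and is not the paper's confirmed design:

import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, cin, cout, stride=1):
        super(ResBlock, self).__init__()
        self.conv1 = nn.Conv2d(cin, cout, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(cout)
        self.conv2 = nn.Conv2d(cout, cout, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(cout)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection on the skip path when the shape changes.
        self.down = None
        if stride != 1 or cin != cout:
            self.down = nn.Sequential(nn.Conv2d(cin, cout, 1, stride=stride, bias=False),
                                      nn.BatchNorm2d(cout))

    def forward(self, x):
        identity = x if self.down is None else self.down(x)
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + identity)

def encoder_stage(cin, cout):
    # "ResBlock-ResBlock", with down-sampling in the first block, as the question supposes.
    return nn.Sequential(ResBlock(cin, cout, stride=2), ResBlock(cout, cout, stride=1))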

Flip in test

Hi,
Thanks for releasing the code.
I noticed that in the code you feed both the original maps and horizontally flipped maps into the network to predict the dense depth map. I wonder whether the submitted benchmark results were obtained with the same strategy.
Best
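
For reference, a minimal sketch of the test-time flip averaging the question refers to; the model signature (RGB plus sparse depth) is a placeholder:

import torch

@torch.no_grad()
def predict_with_flip(model, rgb, sparse):
    # Average the prediction on the original input with the un-flipped prediction on the flipped input.
    pred = model(rgb, sparse)
    pred_flipped = model(torch.flip(rgb, dims=[3]), torch.flip(sparse, dims=[3]))
    return 0.5 * (pred + torch.flip(pred_flipped, dims=[3]))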
