chunhuanlin / deform_conv_pytorch


279.0 279.0 52.0 19 KB

PyTorch Implementation of Deformable Convolution

Python 47.98% Jupyter Notebook 52.02%

cnn deformable-convolutional pytorch


deform_conv_pytorch's Issues

ValueError: cannot reshape array of size 18 into shape (1,64,1,1)

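Hello. I may not fully understand this code and have a few questions; I hope you have time to answer them, thank you very much. The code contains self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1). Why does the output have to be exactly 18 channels for a 128-channel input before the program runs correctly? When I changed the number of output channels to, for example, 64, I got the following error:
ValueError: cannot reshape array of size 18 into shape (1,64,1,1)
After reading the paper I think this should be related to the convolution kernel, but I changed the kernel while keeping the output feature map size the same, and the problem remained, so I really do not understand it. I like this open-source code a lot; it is very concise and I am studying it. I hope you can help clear up my confusion.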
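In the deformable convolution paper, the offset field carries one (x, y) pair per kernel sampling location, so a k x k kernel needs 2 * k * k offset channels regardless of how many feature channels go in. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of this repository.

```python
import math


def offset_channels(kernel_size):
    """Number of offset channels for a square kernel: one (x, y) pair per tap."""
    num_points = kernel_size * kernel_size  # N = |R| in the paper's notation
    return 2 * num_points


def kernel_size_from_offset_channels(channels):
    """Inverse check: which square kernel, if any, matches this channel count?"""
    num_points, remainder = divmod(channels, 2)
    if remainder:
        raise ValueError(f"{channels} offset channels cannot be split into (x, y) pairs")
    kernel_size = math.isqrt(num_points)
    if kernel_size * kernel_size != num_points:
        raise ValueError(f"{channels} offset channels do not match any square kernel")
    return kernel_size


print(offset_channels(3))                    # channels needed for a 3x3 kernel
print(kernel_size_from_offset_channels(18))  # kernel size implied by 18 channels
```

Under this reading, 64 offset channels would imply 32 sampling points, which is not a square kernel, so a mismatch with a 3x3 deformable layer is expected.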

Why use a mask in the zero-padding area?

In the padding area, the position p is calculated as p = p*(1-mask) + floor_p*mask. On the left side, the bilinear-interpolated result is then always 0, but on the right side it is not always 0. This seems a little strange.

ValueError: cannot reshape array of size 18 into shape (1,128,1,1)

You compute N = offset.size(1) // 2. At that layer offset.size(1) is equal to 128, while N is supposed to be the kernel size *2, meaning N=|R|.

And I haven't changed anything in the code...

Thanks.

padded offset

At https://github.com/ChunhuanLin/deform_conv_pytorch/blob/master/deform_conv.py#L113, are you sure that multiplying by padded_w is necessary? Doesn't that push the index beyond the size of x?
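One way to reason about this question, assuming the linked line flattens a (row, column) position into a single index over a padded feature map that has been reshaped to length padded_h * padded_w: row-major flattening multiplies the row by the width, and the largest valid position still maps inside the flattened length. The helper names below are hypothetical.

```python
def flatten_index(row, col, width):
    """Row-major index of (row, col) in a map with the given width."""
    return row * width + col


def unflatten_index(index, width):
    """Inverse of flatten_index: recover (row, col)."""
    return divmod(index, width)


padded_h, padded_w = 30, 30  # e.g. a 28x28 map with padding=1 on each side

# The bottom-right position maps to the last element of the flattened map.
largest = flatten_index(padded_h - 1, padded_w - 1, padded_w)
print(largest, padded_h * padded_w - 1)

# Round trip for an arbitrary in-range position.
print(unflatten_index(flatten_index(12, 7, padded_w), padded_w))
```

The index only leaves the valid range if the row or column itself is out of range before flattening, which is why clamping the coordinates first matters.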

report a bug

The padding in DeformConv2D doesn't work, because the shape of x_offset in function _reshape_x_offset is always h*ks, w*ks.

change train loader in demo.py

    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('/home/chlin/data/dataset/MNIST', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.batch_size, shuffle=True, **kwargs)

change the hard coded file path to ./
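A minimal sketch of one way to avoid the hard-coded path, assuming the demo already parses its options with argparse (as the args.batch_size reference suggests). The --data-root flag and its ./data default are hypothetical names chosen for illustration; they are not options of the actual demo.py.

```python
import argparse
from pathlib import Path


def build_parser():
    """Argument parser with a configurable dataset location."""
    parser = argparse.ArgumentParser(description="MNIST demo options (sketch)")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument(
        "--data-root",
        type=Path,
        default=Path("./data"),  # relative to the working directory
        help="directory where the dataset is stored or downloaded",
    )
    return parser


# Passing an explicit list keeps the example independent of the real command line.
args = build_parser().parse_args(["--data-root", "./mnist"])
print(args.data_root, args.batch_size)
```

The parsed value could then be passed as str(args.data_root) in place of the literal path in the datasets.MNIST call.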

Visualize offsets

Hi, will you update the code to visualize the offsets?

Some uncertainty about the demo

First, thank you for your great work.
According to the paper, if the number of input feature maps is N, the offset map should have 2N channels. But in your demo code I found the following:

```python
import torch.nn as nn
import torch.nn.functional as F
from deform_conv import DeformConv2D  # defined in this repository


class DeformNet(nn.Module):
    def __init__(self):
        super(DeformNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)

        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)

        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)

        # 18 offset channels predicted from the 128-channel feature map
        self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1)
        self.conv4 = DeformConv2D(128, 128, kernel_size=3, padding=1)
        self.bn4 = nn.BatchNorm2d(128)

        self.classifier = nn.Linear(128, 10)

    def forward(self, x):
        # convs
        x = F.relu(self.conv1(x))
        x = self.bn1(x)
        x = F.relu(self.conv2(x))
        x = self.bn2(x)
        x = F.relu(self.conv3(x))
        x = self.bn3(x)

        # deformable convolution
        offsets = self.offsets(x)
        x = F.relu(self.conv4(x, offsets))
        x = self.bn4(x)

        x = F.avg_pool2d(x, kernel_size=28, stride=1).view(x.size(0), -1)
        x = self.classifier(x)

        return F.log_softmax(x, dim=1)
```

Does this mean the deformable convolution uses 18 offset maps for 128 input feature maps?
And is that right?

Deformable ROI pooling layer

Hi,

Thank you for this great work! Any plans to add deformable ROI pooling layer?

Thank you

CUDA memory error when performing evaluation

I am running your code in Python on a Titan X. After training completed, I got the error below. I am not sure whether this error comes from a GPU issue or not. How can I fix it? Thanks

Train Epoch: 1 [59520/60000 (99%)]	Loss: 0.255778
Train Epoch: 1 [59840/60000 (100%)]	Loss: 0.313443
Spends 9.129980778694152s for each training epoch
demo.py:173: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  data, target = Variable(data, volatile=True), Variable(target)
THCudaCheck FAIL file=/home/john/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "demo.py", line 190, in <module>
    test()
  File "demo.py", line 174, in test
    output = model(data)
  File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 371, in __call__
    result = self.forward(*input, **kwargs)
  File "demo.py", line 82, in forward
    x = F.relu(self.conv4(x, offsets))
  File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 371, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/deform_conv_pytorch/deform_conv.py", line 59, in forward
    x_q_lt = self._get_x_q(x, q_lt, N)
  File "/home/john/deform_conv_pytorch/deform_conv.py", line 117, in _get_x_q
    x_offset = x.gather(dim=-1, index=index).contiguous().view(b, c, h, w, N)
RuntimeError: cuda runtime error (2) : out of memory at /home/john/pytorch/aten/src/THC/generic/THCStorage.cu:58
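The warning in the log above already notes that volatile no longer has any effect and recommends `with torch.no_grad():`, so evaluation is probably running with autograd bookkeeping enabled. Separately, the failing line reshapes the gathered values to (b, c, h, w, N), which grows quickly with the batch size. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: float32 storage, 128 channels at 28x28 with N = 9 as in the demo network, one such tensor per bilinear corner (inferred from the x_q_lt name), and hypothetical batch sizes, since the issue does not state the test batch size.

```python
def gathered_tensor_bytes(batch, channels, height, width, num_points,
                          bytes_per_element=4):
    """Size in bytes of one tensor shaped (batch, channels, height, width, num_points)."""
    return batch * channels * height * width * num_points * bytes_per_element


GIB = 1024 ** 3

# Hypothetical batch sizes, for comparison only.
for batch in (1000, 32):
    one_corner = gathered_tensor_bytes(batch, 128, 28, 28, 9)
    print(f"batch={batch}: {one_corner / GIB:.2f} GiB per corner, "
          f"{4 * one_corner / GIB:.2f} GiB for four corners")
```

If the numbers for the actual test batch size approach the GPU's memory, lowering that batch size and wrapping the evaluation loop in `torch.no_grad()` are the two obvious things to try.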

RuntimeError: Invalid index in gather at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:657

Thanks for sharing.
When I change the input size to, for example, 56x56, I run into this error, and I don't know why.

I printed index.shape; it is torch.Size([8, 32, 28224]).

Please help me!!

Only 1 deformable convolution

    self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
    self.bn2 = nn.BatchNorm2d(64)

    self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
    self.bn3 = nn.BatchNorm2d(128)

    self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1)
    self.conv4 = DeformConv2D(128, 128, kernel_size=3, padding=1)
    self.bn4 = nn.BatchNorm2d(128)

    self.classifier = nn.Linear(128, 10)

Why are you only using 1 DefConv layer? Why not make all of them deformable? If there is a reason then why the last layer only?

What is the offset value along the input for forward() in DeformConv2D()

Thanks for sharing the repo. I assumed the deformable layer could be used as a plug-and-play replacement (right?)

Replacing one of the nn.Conv2d() layers with DeformConv2D() produced this error:

result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'offset'

How to calculate/pass this parameter with each input?

meaning of q_lt, g_lt and x_q_lt

Thanks for sharing. Can you explain the meaning of q_lt, g_lt and x_q_lt?
Thanks~
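One plausible reading of these names, assuming they follow the usual bilinear interpolation pattern: q_* are the integer corner positions around a fractional sampling point p (lt = left-top, rb = right-bottom), g_* are the bilinear weights of those corners, and x_q_* are the input values read at the corners. The pure-Python sketch below reuses that naming for a single point; it is an illustration under that assumption, not the repository's tensor code.

```python
import math


def bilinear_sample(image, px, py):
    """Sample a 2-D list `image` at fractional position (px, py); zero outside."""
    height, width = len(image), len(image[0])

    def value_at(x, y):
        # Positions outside the image contribute zero, like zero padding.
        return image[x][y] if 0 <= x < height and 0 <= y < width else 0.0

    # q_lt / q_rb: integer corners surrounding p.
    q_lt = (math.floor(px), math.floor(py))
    q_rb = (q_lt[0] + 1, q_lt[1] + 1)
    q_lb = (q_lt[0], q_rb[1])  # mixed corners reuse one coordinate from each
    q_rt = (q_rb[0], q_lt[1])

    # g_*: bilinear weight of each corner, 1 - distance along each axis.
    def weight(q):
        return (1 - abs(px - q[0])) * (1 - abs(py - q[1]))

    corners = (q_lt, q_rb, q_lb, q_rt)
    # x_q_*: value at each corner; the result is the weighted sum.
    return sum(weight(q) * value_at(*q) for q in corners)


image = [[1.0, 2.0],
         [3.0, 4.0]]
print(bilinear_sample(image, 0.5, 0.5))  # centre of the four pixels
print(bilinear_sample(image, 0.0, 0.0))  # exactly on the top-left pixel
```

When p lands exactly on an integer position, the left-top weight becomes 1 and the other three become 0, so the sample equals the pixel value itself.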

Different interpretations of def_conv

Thinking about deformable convolutions, some things I found different interpretations of:
-) Does the offset change for each individual k x k kernel or is it fixed for the whole image? Would this mean that pixels could potentially overlap?
-) Is the same offset then applied for each input layer, ie. AxBxC where C might be any number of filters.
-) During inference, keeping the offset generating layers in the network, each k x k kernel would experience an individual offset, or would the offset be the same for the whole image?
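As a way to make these questions concrete, here is a 1-D toy version of the operation as described in the deformable convolution paper: each output position has its own offset per kernel tap, and the same offsets are applied to every input channel. It is a simplified sketch (one output channel, linear instead of bilinear interpolation, hypothetical function names), not this repository's implementation. In a real network the offsets would come from a convolution over the input, at inference as well as in training.

```python
import math


def linear_sample(signal, pos):
    """Linearly interpolate a 1-D signal at a fractional position; zero outside."""
    left = math.floor(pos)
    frac = pos - left

    def at(i):
        return signal[i] if 0 <= i < len(signal) else 0.0

    return (1 - frac) * at(left) + frac * at(left + 1)


def deform_conv1d(channels, weights, offsets):
    """channels: C signals of length L; weights: [C][K]; offsets: [L][K].

    offsets[i][t] shifts tap t at output position i and is shared by all channels.
    """
    length = len(channels[0])
    kernel_size = len(weights[0])
    half = kernel_size // 2
    output = []
    for i in range(length):
        total = 0.0
        for c, signal in enumerate(channels):
            for t in range(kernel_size):
                pos = i + (t - half) + offsets[i][t]  # regular grid + learned shift
                total += weights[c][t] * linear_sample(signal, pos)
        output.append(total)
    return output


signal = [[1.0, 2.0, 3.0, 4.0]]
kernel = [[1.0, 1.0, 1.0]]
zero_offsets = [[0.0] * 3 for _ in range(4)]
print(deform_conv1d(signal, kernel, zero_offsets))  # same as an ordinary convolution
```

With all offsets at zero the result matches a standard zero-padded convolution; sampling positions of neighbouring outputs may overlap once the offsets are non-zero.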

Why did you permute p = p.contiguous().permute(0, 2, 3, 1)?

What is the reason for permuting the dimensions of tensor p?

typo

deform_conv_pytorch/deform_conv.py

Line 13 in 5545b18

    
           self.conv_kernel = nn.Conv2d(inc, outc, kernel_size=kernel_size, stride=kernel_size, bias=bias)

I think the stride here should be 1.

Paper benchmark?

Can it reach the benchmark results reported in the paper?

A problem about dilation and stride settings

Hi, great job. But I have some questions regarding the dilation and stride settings.
If I want to add dilation to your deformable conv, is it right that I only need to change the settings of nn.Conv2d in conv_kernel and conv_offset?

Actually, the problem is your stride setting: I do not understand why you set the stride to the kernel size.
So I'm afraid that adding dilation in a straightforward way would cause problems.
I saw someone else discussing this issue in #9, but I do not really get your point : ((

Stride setting error in DeformConv2D?

In the line
self.conv_kernel = nn.Conv2d(inc, outc, kernel_size=kernel_size, stride=kernel_size, bias=bias)
why do you set the stride the same as kernel_size? Is it an error?
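A possible explanation, based on the bug report earlier on this page that the output of _reshape_x_offset always has shape h*ks, w*ks: if the sampled values for each output pixel are laid out as their own ks x ks block, a convolution with kernel_size=ks and stride=ks reads exactly one block per output pixel and returns an h x w map. The sketch below checks only the size arithmetic with the standard convolution output-size formula; the interpretation of the layout is an assumption.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Standard convolution output-size formula (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


h, ks = 28, 3
expanded = h * ks  # each output pixel owns a ks x ks block of sampled values

# stride == kernel_size: one window per block, so the original size is recovered.
print(conv_output_size(expanded, ks, stride=ks))

# stride == 1: windows straddle neighbouring blocks and the size no longer matches.
print(conv_output_size(expanded, ks, stride=1))
```

Under this reading the stride is part of the layout trick rather than the stride of the deformable convolution itself.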

bias for offset and deformable conv

Hi, Mr Lin, you have done excellent work! I have a question after reading many implementations of deformable CNNs, including the official MXNet version. Why does the offset conv have a bias term, as in https://github.com/msracver/Deformable-ConvNets/blob/2b47f673e965701751109db07571ff61d827632d/deeplab/symbols/resnet_v1_101_deeplab_dcn.py#L679, while the deformable conv that follows it has no bias term, as in https://github.com/msracver/Deformable-ConvNets/blob/2b47f673e965701751109db07571ff61d827632d/deeplab/symbols/resnet_v1_101_deeplab_dcn.py#L683 ?