chunhuanlin / deform_conv_pytorch
PyTorch Implementation of Deformable Convolution
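Hello, I don't fully understand this code and have a few questions; I hope you can find time to answer them. Thank you very much. The code contains self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1). Why must the output have exactly 18 channels (with 128 input channels) for the program to run? When I change the number of output channels to, for example, 64, I get the following error:
ValueError: cannot reshape array of size 18 into shape (1,64,1,1)
After reading the paper, I think this is related to the convolution kernel. However, I changed the kernel while keeping the output feature map size the same, and the problem persists, so I am quite confused. I really like this open-source code; it is very concise and I am learning from it. I hope you can help clear this up.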
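A likely explanation, going by the deformable convolution paper: every sampling point of a k×k kernel needs one offset along each of the two spatial axes, so the offset map needs 2·k·k channels, independent of the number of input channels. For a 3×3 kernel that is 18, which would explain why 64 fails. A minimal sketch of that arithmetic (the helper name `offset_channels` is hypothetical and not part of the repository):

```python
def offset_channels(kernel_size: int) -> int:
    """Channels the offset-predicting conv must output for a square kernel.

    Assumes one 2-D offset per kernel sampling point, as in the deformable
    convolution paper: N = kernel_size**2 points, two coordinates each.
    """
    num_points = kernel_size * kernel_size  # N = |R|
    return 2 * num_points


print(offset_channels(3))  # 18, matching nn.Conv2d(128, 18, ...) in the demo
print(offset_channels(5))  # 50
```

Under this reading, changing the kernel size of the offset conv itself does not help; what matters is the kernel size of the DeformConv2D layer that consumes the offsets.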
In the padding area, the pointer p is calculated as p = p*(1-mask) + floor_p*mask. On the left side, the bilinear interpolated result is then always 0, but on the right side it is not always 0. This seems a little strange.
You compute N = offset.size(1) // 2. At that layer offset.size(1) is equal to 128, but N is supposed to be the kernel size squared, i.e. N = |R|.
And I haven't changed anything in the code...
Thanks.
In code section https://github.com/ChunhuanLin/deform_conv_pytorch/blob/master/deform_conv.py#L113 are you sure multiplying with padded_w is necessary? Doesn't that extend the index beyond x's size?
The padding in DeformConv2D doesn't work, because the shape of x_offset in function _reshape_x_offset is always h*ks, w*ks.
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('/home/chlin/data/dataset/MNIST', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.batch_size, shuffle=True, **kwargs)
Please change the hard-coded file path to ./
Hi, will you update the code to visualize offsets?
First, thank you for your great work.
As stated in the paper, if the number of input feature maps is N, the offset map should be 2N. But I found the following in your demo code:
```python
import torch.nn as nn
import torch.nn.functional as F

from deform_conv import DeformConv2D


class DeformNet(nn.Module):
    def __init__(self):
        super(DeformNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        # Predicts the offsets consumed by the deformable conv below.
        self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1)
        self.conv4 = DeformConv2D(128, 128, kernel_size=3, padding=1)
        self.bn4 = nn.BatchNorm2d(128)
        self.classifier = nn.Linear(128, 10)

    def forward(self, x):
        # convs
        x = F.relu(self.conv1(x))
        x = self.bn1(x)
        x = F.relu(self.conv2(x))
        x = self.bn2(x)
        x = F.relu(self.conv3(x))
        x = self.bn3(x)
        # deformable convolution
        offsets = self.offsets(x)
        x = F.relu(self.conv4(x, offsets))
        x = self.bn4(x)
        x = F.avg_pool2d(x, kernel_size=28, stride=1).view(x.size(0), -1)
        x = self.classifier(x)
        return F.log_softmax(x, dim=1)
```
Does this mean that deform_conv uses 18 offset maps together with 128 input feature maps?
And is that right?
Hi,
Thank you for this great work! Any plans to add deformable ROI pooling layer?
Thank you
I am running your code in Python on a TitanX. After training completed, I got the error below. I am not sure whether this error comes from a GPU issue. How can I fix it? Thanks
Train Epoch: 1 [59520/60000 (99%)] Loss: 0.255778
Train Epoch: 1 [59840/60000 (100%)] Loss: 0.313443
Spends 9.129980778694152s for each training epoch
demo.py:173: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
data, target = Variable(data, volatile=True), Variable(target)
THCudaCheck FAIL file=/home/john/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "demo.py", line 190, in <module>
test()
File "demo.py", line 174, in test
output = model(data)
File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 371, in __call__
result = self.forward(*input, **kwargs)
File "demo.py", line 82, in forward
x = F.relu(self.conv4(x, offsets))
File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 371, in __call__
result = self.forward(*input, **kwargs)
File "/home/john/deform_conv_pytorch/deform_conv.py", line 59, in forward
x_q_lt = self._get_x_q(x, q_lt, N)
File "/home/john/deform_conv_pytorch/deform_conv.py", line 117, in _get_x_q
x_offset = x.gather(dim=-1, index=index).contiguous().view(b, c, h, w, N)
RuntimeError: cuda runtime error (2) : out of memory at /home/john/pytorch/aten/src/THC/generic/THCStorage.cu:58
Thanks for sharing.
When I change the input size to, for example, 56x56, I run into this problem too, and I don't know why.
Printing index.shape gives torch.Size([8, 32, 28224]).
Please help me!!
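The reported shape is consistent with the last dimension being h * w * N, where N = 9 is the number of 3×3 kernel points: 56 * 56 * 9 = 28224. The sketch below reproduces that arithmetic and estimates the size of one such index tensor. The helper name `gather_index_shape` is hypothetical, and the layout is an assumption inferred from the reported shape rather than taken from the repository.

```python
def gather_index_shape(batch, channels, height, width, kernel_size=3):
    """Shape of the flattened gather index for one bilinear corner.

    Assumption: the index covers every (location, kernel point) pair for
    every channel, which matches the shape reported for a 56x56 input.
    """
    num_points = kernel_size * kernel_size
    return (batch, channels, height * width * num_points)


shape = gather_index_shape(batch=8, channels=32, height=56, width=56)
print(shape)  # (8, 32, 28224)

elements = shape[0] * shape[1] * shape[2]
bytes_int64 = elements * 8  # torch.gather takes int64 indices
print(elements, round(bytes_int64 / 2**20, 1))  # element count, size in MiB
```

Memory therefore grows with batch size, channel count, input area and kernel points, and the forward pass appears to run a gather like this for each bilinear corner (the traceback above shows the one for q_lt). A smaller batch size may help, as may wrapping evaluation in `with torch.no_grad():` as the UserWarning in the log suggests.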
self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
self.bn2 = nn.BatchNorm2d(64)
self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
self.bn3 = nn.BatchNorm2d(128)
self.offsets = nn.Conv2d(128, 18, kernel_size=3, padding=1)
self.conv4 = DeformConv2D(128, 128, kernel_size=3, padding=1)
self.bn4 = nn.BatchNorm2d(128)
self.classifier = nn.Linear(128, 10)
Why are you using only one DefConv layer? Why not make all of them deformable? If there is a reason, why only the last layer?
Thanks for sharing the repo. I thought the deform layer could be used as plug and play (right?).
Replacing one of the nn.Conv2d() layers with DeformConv2D() produced this error:
result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'offset'
How do I calculate and pass this parameter for each input?
Thanks for sharing. Can you explain the meaning of q_lt, g_lt and x_q_lt?
thanks~
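One plausible reading of these names, judging from the call x_q_lt = self._get_x_q(x, q_lt, N) in the traceback above and from standard bilinear interpolation: q_lt is the integer "left-top" grid point next to the fractional sampling position p, x_q_lt is the input value at that point, and g_lt is its bilinear weight. The sketch below shows the idea on a single 2-D image in plain Python. The corner names other than q_lt, and the function `bilinear_sample` itself, are assumptions for illustration and not the repository's code.

```python
import math


def bilinear_sample(image, py, px):
    """Bilinearly interpolate a 2-D list `image` at fractional (py, px).

    Assumes all four neighbouring grid points lie inside the image.
    """
    # Integer neighbours of p: left-top and right-bottom corners.
    y0, x0 = math.floor(py), math.floor(px)  # q_lt
    y1, x1 = y0 + 1, x0 + 1                  # q_rb

    # Bilinear weights: the closer p is to a corner, the larger its weight.
    g_lt = (1 - (py - y0)) * (1 - (px - x0))
    g_rb = (1 - (y1 - py)) * (1 - (x1 - px))
    g_lb = (1 - (y1 - py)) * (1 - (px - x0))  # bottom row, left column
    g_rt = (1 - (py - y0)) * (1 - (x1 - px))  # top row, right column

    # Values at the four corners (x_q_lt and friends).
    x_q_lt, x_q_rb = image[y0][x0], image[y1][x1]
    x_q_lb, x_q_rt = image[y1][x0], image[y0][x1]

    return g_lt * x_q_lt + g_rb * x_q_rb + g_lb * x_q_lb + g_rt * x_q_rt


img = [[0.0, 1.0],
       [2.0, 3.0]]
print(bilinear_sample(img, 0.5, 0.5))  # 1.5, the mean of the four pixels
print(bilinear_sample(img, 0.0, 0.0))  # 0.0, exactly the left-top pixel
```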
Thinking about deformable convolutions, some things I found different interpretations of:
-) Does the offset change for each individual k x k kernel or is it fixed for the whole image? Would this mean that pixels could potentially overlap?
-) Is the same offset then applied to each input layer, i.e. AxBxC where C might be any number of filters?
-) During inference, keeping the offset generating layers in the network, each k x k kernel would experience an individual offset, or would the offset be the same for the whole image?
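As far as the paper and the demo above suggest, offsets are predicted per output location and per kernel point: the 18-channel offset map holds nine (dy, dx) pairs at every spatial position, and the same pairs are shared by all input channels. Because the offset conv stays in the network, the offsets are recomputed from each input at inference time as well. A small sketch of how the sampling positions for one output location could be formed (the function `sampling_positions` and the row-major ordering of kernel points are assumptions for illustration):

```python
def sampling_positions(i, j, offsets, kernel_size=3):
    """Sampling positions for the kernel centred at output location (i, j).

    `offsets` holds kernel_size**2 (dy, dx) pairs for this location only;
    every output location has its own list.
    """
    half = kernel_size // 2
    positions = []
    n = 0
    for r in range(-half, half + 1):
        for c in range(-half, half + 1):
            dy, dx = offsets[n]
            # regular grid position plus the learned, fractional offset
            positions.append((i + r + dy, j + c + dx))
            n += 1
    return positions


# Zero offsets reproduce the regular 3x3 grid around (5, 5).
regular = sampling_positions(5, 5, [(0.0, 0.0)] * 9)
print(regular[0], regular[4], regular[8])  # (4.0, 4.0) (5.0, 5.0) (6.0, 6.0)

# Non-zero offsets move each point independently, so points of neighbouring
# locations may overlap or coincide.
shifted = sampling_positions(5, 5, [(0.5, -0.25)] * 9)
print(shifted[0])  # (4.5, 3.75)
```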
I wonder what the reason is for permuting the index of tensor p?
deform_conv_pytorch/deform_conv.py
Line 13 in 5545b18
I think the stride here should be 1.
Can it reach the paper benchmark?
Hi, great job. But I have some questions regarding the dilation and stride settings.
If I want to add dilation to your deformable conv, is it right that I only need to change the settings of nn.Conv2d in conv_kernel and conv_offset?
Actually, the problem is with your stride setting: I do not understand why you set the stride to the kernel size.
So I'm afraid that adding dilation in a straightforward way would cause problems.
I saw someone else discussing this issue in #9, but I do not really get your point :((
In the line
self.conv_kernel = nn.Conv2d(inc, outc, kernel_size=kernel_size, stride=kernel_size, bias=bias)
why do you set the stride the same as kernel_size? Is it an error?
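A possible reason, inferred from the fact that _reshape_x_offset produces a tensor of size (h*ks, w*ks): the deformed samples are gathered beforehand and tiled so that each output location owns its own ks×ks block. A convolution with kernel_size=ks and stride=ks then visits every block exactly once, which equals a weighted sum over that location's samples. The plain-Python sketch below checks this equivalence for one channel. The function names and the exact block layout are assumptions for illustration, not the repository's code.

```python
def tile_samples(samples, ks):
    """Lay out samples[i][j][n] as a (h*ks) x (w*ks) grid.

    Assumed layout: sample n = r*ks + c of location (i, j) goes to
    row i*ks + r, column j*ks + c.
    """
    h, w = len(samples), len(samples[0])
    grid = [[0.0] * (w * ks) for _ in range(h * ks)]
    for i in range(h):
        for j in range(w):
            for n, value in enumerate(samples[i][j]):
                r, c = divmod(n, ks)
                grid[i * ks + r][j * ks + c] = value
    return grid


def conv2d_single(grid, kernel, stride):
    """Single-channel 2-D cross-correlation without padding."""
    ks = len(kernel)
    out_h = (len(grid) - ks) // stride + 1
    out_w = (len(grid[0]) - ks) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for r in range(ks):
                for c in range(ks):
                    acc += kernel[r][c] * grid[i * stride + r][j * stride + c]
            row.append(acc)
        out.append(row)
    return out


ks = 3
# 2x2 output locations, each with ks*ks already-interpolated samples.
samples = [[[float(i + j + n) for n in range(ks * ks)] for j in range(2)]
           for i in range(2)]
kernel = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]

tiled_out = conv2d_single(tile_samples(samples, ks), kernel, stride=ks)
direct_out = [[sum(kernel[n // ks][n % ks] * samples[i][j][n]
                   for n in range(ks * ks)) for j in range(2)]
              for i in range(2)]
print(tiled_out == direct_out)  # True
```

With stride 1 the kernel would straddle blocks belonging to different output locations, so under this layout stride=kernel_size looks intentional rather than an error.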
Hi, Mr Lin, excellent work! I have a question after reading many implementations of deformable CNNs, including the official MXNet version. Why does the offset conv have a bias term, as in https://github.com/msracver/Deformable-ConvNets/blob/2b47f673e965701751109db07571ff61d827632d/deeplab/symbols/resnet_v1_101_deeplab_dcn.py#L679, while the conv that follows it has no bias term, as in https://github.com/msracver/Deformable-ConvNets/blob/2b47f673e965701751109db07571ff61d827632d/deeplab/symbols/resnet_v1_101_deeplab_dcn.py#L683 ?