Comments (5)
Ultimately the bandwidth is a hyperparameter, similar to the spatial resolution in a planar CNN. In a planar CNN the resolution is determined purely by the stride of the convolutions and the pooling, but in spherical CNNs you have more flexibility: you can in principle choose the resolution freely in each layer.
There are currently no good "best practices" for spherical CNN architecture design, and this includes the bandwidth/resolution, but there are a couple of considerations that would factor into the decision:
- Higher resolution means you can represent more small details
- Higher resolution means higher computational cost
- If you reduce the resolution too quickly, you would ignore units in the input, just like when you use 2D convolution with a stride that is larger than the filter size.
- If the task is classification, you would typically start with a high resolution and gradually decrease it. The final layer can have very low resolution, and each unit has a receptive field that covers the whole input.
- If the task is e.g. segmentation, you could try a U-net like architecture.
We choose bandwidth=30 because it's not too large, but still allows us to represent MNIST digits without losing too much detail. MNIST images are 28x28, but we project them only on the top of the sphere, so using a spherical grid with 2*b=60 samples per dimension, we can represent it fairly accurately.
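To make the arithmetic above concrete, here is a minimal sketch of how grid size grows with bandwidth. It assumes a spherical signal is stored on a 2b x 2b grid (as stated above) and, as a further assumption, that an SO(3) feature map is stored on a 2b x 2b x 2b grid. The helper names and the decreasing bandwidth schedule are hypothetical and only illustrate the cost trade-off; they are not taken from the library.

```python
def s2_grid_shape(bandwidth):
    """Shape of a spherical sampling grid with 2*b samples per dimension."""
    return (2 * bandwidth, 2 * bandwidth)


def so3_grid_shape(bandwidth):
    """Assumed shape of an SO(3) feature map grid: 2*b samples per angle."""
    return (2 * bandwidth, 2 * bandwidth, 2 * bandwidth)


def num_samples(shape):
    """Total number of grid points for a given grid shape."""
    total = 1
    for size in shape:
        total *= size
    return total


# Hypothetical schedule for a classifier: start high, decrease gradually.
bandwidth_schedule = [30, 20, 10, 5]

for b in bandwidth_schedule:
    s2 = num_samples(s2_grid_shape(b))
    so3 = num_samples(so3_grid_shape(b))
    print(f"b={b:2d}  S2 samples={s2:5d}  SO(3) samples={so3:7d}")
```

With b=30 the input grid has 60 x 60 = 3600 samples, compared with 28 x 28 = 784 pixels in an MNIST image. The cubic growth of the SO(3) grid is one reason to lower the bandwidth in deeper layers.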
from s2cnn.
Thank you for your quick reply!
As the paper describes: "We created two instances of this dataset: one in which each digit is projected on the northern hemisphere and one in which each projected digit is additionally randomly rotated."
Why do you just project on the northern hemisphere instead of the entire sphere?
Thank you!
Also, VGG Net contains some nn.MaxPool2d(kernel_size=2, stride=2) layers. Is there an implementation of a MaxPool2d() or MaxPool3d() operation for spherical CNNs? so3_integrate() is probably one possible solution, as mentioned in another issue. However, can we feed the output of so3_integrate() as the input of the next SO3Convolution()? I am afraid the shape is not appropriate. Or can we directly use torch.nn.MaxPool2d / torch.nn.MaxPool3d, the same as in a standard 2D CNN? Or perhaps we don't have to think about pooling at all, since there are no kernel_size or stride concepts in s2cnn?
Thank you for your suggestion.
We projected onto the northern hemisphere because that way the digit doesn't get stretched too much. It's just a toy experiment so we didn't think about this too much. Projecting it on the whole sphere would most likely work as well.
Max pooling is a bit tricky. You could just apply nn.MaxPool2d or nn.MaxPool3d to the array that stores the feature map, but due to the inhomogeneous sampling grid, this would not be equivariant. It would probably still be approximately equivariant, and may work in practice.
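The effect of the inhomogeneous grid can be illustrated with a small sketch. It assumes an equiangular grid with colatitudes beta_j = pi * (2j + 1) / (4b) and 2b samples per dimension; that grid convention and all function names are assumptions made for illustration. The sketch estimates the solid angle covered by a 2x2 pooling window near the pole and near the equator.

```python
import math


def colatitudes(bandwidth):
    """Assumed equiangular colatitudes beta_j = pi * (2j + 1) / (4b)."""
    n = 2 * bandwidth
    return [math.pi * (2 * j + 1) / (2 * n) for j in range(n)]


def cell_area(beta, bandwidth):
    """Approximate solid angle of one grid cell centred at colatitude beta."""
    n = 2 * bandwidth
    d_beta = math.pi / n        # spacing in colatitude
    d_alpha = 2 * math.pi / n   # spacing in longitude
    return math.sin(beta) * d_beta * d_alpha


def window_area(first_row, bandwidth, kernel=2):
    """Approximate solid angle of a kernel x kernel window starting at first_row."""
    betas = colatitudes(bandwidth)
    rows = range(first_row, first_row + kernel)
    # Each row contributes `kernel` cells of equal area.
    return kernel * sum(cell_area(betas[r], bandwidth) for r in rows)


b = 30
polar = window_area(0, b)           # window touching the north pole
equatorial = window_area(b - 1, b)  # window straddling the equator
print(f"equatorial / polar window area: {equatorial / polar:.1f}")
```

Under these assumptions, the same 2x2 window in array coordinates covers a much larger patch of the sphere at the equator than near the pole, which is why pooling directly on the array is only approximately equivariant.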
so3_integrate() does a global average pooling. If you want to do a local average pooling, you could use a convolution with a fixed Gaussian blur filter, and sample the result on a low-resolution (low-bandwidth) grid.
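The local average pooling idea can be sketched directly in the spatial domain. This is not how the library implements convolutions; it is a brute-force illustration that assumes a signal on the sphere (not on SO(3)), the same assumed equiangular grid with 2b samples per dimension, and sin(beta) as an approximate area weight. The function names and the sigma parameter are hypothetical.

```python
import math


def grid(bandwidth):
    """Assumed equiangular grid: 2*b colatitudes and 2*b longitudes."""
    n = 2 * bandwidth
    betas = [math.pi * (2 * j + 1) / (2 * n) for j in range(n)]
    alphas = [2 * math.pi * k / n for k in range(n)]
    return betas, alphas


def to_xyz(beta, alpha):
    """Unit vector for colatitude beta and longitude alpha."""
    return (math.sin(beta) * math.cos(alpha),
            math.sin(beta) * math.sin(alpha),
            math.cos(beta))


def gaussian_pool(signal, b_in, b_out, sigma):
    """Blur `signal` (indexed [beta][alpha], bandwidth b_in) with a Gaussian
    in geodesic distance and sample the result on a bandwidth b_out grid."""
    betas_in, alphas_in = grid(b_in)
    betas_out, alphas_out = grid(b_out)
    # Input points with approximate area weights sin(beta).
    points = [(to_xyz(bi, ai), math.sin(bi), signal[j][k])
              for j, bi in enumerate(betas_in)
              for k, ai in enumerate(alphas_in)]
    pooled = []
    for bo in betas_out:
        row = []
        for ao in alphas_out:
            centre = to_xyz(bo, ao)
            num = den = 0.0
            for xyz, area, value in points:
                dot = sum(c * p for c, p in zip(centre, xyz))
                dist = math.acos(max(-1.0, min(1.0, dot)))  # geodesic distance
                w = area * math.exp(-0.5 * (dist / sigma) ** 2)
                num += w * value
                den += w
            row.append(num / den)  # normalised weighted average
        pooled.append(row)
    return pooled


# Example: pool the height function cos(beta) from bandwidth 8 down to 4.
b_in, b_out = 8, 4
betas, alphas = grid(b_in)
signal = [[math.cos(beta) for _ in alphas] for beta in betas]
pooled = gaussian_pool(signal, b_in, b_out, sigma=0.3)
print(len(pooled), len(pooled[0]))
```

Because the Gaussian depends only on geodesic distance and the samples are area-weighted, this pooling does not favour any region of the grid, unlike a fixed window applied to the array.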
Many thanks for your quick reply! Your work is fascinating!