A Spatial Transformer Network (STN) attends to a particular region of an image by applying a learned transformation to the input. The code in this repository applies an affine transformation, but other transformations could be explored.
I am trying to use this to learn rotation/scale/translation invariance on the affNIST dataset.
However, it seems to me that this implementation does not really learn rotation. Looking closely at your sample image, I also do not see any rotation being applied.
Is there something I can do so that the network learns rotation, given that scaling and translation already work well?
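For what it's worth, a 2x3 affine matrix can already express rotation, so the limitation is presumably in training rather than capacity. Below is a minimal NumPy sketch (my own illustration, not code from this repository) of the STN sampling step, showing that a rotation matrix fed into the affine grid does rotate the image. The function names `affine_grid` and `sample_nearest` are hypothetical helpers written for this example:

```python
import numpy as np

def affine_grid(theta, H, W):
    # Build an H x W sampling grid in normalized coords [-1, 1], as in an STN.
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # 3 x (H*W)
    # theta is the 2x3 affine matrix the localization network would predict.
    return (theta @ coords).reshape(2, H, W)  # source (x, y) for each output pixel

def sample_nearest(img, grid):
    # Nearest-neighbor sampling (a real STN uses differentiable bilinear sampling).
    H, W = img.shape
    xs = np.clip(np.round((grid[0] + 1) / 2 * (W - 1)).astype(int), 0, W - 1)
    ys = np.clip(np.round((grid[1] + 1) / 2 * (H - 1)).astype(int), 0, H - 1)
    return img[ys, xs]

# A 90-degree rotation as a 2x3 affine matrix (no scale, no translation).
a = np.pi / 2
theta = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0]])

img = np.zeros((5, 5))
img[0, :] = 1.0                 # bright top row
out = sample_nearest(img, affine_grid(theta, 5, 5))
# The bright row ends up as a bright column, i.e. the image is rotated.
```

One thing that might help, if the repo does not already do it: initialize the final layer of the localization network so that it outputs the identity transform at the start of training (zero weights, bias `[1, 0, 0, 0, 1, 0]`). This is a commonly recommended trick for STNs and can make it easier for the rotation components of `theta` to be learned gradually.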