Extracting Invariant Features From Images Using An Equivariant Autoencoder
Proceedings of The 10th Asian Conference on Machine Learning, PMLR 95:438-453, 2018.
Convolutional Neural Networks achieve state of the art results in many image recognition tasks. While their structure makes predictions invariant to small translations, some recognition tasks require invariance to other transformations, like rotation and reflection. We apply group convolutions to build an Equivariant Autoencoder with embeddings that change predictably under the specified set of transformations. We then introduce two approaches to extracting invariant features from these embeddings—Gram Pooling and Equivariant Attention. These two methods separate transformation-relevant information from all other image features. We use obtained embeddings in classification and clustering tasks and show an improvement of the classification quality on the learned embeddings compared to pure autoencoder and average pooling method. A visualization of the learned manifold shows that objects of the same class tend to cluster together, which was not observed for the pure autoencoder embeddings.