Cross-Domain 3D Equivariant Image Embeddings

Carlos Esteves, Avneesh Sud, Zhengyi Luo, Kostas Daniilidis, Ameesh Makadia
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1812-1822, 2019.

Abstract

Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations, making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper, we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should commute with rotations of the object. We introduce a cross-domain embedding from 2D images into a spherical CNN latent space. This embedding encodes images with 3D shape properties and is equivariant to 3D rotations of the observed object. The model is supervised only by target embeddings obtained from a spherical CNN pretrained for 3D shape classification. We show that learning a rich embedding for images with appropriate geometric structure is sufficient for tackling varied applications, such as relative pose estimation and novel view synthesis, without requiring additional task-specific supervision.
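
The equivariance requirement described in the abstract can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the paper: x is a 3D object, \pi renders it to a 2D image, f is the learned image embedding, g is the pretrained spherical CNN, R is a rotation in SO(3), and \Lambda_R is the corresponding rotation acting on a spherical feature map.

```latex
% Illustrative notation only (assumed, not the paper's own symbols).
\begin{align}
  % Supervision: the image embedding is trained to match the spherical CNN target.
  f(\pi(x)) &\approx g(x) \\
  % Equivariance of the pretrained spherical CNN to 3D rotations.
  g(R x) &= \Lambda_R \, g(x), \qquad R \in SO(3) \\
  % Consequence: rotating the object rotates the image embedding.
  f(\pi(R x)) &\approx g(R x) = \Lambda_R \, g(x) \approx \Lambda_R \, f(\pi(x))
\end{align}
```

Under these assumptions, the third line is what "embedding the image of a 3D object should commute with rotations of the object" amounts to: it follows from the supervision in the first line together with the equivariance of the spherical CNN in the second.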

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-esteves19a,
  title = {Cross-Domain 3{D} Equivariant Image Embeddings},
  author = {Esteves, Carlos and Sud, Avneesh and Luo, Zhengyi and Daniilidis, Kostas and Makadia, Ameesh},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {1812--1822},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/esteves19a/esteves19a.pdf},
  url = {https://proceedings.mlr.press/v97/esteves19a.html},
  abstract = {Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should commute with rotations of the object. We introduce a cross-domain embedding from 2D images into a spherical CNN latent space. This embedding encodes images with 3D shape properties and is equivariant to 3D rotations of the observed object. The model is supervised only by target embeddings obtained from a spherical CNN pretrained for 3D shape classification. We show that learning a rich embedding for images with appropriate geometric structure is sufficient for tackling varied applications, such as relative pose estimation and novel view synthesis, without requiring additional task-specific supervision.}
}
Endnote
%0 Conference Paper
%T Cross-Domain 3D Equivariant Image Embeddings
%A Carlos Esteves
%A Avneesh Sud
%A Zhengyi Luo
%A Kostas Daniilidis
%A Ameesh Makadia
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-esteves19a
%I PMLR
%P 1812--1822
%U https://proceedings.mlr.press/v97/esteves19a.html
%V 97
%X Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should commute with rotations of the object. We introduce a cross-domain embedding from 2D images into a spherical CNN latent space. This embedding encodes images with 3D shape properties and is equivariant to 3D rotations of the observed object. The model is supervised only by target embeddings obtained from a spherical CNN pretrained for 3D shape classification. We show that learning a rich embedding for images with appropriate geometric structure is sufficient for tackling varied applications, such as relative pose estimation and novel view synthesis, without requiring additional task-specific supervision.
APA
Esteves, C., Sud, A., Luo, Z., Daniilidis, K., & Makadia, A. (2019). Cross-Domain 3D Equivariant Image Embeddings. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:1812-1822. Available from https://proceedings.mlr.press/v97/esteves19a.html.
