Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:65556-65574, 2025.
Abstract
Orientation is a fundamental attribute of objects, essential for understanding their spatial pose and arrangement. However, practical solutions for estimating the orientation of open-world objects in monocular images remain underexplored. In this work, we introduce Orient Anything, the first foundation model for zero-shot object orientation estimation. A key challenge in this task is the scarcity of orientation annotations for open-world objects. To address this, we propose leveraging the vast resources of 3D models. By developing a pipeline to annotate the front face of 3D objects and render them from random viewpoints, we curate 2 million images with precise orientation annotations across a wide variety of object categories. To fully leverage the dataset, we design a robust training objective that models the 3D orientation as probability distributions over three angles and predicts the object orientation by fitting these distributions. We also propose several strategies to further improve synthetic-to-real transfer. Our model achieves state-of-the-art orientation estimation accuracy on both rendered and real images, demonstrating strong zero-shot capabilities across various scenarios. Furthermore, it shows potential for enhancing high-level applications, such as understanding complex spatial concepts in images and adjusting 3D object pose.
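As an illustration of what "modeling an angle as a probability distribution" can look like, the following is a minimal sketch for a single angle, not the paper's implementation. It assumes a wrapped Gaussian target over 360 one-degree bins, a cross-entropy fitting loss, and argmax decoding; the function names, bin count, and `sigma_deg` value are all hypothetical choices made for this example.

```python
import math


def circular_gaussian_target(angle_deg, num_bins=360, sigma_deg=20.0):
    """Discretized Gaussian over angle bins, wrapped around the circle.

    num_bins and sigma_deg are illustrative assumptions, not values
    taken from the paper.
    """
    bin_width = 360.0 / num_bins
    weights = []
    for i in range(num_bins):
        center = i * bin_width
        # Shortest angular distance between the bin center and the label.
        diff = abs(center - angle_deg) % 360.0
        diff = min(diff, 360.0 - diff)
        weights.append(math.exp(-0.5 * (diff / sigma_deg) ** 2))
    total = sum(weights)
    return [w / total for w in weights]


def cross_entropy(target, predicted, eps=1e-12):
    """Loss for fitting a predicted distribution to the target distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def decode_angle(probs):
    """Read an angle back out as the center of the most probable bin."""
    bin_width = 360.0 / len(probs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best * bin_width


target = circular_gaussian_target(350.0)
near = circular_gaussian_target(345.0)   # stand-in for a close prediction
far = circular_gaussian_target(170.0)    # stand-in for a distant prediction
loss_near = cross_entropy(target, near)
loss_far = cross_entropy(target, far)
decoded = decode_angle(target)
```

Under these assumptions, a prediction a few degrees off incurs a small loss while a prediction on the opposite side of the circle incurs a large one, and the wrap-around keeps 350° and 0° close. A full orientation would use one such distribution per angle.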