Keypoints-aware Object Detection

Ayush Jaiswal, Simranjit Singh, Yue Wu, Pradeep Natarajan, Premkumar Natarajan
NeurIPS 2020 Workshop on Pre-registration in Machine Learning, PMLR 148:62-72, 2021.

Abstract

We propose a new framework for object detection that guides the model to explicitly reason about translation- and rotation-invariant object keypoints to boost model robustness. The model first predicts keypoints for each object in the image and then derives bounding-box predictions from the keypoints. While object classification and box regression are supervised, keypoints are learned through self-supervision by comparing keypoints predicted for each image with those for its affine transformations. Thus, the framework does not require additional annotations and can be trained on standard object detection datasets. The proposed model is designed to be anchor-free, proposal-free, and single-stage in order to avoid associated computational overhead and hyperparameter tuning. Furthermore, the generated keypoints allow for inferring close-fit rotated bounding boxes and coarse segmentation for free. Our model shows promising results on VOC. Our findings regarding training difficulties and pitfalls pave the way for future research in this direction.
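The two ideas in the abstract, deriving a box from predicted keypoints and supervising keypoints by comparing predictions across affine transformations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min/max box rule, and the mean-squared consistency loss are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def box_from_keypoints(keypoints):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) around the keypoints.

    Assumption: a simple min/max rule; the paper may derive boxes differently.
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))

def apply_affine(keypoints, angle_deg, tx, ty):
    """Rotate each point about the origin by angle_deg, then translate by (tx, ty)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in keypoints]

def consistency_loss(kp_original, kp_on_transformed, angle_deg, tx, ty):
    """Mean squared distance between the transformed original keypoints and
    the keypoints predicted on the transformed image (hypothetical loss)."""
    expected = apply_affine(kp_original, angle_deg, tx, ty)
    total = sum(
        (ex - px) ** 2 + (ey - py) ** 2
        for (ex, ey), (px, py) in zip(expected, kp_on_transformed)
    )
    return total / len(expected)

# Hypothetical keypoints for one object; a real model would predict these.
kp = [(1.0, 1.0), (3.0, 1.0), (3.0, 2.0), (1.0, 2.0)]
box = box_from_keypoints(kp)

# A perfectly consistent prediction on the rotated image yields (near) zero loss.
kp_rotated = apply_affine(kp, 90.0, 0.0, 0.0)
loss_consistent = consistency_loss(kp, kp_rotated, 90.0, 0.0, 0.0)

# A prediction that drifts by one pixel in x is penalized.
kp_drifted = [(x + 1.0, y) for x, y in kp_rotated]
loss_drifted = consistency_loss(kp, kp_drifted, 90.0, 0.0, 0.0)
```

Because the loss only compares the model's own predictions on an image and on its transformed copy, it needs no keypoint annotations, which matches the abstract's claim that standard detection datasets suffice.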

Cite this Paper


BibTeX
@InProceedings{pmlr-v148-jaiswal21a,
  title = {Keypoints-aware Object Detection},
  author = {Jaiswal, Ayush and Singh, Simranjit and Wu, Yue and Natarajan, Pradeep and Natarajan, Premkumar},
  booktitle = {NeurIPS 2020 Workshop on Pre-registration in Machine Learning},
  pages = {62--72},
  year = {2021},
  editor = {Bertinetto, Luca and Henriques, João F. and Albanie, Samuel and Paganini, Michela and Varol, Gül},
  volume = {148},
  series = {Proceedings of Machine Learning Research},
  month = {11 Dec},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v148/jaiswal21a/jaiswal21a.pdf},
  url = {https://proceedings.mlr.press/v148/jaiswal21a.html},
  abstract = {We propose a new framework for object detection that guides the model to explicitly reason about translation and rotation invariant object keypoints to boost model robustness. The model first predicts keypoints for each object in the image and then derives bounding-box predictions from the keypoints. While object classification and box regression are supervised, keypoints are learned through self-supervision by comparing keypoints predicted for each image with those for its affine transformations. Thus, the framework does not require additional annotations and can be trained on standard object detection datasets. The proposed model is designed to be anchor-free, proposal-free, and single-stage in order to avoid associated computational overhead and hyperparameter tuning. Furthermore, the generated keypoints allow for inferring close-fit rotated bounding boxes and coarse segmentation for free. Results of our model on VOC show promising results. Our findings regarding training difficulties and pitfalls pave the way for future research in this direction.}
}
Endnote
%0 Conference Paper
%T Keypoints-aware Object Detection
%A Ayush Jaiswal
%A Simranjit Singh
%A Yue Wu
%A Pradeep Natarajan
%A Premkumar Natarajan
%B NeurIPS 2020 Workshop on Pre-registration in Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Luca Bertinetto
%E João F. Henriques
%E Samuel Albanie
%E Michela Paganini
%E Gül Varol
%F pmlr-v148-jaiswal21a
%I PMLR
%P 62--72
%U https://proceedings.mlr.press/v148/jaiswal21a.html
%V 148
%X We propose a new framework for object detection that guides the model to explicitly reason about translation and rotation invariant object keypoints to boost model robustness. The model first predicts keypoints for each object in the image and then derives bounding-box predictions from the keypoints. While object classification and box regression are supervised, keypoints are learned through self-supervision by comparing keypoints predicted for each image with those for its affine transformations. Thus, the framework does not require additional annotations and can be trained on standard object detection datasets. The proposed model is designed to be anchor-free, proposal-free, and single-stage in order to avoid associated computational overhead and hyperparameter tuning. Furthermore, the generated keypoints allow for inferring close-fit rotated bounding boxes and coarse segmentation for free. Results of our model on VOC show promising results. Our findings regarding training difficulties and pitfalls pave the way for future research in this direction.
APA
Jaiswal, A., Singh, S., Wu, Y., Natarajan, P. & Natarajan, P. (2021). Keypoints-aware Object Detection. NeurIPS 2020 Workshop on Pre-registration in Machine Learning, in Proceedings of Machine Learning Research 148:62-72. Available from https://proceedings.mlr.press/v148/jaiswal21a.html.
