SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation

Galen O’Shea, Majid Komeili
Proceedings of The 2nd Gaze Meets ML workshop, PMLR 226:197-218, 2024.

Abstract

Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.

Cite this Paper


BibTeX
@InProceedings{pmlr-v226-o-shea24a,
  title = {SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation},
  author = {O'Shea, Galen and Komeili, Majid},
  booktitle = {Proceedings of The 2nd Gaze Meets ML workshop},
  pages = {197--218},
  year = {2024},
  editor = {Madu Blessing, Amarachi and Wu, Joy and Zario, Danca and Krupinski, Elizabeth and Kashyap, Satyananda and Karargyris, Alexandros},
  volume = {226},
  series = {Proceedings of Machine Learning Research},
  month = {16 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v226/o-shea24a/o-shea24a.pdf},
  url = {https://proceedings.mlr.press/v226/o-shea24a.html},
  abstract = {Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.}
}
Endnote
%0 Conference Paper
%T SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation
%A Galen O’Shea
%A Majid Komeili
%B Proceedings of The 2nd Gaze Meets ML workshop
%C Proceedings of Machine Learning Research
%D 2024
%E Amarachi Madu Blessing
%E Joy Wu
%E Danca Zario
%E Elizabeth Krupinski
%E Satyananda Kashyap
%E Alexandros Karargyris
%F pmlr-v226-o-shea24a
%I PMLR
%P 197--218
%U https://proceedings.mlr.press/v226/o-shea24a.html
%V 226
%X Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.
APA
O’Shea, G. & Komeili, M. (2024). SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation. Proceedings of The 2nd Gaze Meets ML workshop, in Proceedings of Machine Learning Research 226:197-218. Available from https://proceedings.mlr.press/v226/o-shea24a.html.
