SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation

Galen O’Shea; Majid Komeili

Proceedings of The 2nd Gaze Meets ML workshop, PMLR 226:197-218, 2024.

Abstract

Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.

Cite this Paper

BibTeX


@InProceedings{pmlr-v226-o-shea24a,
  title = 	 {SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation},
  author =       {O'Shea, Galen and Komeili, Majid},
  booktitle = 	 {Proceedings of The 2nd Gaze Meets ML workshop},
  pages = 	 {197--218},
  year = 	 {2024},
  editor = 	 {Madu Blessing, Amarachi and Wu, Joy and Zanca, Dario and Krupinski, Elizabeth and Kashyap, Satyananda and Karargyris, Alexandros},
  volume = 	 {226},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v226/o-shea24a/o-shea24a.pdf},
  url = 	 {https://proceedings.mlr.press/v226/o-shea24a.html},
  abstract = 	 {Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.}
}

Endnote

%0 Conference Paper
%T SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation
%A Galen O’Shea
%A Majid Komeili
%B Proceedings of The 2nd Gaze Meets ML workshop
%C Proceedings of Machine Learning Research
%D 2024
%E Amarachi Madu Blessing
%E Joy Wu
%E Dario Zanca
%E Elizabeth Krupinski
%E Satyananda Kashyap
%E Alexandros Karargyris
%F pmlr-v226-o-shea24a
%I PMLR
%P 197--218
%U https://proceedings.mlr.press/v226/o-shea24a.html
%V 226
%X Gaze estimation is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze estimation software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution (SR) has been shown to remove these degradations and improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze estimation and demonstrates that not all SR models preserve the gaze direction. We propose a two-step framework for gaze estimation based on the SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze estimation and propose a novel architecture “SuperVision” by fusing an SR backbone network to a ResNet18. While only using 20% of the data, the proposed SuperVision architecture outperforms the state-of-the-art GazeTR method by 15.5%.

APA


O’Shea, G. & Komeili, M. (2024). SuperVision: Self-Supervised Super-Resolution for Appearance-Based Gaze Estimation. Proceedings of The 2nd Gaze Meets ML workshop, in Proceedings of Machine Learning Research 226:197-218. Available from https://proceedings.mlr.press/v226/o-shea24a.html.
