ST(OR)$^2$: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room

Idris Hamoud; Muhammad Abdullah Jamal; Vinkle Srivastav; Didier MUTTER; Nicolas Padoy; Omid Mohareri

Medical Imaging with Deep Learning, PMLR 227:1254-1268, 2024.

Abstract

Surgical robotics holds much promise for improving patient safety and clinician experience in the Operating Room (OR). However, it also comes with new challenges, requiring strong team coordination and effective OR management. Automatic detection of surgical activities is a key requirement for developing AI-based intelligent tools to tackle these challenges. The current state-of-the-art surgical activity recognition methods, however, operate on image-based representations and depend on large-scale labeled datasets whose collection is time-consuming and resource-intensive. This work proposes a new sample-efficient and object-based approach for surgical activity recognition in the OR. Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR. We conduct experiments in a low-data regime study for long video activity recognition. We also benchmark our method against other object-centric approaches on clip-level action classification and show superior performance.

Cite this Paper

BibTeX


@InProceedings{pmlr-v227-hamoud24a,
  title = 	 {ST(OR)$^2$: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room},
  author =       {Hamoud, Idris and Jamal, Muhammad Abdullah and Srivastav, Vinkle and MUTTER, Didier and Padoy, Nicolas and Mohareri, Omid},
  booktitle = 	 {Medical Imaging with Deep Learning},
  pages = 	 {1254--1268},
  year = 	 {2024},
  editor = 	 {Oguz, Ipek and Noble, Jack and Li, Xiaoxiao and Styner, Martin and Baumgartner, Christian and Rusu, Mirabela and Heimann, Tobias and Kontos, Despina and Landman, Bennett and Dawant, Benoit},
  volume = 	 {227},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--12 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v227/hamoud24a/hamoud24a.pdf},
  url = 	 {https://proceedings.mlr.press/v227/hamoud24a.html},
  abstract = 	 {Surgical robotics holds much promise for improving patient safety and clinician experience in the Operating Room (OR). However, it also comes with new challenges, requiring strong team coordination and effective OR management. Automatic detection of surgical activities is a key requirement for developing AI-based intelligent tools to tackle these challenges. The current state-of-the-art surgical activity recognition methods however operate on image-based representations and depend on large-scale labeled datasets whose collection is time-consuming and resource-expensive. This work proposes a new sample-efficient and object-based approach for surgical activity recognition in the OR. Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR. We conduct experiments in a low-data regime study for long video activity recognition. We also benchmark our method against other object-centric approaches on clip-level action classification and show superior performance.}
}

Endnote

%0 Conference Paper
%T ST(OR)$^2$: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room
%A Idris Hamoud
%A Muhammad Abdullah Jamal
%A Vinkle Srivastav
%A Didier MUTTER
%A Nicolas Padoy
%A Omid Mohareri
%B Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ipek Oguz
%E Jack Noble
%E Xiaoxiao Li
%E Martin Styner
%E Christian Baumgartner
%E Mirabela Rusu
%E Tobias Heimann
%E Despina Kontos
%E Bennett Landman
%E Benoit Dawant
%F pmlr-v227-hamoud24a
%I PMLR
%P 1254--1268
%U https://proceedings.mlr.press/v227/hamoud24a.html
%V 227
%X Surgical robotics holds much promise for improving patient safety and clinician experience in the Operating Room (OR). However, it also comes with new challenges, requiring strong team coordination and effective OR management. Automatic detection of surgical activities is a key requirement for developing AI-based intelligent tools to tackle these challenges. The current state-of-the-art surgical activity recognition methods however operate on image-based representations and depend on large-scale labeled datasets whose collection is time-consuming and resource-expensive. This work proposes a new sample-efficient and object-based approach for surgical activity recognition in the OR. Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR. We conduct experiments in a low-data regime study for long video activity recognition. We also benchmark our method against other object-centric approaches on clip-level action classification and show superior performance.

APA


Hamoud, I., Jamal, M. A., Srivastav, V., MUTTER, D., Padoy, N. & Mohareri, O. (2024). ST(OR)$^2$: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room. Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 227:1254-1268. Available from https://proceedings.mlr.press/v227/hamoud24a.html.
