OmniNet: A Multi-Modality Neural Network for Robust Remote Respiratory Rate Measurement from Facial Video

Tsai-Ni Lin, An-Sheng Liu, Li-Chen Fu
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:570-593, 2026.

Abstract

Remote respiratory rate (RR) measurement has gained traction in recent studies due to its ability to reduce healthcare professionals’ workload and patient discomfort. Recent studies have targeted this problem through remote photoplethysmography (rPPG) to capture subtle facial color changes. However, this technique is sensitive to lighting and motion variations. To this end, we propose OmniNet, a multimodal neural network that integrates image data processed through 3D convolutional neural networks (3D CNNs) with point of interest (POI) motion data and passes the fused features to a Bidirectional Long Short-Term Memory (BiLSTM) network to model long-term temporal dependencies. OmniNet achieves state-of-the-art performance by effectively capturing comprehensive spatial and temporal information while reducing illumination variation and motion-induced artifacts. It also requires fewer computational resources and enables faster inference compared to Transformer networks.
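The fusion pipeline described in the abstract can be sketched as a small two-branch model. This is a minimal illustration under assumed design choices (channel widths, layer counts, POI feature dimension, and the regression head are our assumptions, not the paper's exact architecture): a 3D CNN encodes the facial-video clip into per-frame features, an MLP encodes per-frame POI motion features, and a BiLSTM models the fused sequence.

```python
# Hedged sketch of an OmniNet-style two-branch fusion model. All layer
# sizes and the per-frame regression head are illustrative assumptions.
import torch
import torch.nn as nn

class TwoBranchRRNet(nn.Module):
    def __init__(self, poi_dim=16, hidden=64):
        super().__init__()
        # Image branch: 3D convolution over (C, T, H, W); adaptive pooling
        # collapses H and W so one feature vector remains per frame.
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # keep T, pool H and W
        )
        # Motion branch: per-frame POI displacement features.
        self.poi_mlp = nn.Sequential(nn.Linear(poi_dim, 16), nn.ReLU())
        # Temporal model over the fused per-frame features.
        self.bilstm = nn.LSTM(16 + 16, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # per-frame respiratory signal

    def forward(self, video, poi):
        # video: (B, 3, T, H, W); poi: (B, T, poi_dim)
        img = self.cnn3d(video).squeeze(-1).squeeze(-1)      # (B, 16, T)
        img = img.transpose(1, 2)                            # (B, T, 16)
        fused = torch.cat([img, self.poi_mlp(poi)], dim=-1)  # (B, T, 32)
        seq, _ = self.bilstm(fused)                          # (B, T, 2*hidden)
        return self.head(seq).squeeze(-1)                    # (B, T)

model = TwoBranchRRNet()
out = model(torch.randn(2, 3, 8, 32, 32), torch.randn(2, 8, 16))
print(out.shape)  # torch.Size([2, 8])
```

The BiLSTM here stands in for the long-range temporal modeling the abstract contrasts with Transformers: it processes the fused sequence in both directions at a cost linear in sequence length, which is the source of the claimed efficiency advantage.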

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-lin26a,
  title     = {OmniNet: A Multi-Modality Neural Network for Robust Remote Respiratory Rate Measurement from Facial Video},
  author    = {Lin, Tsai-Ni and Liu, An-Sheng and Fu, Li-Chen},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages     = {570--593},
  year      = {2026},
  editor    = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume    = {315},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--10 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/lin26a/lin26a.pdf},
  url       = {https://proceedings.mlr.press/v315/lin26a.html},
  abstract  = {Remote respiratory rate (RR) measurement has gained traction in recent studies due to its ability to reduce healthcare professionals’ workload and patient discomfort. Recent studies have targeted this problem through remote photoplethysmography (rPPG) to capture subtle facial color changes. However, this technique is sensitive to lighting and motion variations. To this end, we propose OmniNet, a multimodal neural network that integrates image data processed through 3D convolutional neural networks (3D CNNs) with point of interest (POI) motion data and passes the fused features to a Bidirectional Long Short-Term Memory (BiLSTM) network to model long-term temporal dependencies. OmniNet achieves state-of-the-art performance by effectively capturing comprehensive spatial and temporal information while reducing illumination variation and motion-induced artifacts. It also requires fewer computational resources and enables faster inference compared to Transformer networks.}
}
Endnote
%0 Conference Paper
%T OmniNet: A Multi-Modality Neural Network for Robust Remote Respiratory Rate Measurement from Facial Video
%A Tsai-Ni Lin
%A An-Sheng Liu
%A Li-Chen Fu
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-lin26a
%I PMLR
%P 570--593
%U https://proceedings.mlr.press/v315/lin26a.html
%V 315
%X Remote respiratory rate (RR) measurement has gained traction in recent studies due to its ability to reduce healthcare professionals’ workload and patient discomfort. Recent studies have targeted this problem through remote photoplethysmography (rPPG) to capture subtle facial color changes. However, this technique is sensitive to lighting and motion variations. To this end, we propose OmniNet, a multimodal neural network that integrates image data processed through 3D convolutional neural networks (3D CNNs) with point of interest (POI) motion data and passes the fused features to a Bidirectional Long Short-Term Memory (BiLSTM) network to model long-term temporal dependencies. OmniNet achieves state-of-the-art performance by effectively capturing comprehensive spatial and temporal information while reducing illumination variation and motion-induced artifacts. It also requires fewer computational resources and enables faster inference compared to Transformer networks.
APA
Lin, T., Liu, A. & Fu, L. (2026). OmniNet: A Multi-Modality Neural Network for Robust Remote Respiratory Rate Measurement from Facial Video. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:570-593. Available from https://proceedings.mlr.press/v315/lin26a.html.
