Pain Evaluation in Video using Extended Multitask Learning from Multidimensional Measurements

Xiaojing Xu, Jeannie S Huang, Virginia R De Sa
Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR 116:141-154, 2020.

Abstract

Previous work on automated pain detection from facial expressions has primarily focused on frame-level pain metrics based on specific facial muscle activations, such as Prkachin and Solomon Pain Intensity (PSPI). However, the current gold standard pain metric is the patient’s self-reported visual analog scale (VAS) level, which is a video-level measure. In this work, we propose a multitask multidimensional-pain model to directly predict VAS from video. Our model consists of three stages: (1) a VGGFace neural network model trained to predict frame-level PSPI, where multitask learning is applied, i.e., individual facial action units are predicted together with PSPI to improve the learning of PSPI; (2) a fully connected neural network to estimate sequence-level pain scores from frame-level PSPI predictions, where again we use multitask learning to learn multidimensional pain scales instead of VAS alone; and (3) an optimal linear combination of the multidimensional pain predictions to obtain a final estimation of VAS. We show on the UNBC-McMaster Shoulder Pain dataset that our multitask multidimensional-pain method achieves state-of-the-art performance with a mean absolute error (MAE) of 1.95 and an intraclass correlation coefficient (ICC) of 0.43. While still not as good as trained human observer predictions provided with the dataset, when we average our estimates with those human estimates, our model improves their MAE from 1.76 to 1.58. Trained on the UNBC-McMaster dataset and applied directly, with no further training or fine-tuning, to a separate dataset of facial videos recorded during post-appendectomy physical exams, our model also outperforms previous work by 6% on the area under the ROC curve (AUC) metric.
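To make the three-stage design concrete, here is a minimal PyTorch sketch of the pipeline described in the abstract. It is an illustration under assumptions, not the authors' implementation: the backbone stand-in, layer sizes, the pooled statistics (mean/max/std/last), and the counts of AU and pain-scale outputs are all invented for the example; the paper itself uses a pretrained VGGFace network and the dataset's own multidimensional pain labels.

# Hedged sketch of the three-stage pipeline from the abstract.
# Layer sizes, pooling statistics, and the numbers of AU and
# pain-scale outputs are illustrative assumptions, not values
# taken from the paper.
import torch
import torch.nn as nn

class FramePSPINet(nn.Module):
    """Stage 1: per-frame PSPI regressor with a multitask head
    that also predicts individual facial action units (AUs),
    which the paper reports improves PSPI learning."""
    def __init__(self, feat_dim=4096, n_aus=10):
        super().__init__()
        # Stand-in for a pretrained VGGFace backbone (assumption).
        self.backbone = nn.Sequential(nn.Flatten(),
                                      nn.LazyLinear(feat_dim),
                                      nn.ReLU())
        self.pspi_head = nn.Linear(feat_dim, 1)     # main task
        self.au_head = nn.Linear(feat_dim, n_aus)   # auxiliary tasks

    def forward(self, frames):                      # frames: (T, C, H, W)
        h = self.backbone(frames)
        return self.pspi_head(h).squeeze(-1), self.au_head(h)

class SequencePainNet(nn.Module):
    """Stage 2: fully connected net mapping the frame-level PSPI
    sequence to sequence-level multidimensional pain scores
    (VAS plus auxiliary pain scales)."""
    def __init__(self, n_stats=4, n_scales=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n_stats, 32), nn.ReLU(),
                                 nn.Linear(32, n_scales))

    def forward(self, pspi_seq):                    # pspi_seq: (T,)
        # Pool the variable-length sequence into fixed statistics
        # (mean/max/std/last is an illustrative choice).
        stats = torch.stack([pspi_seq.mean(), pspi_seq.max(),
                             pspi_seq.std(), pspi_seq[-1]])
        return self.mlp(stats)                      # (n_scales,) pain scores

def combine_to_vas(pain_scores, weights, bias):
    """Stage 3: linear combination of the multidimensional pain
    predictions into a final VAS estimate; the weights would be
    fit (e.g. by least squares) on training-set predictions."""
    return pain_scores @ weights + bias

# Usage sketch:
#   pspi_seq, _ = FramePSPINet()(video_frames)
#   pain_scores = SequencePainNet()(pspi_seq)
#   vas_estimate = combine_to_vas(pain_scores, w, b)

The same linear-combination idea extends naturally to the ensemble reported in the abstract, where averaging the model's VAS estimate with a trained human observer's rating lowered MAE from 1.76 to 1.58.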

Cite this Paper


BibTeX
@InProceedings{pmlr-v116-xu20a,
  title     = {{Pain Evaluation in Video using Extended Multitask Learning from Multidimensional Measurements}},
  author    = {Xu, Xiaojing and Huang, Jeannie S and {De Sa}, Virginia R},
  booktitle = {Proceedings of the Machine Learning for Health NeurIPS Workshop},
  pages     = {141--154},
  year      = {2020},
  editor    = {Dalca, Adrian V. and McDermott, Matthew B.A. and Alsentzer, Emily and Finlayson, Samuel G. and Oberst, Michael and Falck, Fabian and Beaulieu-Jones, Brett},
  volume    = {116},
  series    = {Proceedings of Machine Learning Research},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v116/xu20a/xu20a.pdf},
  url       = {https://proceedings.mlr.press/v116/xu20a.html}
}
APA
Xu, X., Huang, J. S., & De Sa, V. R. (2020). Pain Evaluation in Video using Extended Multitask Learning from Multidimensional Measurements. Proceedings of the Machine Learning for Health NeurIPS Workshop, in Proceedings of Machine Learning Research 116:141-154. Available from https://proceedings.mlr.press/v116/xu20a.html.
