Denoising Autoencoders for Learning from Noisy Patient-Reported Data

Harry Rubin-Falcone, Joyce M. Lee, Jenna Wiens
Proceedings of the Conference on Health, Inference, and Learning, PMLR 209:393-409, 2023.

Abstract

Healthcare datasets often include patient-reported values, such as mood, symptoms, and meals, which can be subject to varying levels of human error. Improving the accuracy of patient-reported data could help in several downstream tasks, such as remote patient monitoring. In this study, we propose a novel denoising autoencoder (DAE) approach to denoise patient-reported data, drawing inspiration from recent work in computer vision. Our approach is based on the observation that noisy patient-reported data are often collected alongside higher-fidelity data from wearable sensors. We leverage these auxiliary data to improve the accuracy of the patient-reported data. Our approach combines key ideas from DAEs with co-teaching to iteratively filter and learn from clean patient-reported samples. Applied to the task of recovering carbohydrate values for blood glucose management in diabetes, our approach reduces noise (MSE) in patient-reported carbohydrates from 72 g$^2$ (95% CI: 54-93) to 18 g$^2$ (13-25), outperforming the best baseline (33 g$^2$ (27-43)). Notably, our approach achieves strong performance with access only to patient-reported target values, making it applicable to many settings where ground truth data may be unavailable.

Cite this Paper


BibTeX
@InProceedings{pmlr-v209-rubin-falcone23a,
  title     = {Denoising Autoencoders for Learning from Noisy Patient-Reported Data},
  author    = {Rubin-Falcone, Harry and Lee, Joyce M. and Wiens, Jenna},
  booktitle = {Proceedings of the Conference on Health, Inference, and Learning},
  pages     = {393--409},
  year      = {2023},
  editor    = {Mortazavi, Bobak J. and Sarker, Tasmie and Beam, Andrew and Ho, Joyce C.},
  volume    = {209},
  series    = {Proceedings of Machine Learning Research},
  month     = {22 Jun--24 Jun},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v209/rubin-falcone23a/rubin-falcone23a.pdf},
  url       = {https://proceedings.mlr.press/v209/rubin-falcone23a.html},
  abstract  = {Healthcare datasets often include patient-reported values, such as mood, symptoms, and meals, which can be subject to varying levels of human error. Improving the accuracy of patient-reported data could help in several downstream tasks, such as remote patient monitoring. In this study, we propose a novel denoising autoencoder (DAE) approach to denoise patient-reported data, drawing inspiration from recent work in computer vision. Our approach is based on the observation that noisy patient-reported data are often collected alongside higher fidelity data collected from wearable sensors. We leverage these auxiliary data to improve the accuracy of the patient-reported data. Our approach combines key ideas from DAEs with co-teaching to iteratively filter and learn from clean patient-reported samples. Applied to the task of recovering carbohydrate values for blood glucose management in diabetes, our approach reduces noise (MSE) in patient-reported carbohydrates from 72$g^2$ (95% CI: 54-93) to 18$g^2$ (13-25), outperforming the best baseline (33$g^2$ (27-43)). Notably, our approach achieves strong performance with only access to patient-reported target values, making it applicable to many settings where ground truth data may be unavailable.}
}
Endnote
%0 Conference Paper
%T Denoising Autoencoders for Learning from Noisy Patient-Reported Data
%A Harry Rubin-Falcone
%A Joyce M. Lee
%A Jenna Wiens
%B Proceedings of the Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Bobak J. Mortazavi
%E Tasmie Sarker
%E Andrew Beam
%E Joyce C. Ho
%F pmlr-v209-rubin-falcone23a
%I PMLR
%P 393--409
%U https://proceedings.mlr.press/v209/rubin-falcone23a.html
%V 209
%X Healthcare datasets often include patient-reported values, such as mood, symptoms, and meals, which can be subject to varying levels of human error. Improving the accuracy of patient-reported data could help in several downstream tasks, such as remote patient monitoring. In this study, we propose a novel denoising autoencoder (DAE) approach to denoise patient-reported data, drawing inspiration from recent work in computer vision. Our approach is based on the observation that noisy patient-reported data are often collected alongside higher fidelity data collected from wearable sensors. We leverage these auxiliary data to improve the accuracy of the patient-reported data. Our approach combines key ideas from DAEs with co-teaching to iteratively filter and learn from clean patient-reported samples. Applied to the task of recovering carbohydrate values for blood glucose management in diabetes, our approach reduces noise (MSE) in patient-reported carbohydrates from 72$g^2$ (95% CI: 54-93) to 18$g^2$ (13-25), outperforming the best baseline (33$g^2$ (27-43)). Notably, our approach achieves strong performance with only access to patient-reported target values, making it applicable to many settings where ground truth data may be unavailable.
APA
Rubin-Falcone, H., Lee, J. M., & Wiens, J. (2023). Denoising Autoencoders for Learning from Noisy Patient-Reported Data. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 209:393-409. Available from https://proceedings.mlr.press/v209/rubin-falcone23a.html.
