Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time

Toon Vanderschueren, Alicia Curth, Wouter Verbeke, Mihaela Van Der Schaar
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:34855-34874, 2023.

Abstract

Machine learning (ML) holds great potential for accurately forecasting treatment outcomes over time, which could ultimately enable the adoption of more individualized treatment strategies in many practical applications. However, a significant challenge that has been largely overlooked by the ML literature on this topic is the presence of informative sampling in observational data. When instances are observed irregularly over time, sampling times are typically not random, but rather informative, depending on the instance’s characteristics, past outcomes, and administered treatments. In this work, we formalize informative sampling as a covariate shift problem and show that it can prohibit accurate estimation of treatment outcomes if not properly accounted for. To overcome this challenge, we present a general framework for learning treatment outcomes in the presence of informative sampling using inverse intensity-weighting, and propose a novel method, TESAR-CDE, that instantiates this framework using Neural CDEs. Using a simulation environment based on a clinical use case, we demonstrate the effectiveness of our approach in learning under informative sampling.
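
As a rough illustration of the inverse intensity-weighting idea named in the abstract, the sketch below computes a squared-error loss in which each observation is weighted by the reciprocal of its sampling intensity, so that outcomes from frequently sampled regions count less. This is not the TESAR-CDE method itself: the function name, the `eps` floor, the weight normalization, and the assumption that intensities are already known (rather than estimated, for example by a Neural CDE) are all hypothetical choices made for illustration.

```python
def inverse_intensity_weighted_mse(y_true, y_pred, intensity, eps=1e-8):
    """Squared-error loss with each observation weighted by 1 / intensity.

    Assumption: `intensity` holds a positive sampling intensity for each
    observation, supplied by the caller. How it is estimated is out of scope.
    """
    # Floor the intensities at eps to avoid division by zero.
    weights = [1.0 / max(lam, eps) for lam in intensity]
    total = sum(weights)
    # Normalize by the total weight so the loss stays on the scale of an MSE.
    weighted_errors = (
        w * (yt - yp) ** 2 for w, yt, yp in zip(weights, y_true, y_pred)
    )
    return sum(weighted_errors) / total


# Hypothetical data: the third outcome is sampled half as often as the others,
# so its error receives twice the weight.
loss = inverse_intensity_weighted_mse([1, 2, 3], [1, 2, 5], [1.0, 1.0, 0.5])
```

With equal intensities the function reduces to the ordinary mean squared error, which is a quick sanity check on the weighting.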

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-vanderschueren23a,
  title = {Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time},
  author = {Vanderschueren, Toon and Curth, Alicia and Verbeke, Wouter and Van Der Schaar, Mihaela},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {34855--34874},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/vanderschueren23a/vanderschueren23a.pdf},
  url = {https://proceedings.mlr.press/v202/vanderschueren23a.html},
  abstract = {Machine learning (ML) holds great potential for accurately forecasting treatment outcomes over time, which could ultimately enable the adoption of more individualized treatment strategies in many practical applications. However, a significant challenge that has been largely overlooked by the ML literature on this topic is the presence of informative sampling in observational data. When instances are observed irregularly over time, sampling times are typically not random, but rather informative, depending on the instance’s characteristics, past outcomes, and administered treatments. In this work, we formalize informative sampling as a covariate shift problem and show that it can prohibit accurate estimation of treatment outcomes if not properly accounted for. To overcome this challenge, we present a general framework for learning treatment outcomes in the presence of informative sampling using inverse intensity-weighting, and propose a novel method, TESAR-CDE, that instantiates this framework using Neural CDEs. Using a simulation environment based on a clinical use case, we demonstrate the effectiveness of our approach in learning under informative sampling.}
}
Endnote
%0 Conference Paper
%T Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time
%A Toon Vanderschueren
%A Alicia Curth
%A Wouter Verbeke
%A Mihaela Van Der Schaar
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-vanderschueren23a
%I PMLR
%P 34855--34874
%U https://proceedings.mlr.press/v202/vanderschueren23a.html
%V 202
%X Machine learning (ML) holds great potential for accurately forecasting treatment outcomes over time, which could ultimately enable the adoption of more individualized treatment strategies in many practical applications. However, a significant challenge that has been largely overlooked by the ML literature on this topic is the presence of informative sampling in observational data. When instances are observed irregularly over time, sampling times are typically not random, but rather informative, depending on the instance’s characteristics, past outcomes, and administered treatments. In this work, we formalize informative sampling as a covariate shift problem and show that it can prohibit accurate estimation of treatment outcomes if not properly accounted for. To overcome this challenge, we present a general framework for learning treatment outcomes in the presence of informative sampling using inverse intensity-weighting, and propose a novel method, TESAR-CDE, that instantiates this framework using Neural CDEs. Using a simulation environment based on a clinical use case, we demonstrate the effectiveness of our approach in learning under informative sampling.
APA
Vanderschueren, T., Curth, A., Verbeke, W., & Van Der Schaar, M. (2023). Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:34855-34874. Available from https://proceedings.mlr.press/v202/vanderschueren23a.html.
