Benchmarking Observational Studies with Experimental Data under Right-Censoring

Ilker Demirel, Edward De Brouwer, Zeshan M Hussain, Michael Oberst, Anthony A Philippakis, David Sontag
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4285-4293, 2024.

Abstract

Drawing causal inferences from observational studies (OS) requires unverifiable validity assumptions; however, one can falsify those assumptions by benchmarking the OS with experimental data from a randomized controlled trial (RCT). A major limitation of existing procedures is not accounting for censoring, despite the abundance of RCTs and OSes that report right-censored time-to-event outcomes. We consider two cases where censoring time (1) is independent of time-to-event and (2) depends on time-to-event the same way in OS and RCT. For the former, we adopt a censoring-doubly-robust signal for the conditional average treatment effect (CATE) to facilitate an equivalence test of CATEs in OS and RCT, which serves as a proxy for testing if the validity assumptions hold. For the latter, we show that the same test can still be used even though unbiased CATE estimation may not be possible. We verify the effectiveness of our censoring-aware tests via semi-synthetic experiments and analyze RCT and OS data from the Women’s Health Initiative study.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-demirel24a,
  title     = {Benchmarking Observational Studies with Experimental Data under Right-Censoring},
  author    = {Demirel, Ilker and De Brouwer, Edward and Hussain, Zeshan M and Oberst, Michael and Philippakis, Anthony A and Sontag, David},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4285--4293},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/demirel24a/demirel24a.pdf},
  url       = {https://proceedings.mlr.press/v238/demirel24a.html},
  abstract  = {Drawing causal inferences from observational studies (OS) requires unverifiable validity assumptions; however, one can falsify those assumptions by benchmarking the OS with experimental data from a randomized controlled trial (RCT). A major limitation of existing procedures is not accounting for censoring, despite the abundance of RCTs and OSes that report right-censored time-to-event outcomes. We consider two cases where censoring time (1) is independent of time-to-event and (2) depends on time-to-event the same way in OS and RCT. For the former, we adopt a censoring-doubly-robust signal for the conditional average treatment effect (CATE) to facilitate an equivalence test of CATEs in OS and RCT, which serves as a proxy for testing if the validity assumptions hold. For the latter, we show that the same test can still be used even though unbiased CATE estimation may not be possible. We verify the effectiveness of our censoring-aware tests via semi-synthetic experiments and analyze RCT and OS data from the Women’s Health Initiative study.}
}
Endnote
%0 Conference Paper
%T Benchmarking Observational Studies with Experimental Data under Right-Censoring
%A Ilker Demirel
%A Edward De Brouwer
%A Zeshan M Hussain
%A Michael Oberst
%A Anthony A Philippakis
%A David Sontag
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-demirel24a
%I PMLR
%P 4285--4293
%U https://proceedings.mlr.press/v238/demirel24a.html
%V 238
%X Drawing causal inferences from observational studies (OS) requires unverifiable validity assumptions; however, one can falsify those assumptions by benchmarking the OS with experimental data from a randomized controlled trial (RCT). A major limitation of existing procedures is not accounting for censoring, despite the abundance of RCTs and OSes that report right-censored time-to-event outcomes. We consider two cases where censoring time (1) is independent of time-to-event and (2) depends on time-to-event the same way in OS and RCT. For the former, we adopt a censoring-doubly-robust signal for the conditional average treatment effect (CATE) to facilitate an equivalence test of CATEs in OS and RCT, which serves as a proxy for testing if the validity assumptions hold. For the latter, we show that the same test can still be used even though unbiased CATE estimation may not be possible. We verify the effectiveness of our censoring-aware tests via semi-synthetic experiments and analyze RCT and OS data from the Women’s Health Initiative study.
APA
Demirel, I., De Brouwer, E., Hussain, Z. M., Oberst, M., Philippakis, A. A. & Sontag, D. (2024). Benchmarking Observational Studies with Experimental Data under Right-Censoring. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4285-4293. Available from https://proceedings.mlr.press/v238/demirel24a.html.
