A simple criterion for controlling selection bias

Eunice Yuh-Jie Chen, Judea Pearl
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:170-177, 2013.

Abstract

Controlling selection bias, a statistical error caused by preferential sampling of data, is a fundamental problem in machine learning and statistical inference. This paper presents a simple criterion for controlling selection bias in the odds ratio, a widely used measure for association between variables, that connects the nature of selection bias with the graph modeling the selection mechanism. If the graph contains certain paths, we show that the odds ratio cannot be expressed using data with selection bias. Otherwise, we show that a d-separability test can determine whether the odds ratio can be recovered, and when the answer is affirmative, output an unbiased estimand of the odds ratio. The criterion can be tested in linear time and enhances the power of the estimand.
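
As background for why the odds ratio is a natural target here, the sketch below illustrates a well-known property rather than the paper's criterion itself: when selection depends on only one of the two variables (as in case-control sampling), the odds ratio among selected units equals the population odds ratio, whereas selection that depends jointly on both variables can distort it. The 2x2 distribution, the selection probabilities, and the helper names `odds_ratio` and `select` are all hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F


def odds_ratio(p):
    """Odds ratio of a 2x2 table p[x][y] (counts or unnormalized probabilities)."""
    return (p[1][1] * p[0][0]) / (p[1][0] * p[0][1])


def select(p, s):
    """Unnormalized distribution of (X, Y) among selected units,
    where s[x][y] stands for P(S=1 | X=x, Y=y)."""
    return [[p[x][y] * s[x][y] for y in (0, 1)] for x in (0, 1)]


# Hypothetical population joint distribution P(X=x, Y=y), indexed [x][y].
population = [[F(4, 10), F(1, 10)],
              [F(2, 10), F(3, 10)]]

# Hypothetical selection mechanism that depends on Y only
# (units with Y=1 are nine times as likely to be sampled).
sel_y_only = [[F(1, 10), F(9, 10)],
              [F(1, 10), F(9, 10)]]

# Hypothetical selection mechanism that depends on X and Y jointly
# (only units with X=1 and Y=1 are oversampled).
sel_both = [[F(1, 10), F(1, 10)],
            [F(1, 10), F(9, 10)]]

or_pop = odds_ratio(population)
or_y = odds_ratio(select(population, sel_y_only))
or_both = odds_ratio(select(population, sel_both))

print("population odds ratio:      ", or_pop)
print("selection on Y only:        ", or_y)
print("selection on X and Y jointly:", or_both)
```

In the first mechanism the selection probabilities cancel in the cross-product ratio, so the biased sample still yields the population value. In the second they do not cancel, which is the kind of case where a graphical test is needed to decide whether recovery is possible.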

Cite this Paper


BibTeX
@InProceedings{pmlr-v31-chen13b,
  title = {A simple criterion for controlling selection bias},
  author = {Eunice Yuh-Jie Chen and Judea Pearl},
  booktitle = {Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics},
  pages = {170--177},
  year = {2013},
  editor = {Carlos M. Carvalho and Pradeep Ravikumar},
  volume = {31},
  series = {Proceedings of Machine Learning Research},
  address = {Scottsdale, Arizona, USA},
  month = {29 Apr--01 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v31/chen13b.pdf},
  url = {http://proceedings.mlr.press/v31/chen13b.html},
  abstract = {Controlling selection bias, a statistical error caused by preferential sampling of data, is a fundamental problem in machine learning and statistical inference. This paper presents a simple criterion for controlling selection bias in the odds ratio, a widely used measure for association between variables, that connects the nature of selection bias with the graph modeling the selection mechanism. If the graph contains certain paths, we show that the odds ratio cannot be expressed using data with selection bias. Otherwise, we show that a d-separability test can determine whether the odds ratio can be recovered, and when the answer is affirmative, output an unbiased estimand of the odds ratio. The criterion can be tested in linear time and enhances the power of the estimand.}
}
APA
Chen, E.Y. & Pearl, J. (2013). A simple criterion for controlling selection bias. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 31:170-177. Available from http://proceedings.mlr.press/v31/chen13b.html.
