A simple criterion for controlling selection bias


Eunice Yuh-Jie Chen, Judea Pearl
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:170-177, 2013.


Controlling selection bias, a statistical error caused by preferential sampling of data, is a fundamental problem in machine learning and statistical inference. This paper presents a simple criterion for controlling selection bias in the odds ratio, a widely used measure of association between variables, that connects the nature of selection bias with the graph modeling the selection mechanism. If the graph contains certain paths, we show that the odds ratio cannot be expressed using data with selection bias. Otherwise, we show that a d-separability test can determine whether the odds ratio can be recovered and, when the answer is affirmative, output an unbiased estimand of the odds ratio. The criterion can be tested in linear time and enhances the power of the estimand.
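To make the quantity concrete, the following is a minimal Python sketch of how selection can leave the odds ratio intact or distort it. It is not the paper's graphical criterion or algorithm. It only illustrates the well-known special case in which selection depends on the outcome Y alone, so that the selection indicator is independent of X given Y. The counts, the selection probabilities, and the names `odds_ratio`, `population`, `select_on_y`, and `select_on_xy` are all hypothetical and chosen for illustration.

```python
from fractions import Fraction


def odds_ratio(table):
    """Odds ratio of a 2x2 table indexed as table[x][y] for binary x and y."""
    return Fraction(table[1][1] * table[0][0], table[1][0] * table[0][1])


# Hypothetical population counts, indexed as population[x][y].
population = [[400, 100], [200, 300]]

# Hypothetical selection probabilities that depend only on Y.
select_on_y = {0: Fraction(1, 10), 1: Fraction(1, 2)}

# Hypothetical selection probabilities that depend on X and Y jointly.
select_on_xy = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(1, 2),
    (1, 0): Fraction(1, 2),
    (1, 1): Fraction(1, 2),
}

# Expected counts in the selected sample under each mechanism.
sample_y = [[population[x][y] * select_on_y[y] for y in (0, 1)] for x in (0, 1)]
sample_xy = [[population[x][y] * select_on_xy[(x, y)] for y in (0, 1)] for x in (0, 1)]

or_population = odds_ratio(population)
or_sample_y = odds_ratio(sample_y)    # selection factors cancel in the ratio
or_sample_xy = odds_ratio(sample_xy)  # selection factors do not cancel

print(or_population, or_sample_y, or_sample_xy)
```

With these numbers the odds ratio in the sample selected on Y alone equals the population value, while the sample selected on X and Y jointly gives a different value. In the first case the per-outcome selection probabilities appear once in the numerator and once in the denominator and cancel.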
