Conditionally independent data generation
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:2050-2060, 2021.
Conditional independence (CI) is a fundamental concept with wide applications in machine learning and causal inference. Although the problems of testing CI and estimating divergences have been extensively studied, the complementary problem of generating data that satisfies CI has received much less attention. A special case of the generation problem is to produce conditionally independent predictions. Given samples from an input data distribution, we formulate the problem of generating samples from a distribution that is close to the input distribution and satisfies CI. We establish a characterization of CI in terms of a general divergence identity. Based on one version of this identity, an architecture is proposed that leverages the capabilities of generative adversarial networks (GANs) to enforce CI in an end-to-end differentiable manner. As one illustration of the problem formulation and architecture, we consider applications to notions of fairness that can be written as CIs, specifically equalized odds and conditional statistical parity. We demonstrate conditionally independent prediction that trades off adherence to fairness criteria against classification accuracy.
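To make the fairness connection concrete: equalized odds is commonly written as the CI statement that the prediction is independent of the protected attribute given the true label. The sketch below is only a plug-in diagnostic for that statement on discrete data, not the GAN-based architecture proposed in the paper. The function name `equalized_odds_gap`, the binary-prediction assumption, and the toy data are all hypothetical, introduced purely for illustration.

```python
from collections import defaultdict


def equalized_odds_gap(y_true, y_pred, group):
    """Largest across-group difference in P(y_pred = 1 | y_true = y, group = a),
    maximised over the true label y.

    Assumes binary predictions in {0, 1}. A gap of 0 means that, empirically,
    the prediction is independent of the group given the true label.
    """
    # (true label, group) -> [number of positive predictions, number of samples]
    counts = defaultdict(lambda: [0, 0])
    for y, p, a in zip(y_true, y_pred, group):
        counts[(y, a)][0] += int(p == 1)
        counts[(y, a)][1] += 1

    gap = 0.0
    for y in set(y_true):
        # Positive-prediction rate of each group among samples with true label y
        rates = [pos / tot for (label, _), (pos, tot) in counts.items() if label == y]
        gap = max(gap, max(rates) - min(rates))
    return gap


# Toy data (hypothetical): two groups with identical label distributions
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]
fair_pred = [0, 1, 1, 1, 0, 1, 1, 1]    # same conditional rates in both groups
unfair_pred = [0, 0, 1, 1, 1, 1, 1, 1]  # group 1 is always predicted positive

fair_gap = equalized_odds_gap(y_true, fair_pred, group)
unfair_gap = equalized_odds_gap(y_true, unfair_pred, group)
```

Under these assumptions, a predictor that satisfies equalized odds on the sample yields a gap of zero, while one whose predictions depend on the group beyond what the label explains yields a positive gap. Conditional statistical parity could be checked in the same way by conditioning on a chosen set of features rather than on the true label.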