Conditionally independent data generation

Kartik Ahuja, Prasanna Sattigeri, Karthikeyan Shanmugam, Dennis Wei, Karthikeyan Natesan Ramamurthy, Murat Kocaoglu
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:2050-2060, 2021.

Abstract

Conditional independence (CI) is a fundamental concept with wide applications in machine learning and causal inference. Although the problems of testing CI and estimating divergences have been extensively studied, the complementary problem of generating data that satisfies CI has received much less attention. A special case of the generation problem is to produce conditionally independent predictions. Given samples from an input data distribution, we formulate the problem of generating samples from a distribution that is close to the input distribution and satisfies CI. We establish a characterization of CI in terms of a general divergence identity. Based on one version of this identity, an architecture is proposed that leverages the capabilities of generative adversarial networks (GANs) to enforce CI in an end-to-end differentiable manner. As one illustration of the problem formulation and architecture, we consider applications to notions of fairness that can be written as CIs, specifically equalized odds and conditional statistical parity. We demonstrate conditionally independent prediction that trades off adherence to fairness criteria against classification accuracy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v161-ahuja21a, title = {Conditionally independent data generation}, author = {Ahuja, Kartik and Sattigeri, Prasanna and Shanmugam, Karthikeyan and Wei, Dennis and Natesan Ramamurthy, Karthikeyan and Kocaoglu, Murat}, booktitle = {Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence}, pages = {2050--2060}, year = {2021}, editor = {de Campos, Cassio and Maathuis, Marloes H.}, volume = {161}, series = {Proceedings of Machine Learning Research}, month = {27--30 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v161/ahuja21a/ahuja21a.pdf}, url = {https://proceedings.mlr.press/v161/ahuja21a.html}, abstract = {Conditional independence (CI) is a fundamental concept with wide applications in machine learning and causal inference. Although the problems of testing CI and estimating divergences have been extensively studied, the complementary problem of generating data that satisfies CI has received much less attention. A special case of the generation problem is to produce conditionally independent predictions. Given samples from an input data distribution, we formulate the problem of generating samples from a distribution that is close to the input distribution and satisfies CI. We establish a characterization of CI in terms of a general divergence identity. Based on one version of this identity, an architecture is proposed that leverages the capabilities of generative adversarial networks (GANs) to enforce CI in an end-to-end differentiable manner. As one illustration of the problem formulation and architecture, we consider applications to notions of fairness that can be written as CIs, specifically equalized odds and conditional statistical parity. We demonstrate conditionally independent prediction that trades off adherence to fairness criteria against classification accuracy.} }
Endnote
%0 Conference Paper %T Conditionally independent data generation %A Kartik Ahuja %A Prasanna Sattigeri %A Karthikeyan Shanmugam %A Dennis Wei %A Karthikeyan Natesan Ramamurthy %A Murat Kocaoglu %B Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2021 %E Cassio de Campos %E Marloes H. Maathuis %F pmlr-v161-ahuja21a %I PMLR %P 2050--2060 %U https://proceedings.mlr.press/v161/ahuja21a.html %V 161 %X Conditional independence (CI) is a fundamental concept with wide applications in machine learning and causal inference. Although the problems of testing CI and estimating divergences have been extensively studied, the complementary problem of generating data that satisfies CI has received much less attention. A special case of the generation problem is to produce conditionally independent predictions. Given samples from an input data distribution, we formulate the problem of generating samples from a distribution that is close to the input distribution and satisfies CI. We establish a characterization of CI in terms of a general divergence identity. Based on one version of this identity, an architecture is proposed that leverages the capabilities of generative adversarial networks (GANs) to enforce CI in an end-to-end differentiable manner. As one illustration of the problem formulation and architecture, we consider applications to notions of fairness that can be written as CIs, specifically equalized odds and conditional statistical parity. We demonstrate conditionally independent prediction that trades off adherence to fairness criteria against classification accuracy.
APA
Ahuja, K., Sattigeri, P., Shanmugam, K., Wei, D., Natesan Ramamurthy, K. & Kocaoglu, M.. (2021). Conditionally independent data generation. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 161:2050-2060 Available from https://proceedings.mlr.press/v161/ahuja21a.html.

Related Material