Kernel-based Approach for Learning Causal Graphs from Mixed Data
Proceedings of the 10th International Conference on Probabilistic Graphical Models, PMLR 138:221-232, 2020.
Abstract
A causal graph can be generated from a dataset using a particular
causal algorithm, for instance, the PC algorithm or Fast Causal
Inference (FCI). This paper provides two contributions in learning
causal graphs: an easy way to handle mixed data so that it can
be used to learn causal graphs using the PC algorithm/FCI and a
method to evaluate the learned graph structure when the true graph
is unknown. This research proposes using kernel functions and Kernel
Alignment to handle mixed data. The two main steps of this approach are
computing a kernel matrix for each variable and calculating a
pseudo-correlation matrix using Kernel Alignment. The Kernel
Alignment matrix is used as a substitute for the correlation matrix
that is the main component used in computing a partial correlation
for the conditional independence test for Gaussian data in the PC
algorithm and FCI. The advantage of this idea is that it is possible to
handle more data types whenever a suitable kernel function exists to
compute a kernel matrix for an observed variable. The proposed
method is successfully applied to learn a causal graph from mixed
data containing categorical, binary, ordinal, and continuous
variables. We also introduce the Modal Value of Edges Existence
(MVEE) method, a new method to evaluate the structure of learned
graphs represented by a Partial Ancestral Graph (PAG) when the true
graph is unknown. MVEE produces an agreement graph as a proxy to the
true graph to evaluate the structure of the learned graph. MVEE is
successfully used to choose the best learned graph when the true
graph is unknown.
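
The two steps described above (one kernel matrix per variable, then a pseudo-correlation matrix from Kernel Alignment) can be illustrated with a minimal sketch. It is not the authors' implementation. The kernel choices (an RBF kernel for continuous data, an equality kernel for categorical data), the centering of each kernel matrix before alignment, the `gamma` value, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def rbf_kernel(x, gamma=1.0):
    """RBF kernel matrix for one continuous variable (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return np.exp(-gamma * (x - x.T) ** 2)

def categorical_kernel(x):
    """Equality kernel for one categorical variable: 1 if same category, else 0."""
    x = np.asarray(x).reshape(-1, 1)
    return (x == x.T).astype(float)

def center(K):
    """Center a kernel matrix in feature space (an assumption of this sketch)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Kernel Alignment: Frobenius inner product over the product of Frobenius norms."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def alignment_matrix(kernels):
    """Pseudo-correlation matrix: pairwise alignment of the centered kernel matrices."""
    centered = [center(K) for K in kernels]
    p = len(centered)
    A = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            A[i, j] = A[j, i] = alignment(centered[i], centered[j])
    return A

def partial_corr(A, i, j, cond):
    """Partial correlation of i and j given cond, using A in place of a correlation matrix."""
    idx = [i, j] + list(cond)
    P = np.linalg.pinv(A[np.ix_(idx, idx)])
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

# Toy mixed data: two dependent continuous variables and one categorical variable.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
c = rng.integers(0, 3, size=200)

A = alignment_matrix([rbf_kernel(x), rbf_kernel(y), categorical_kernel(c)])
r_xy_given_c = partial_corr(A, 0, 1, [2])
```

The matrix `A` plays the role of the correlation matrix in the Gaussian conditional independence test, so a value such as `r_xy_given_c` would be passed to that test inside the PC algorithm or FCI. Supporting another data type only requires adding a kernel function that returns an n-by-n matrix for that variable.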