TIED: An Artificially Simulated Dataset with Multiple Markov Boundaries

Alexander Statnikov, Constantin F. Aliferis
Proceedings of Workshop on Causality: Objectives and Assessment at NIPS 2008, PMLR 6:249-256, 2010.

Abstract

We present an artificially simulated dataset (TIED) constructed so that there are many minimal sets of variables with maximal predictivity (i.e., Markov boundaries) and likewise many sets of variables that are statistically indistinguishable from the set of direct causes and direct effects of the response variable. This dataset was used in the Potluck Causality Challenge to determine all statistically indistinguishable sets of direct causes and direct effects and all Markov boundaries of the response variable and also to predict the response variable in the independent test data. We also present baseline results of application of several algorithms to this dataset.

Cite this Paper


BibTeX
@InProceedings{pmlr-v6-statnikov10a,
  title = {TIED: An Artificially Simulated Dataset with Multiple Markov Boundaries},
  author = {Alexander Statnikov and Constantin F. Aliferis},
  pages = {249--256},
  year = {2010},
  editor = {Isabelle Guyon and Dominik Janzing and Bernhard Schölkopf},
  volume = {6},
  series = {Proceedings of Machine Learning Research},
  address = {Whistler, Canada},
  month = {12 Dec},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v6/statnikov10a/statnikov10a.pdf},
  url = {http://proceedings.mlr.press/v6/statnikov10a.html},
  abstract = {We present an artificially simulated dataset (TIED) constructed so that there are many minimal sets of variables with maximal predictivity (i.e., Markov boundaries) and likewise many sets of variables that are statistically indistinguishable from the set of direct causes and direct effects of the response variable. This dataset was used in the Potluck Causality Challenge to determine all statistically indistinguishable sets of direct causes and direct effects and all Markov boundaries of the response variable and also to predict the response variable in the independent test data. We also present baseline results of application of several algorithms to this dataset.}
}
Endnote
%0 Conference Paper
%T TIED: An Artificially Simulated Dataset with Multiple Markov Boundaries
%A Alexander Statnikov
%A Constantin F. Aliferis
%B Proceedings of Workshop on Causality: Objectives and Assessment at NIPS 2008
%C Proceedings of Machine Learning Research
%D 2010
%E Isabelle Guyon
%E Dominik Janzing
%E Bernhard Schölkopf
%F pmlr-v6-statnikov10a
%I PMLR
%J Proceedings of Machine Learning Research
%P 249--256
%U http://proceedings.mlr.press
%V 6
%W PMLR
%X We present an artificially simulated dataset (TIED) constructed so that there are many minimal sets of variables with maximal predictivity (i.e., Markov boundaries) and likewise many sets of variables that are statistically indistinguishable from the set of direct causes and direct effects of the response variable. This dataset was used in the Potluck Causality Challenge to determine all statistically indistinguishable sets of direct causes and direct effects and all Markov boundaries of the response variable and also to predict the response variable in the independent test data. We also present baseline results of application of several algorithms to this dataset.
RIS
TY  - CPAPER
TI  - TIED: An Artificially Simulated Dataset with Multiple Markov Boundaries
AU  - Alexander Statnikov
AU  - Constantin F. Aliferis
BT  - Proceedings of Workshop on Causality: Objectives and Assessment at NIPS 2008
PY  - 2010/02/18
DA  - 2010/02/18
ED  - Isabelle Guyon
ED  - Dominik Janzing
ED  - Bernhard Schölkopf
ID  - pmlr-v6-statnikov10a
PB  - PMLR
SP  - 249
DP  - PMLR
EP  - 256
L1  - http://proceedings.mlr.press/v6/statnikov10a/statnikov10a.pdf
UR  - http://proceedings.mlr.press/v6/statnikov10a.html
AB  - We present an artificially simulated dataset (TIED) constructed so that there are many minimal sets of variables with maximal predictivity (i.e., Markov boundaries) and likewise many sets of variables that are statistically indistinguishable from the set of direct causes and direct effects of the response variable. This dataset was used in the Potluck Causality Challenge to determine all statistically indistinguishable sets of direct causes and direct effects and all Markov boundaries of the response variable and also to predict the response variable in the independent test data. We also present baseline results of application of several algorithms to this dataset.
ER  -
APA
Statnikov, A. & Aliferis, C.F. (2010). TIED: An Artificially Simulated Dataset with Multiple Markov Boundaries. Proceedings of Workshop on Causality: Objectives and Assessment at NIPS 2008, in PMLR 6:249-256.
