Efficient Mining of Statistical Dependencies

Tim Oates, Matthew D. Schmill, Paul R. Cohen, Casey Durfee
Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, PMLR R2, 1999.

Abstract

The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD’s search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD’s search space based on information collected during the search process. Second, we discuss an implementation of MSDD that distributes its computations over multiple machines on a network.
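As a rough illustration of the rule-strength measure named in the abstract, the sketch below computes the standard G statistic (the log-likelihood ratio, G = 2 Σ O ln(O/E)) for a contingency table. The 2×2 framing, with rows for whether a rule's precursor pattern occurred and columns for whether its successor pattern occurred, is an assumption made for illustration. The function name `g_statistic` and the example counts are hypothetical and are not taken from the paper. MSDD's own counting scheme and its upper bound on G are not reproduced here.

```python
import math


def g_statistic(table):
    """Return G = 2 * sum(O * ln(O / E)) over the cells of a contingency table.

    `table` is a list of rows of observed counts. Expected counts E are
    derived from the row and column totals under the independence hypothesis.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # a cell with zero observations contributes nothing
            expected = row_totals[i] * col_totals[j] / n
            g += observed * math.log(observed / expected)
    return 2.0 * g


# Hypothetical counts: rows = precursor present / absent,
# columns = successor present / absent.
independent = [[10, 20], [20, 40]]  # observed counts match expected counts
dependent = [[30, 0], [0, 60]]      # successor occurs only after the precursor

g_independent = g_statistic(independent)
g_dependent = g_statistic(dependent)
```

Under these assumptions, a table whose cells match the counts expected under independence gives a G of zero, and G grows as the observed counts depart from independence, which is why larger values indicate stronger rules.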

Cite this Paper


BibTeX
@InProceedings{pmlr-vR2-oates99a,
  title = {Efficient Mining of Statistical Dependencies},
  author = {Oates, Tim and Schmill, Matthew D. and Cohen, Paul R. and Durfee, Casey},
  booktitle = {Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics},
  year = {1999},
  editor = {Heckerman, David and Whittaker, Joe},
  volume = {R2},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Jan},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/r2/oates99a/oates99a.pdf},
  url = {https://proceedings.mlr.press/r2/oates99a.html},
  abstract = {The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD’s search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD’s search space based on information collected during the search process. Second, we discuss an implementation of MSDD that distributes its computations over multiple machines on a network.},
  note = {Reissued by PMLR on 20 August 2020.}
}
Endnote
%0 Conference Paper
%T Efficient Mining of Statistical Dependencies
%A Tim Oates
%A Matthew D. Schmill
%A Paul R. Cohen
%A Casey Durfee
%B Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1999
%E David Heckerman
%E Joe Whittaker
%F pmlr-vR2-oates99a
%I PMLR
%U https://proceedings.mlr.press/r2/oates99a.html
%V R2
%X The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD’s search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD’s search space based on information collected during the search process. Second, we discuss an implementation of MSDD that distributes its computations over multiple machines on a network.
%Z Reissued by PMLR on 20 August 2020.
APA
Oates, T., Schmill, M. D., Cohen, P. R., & Durfee, C. (1999). Efficient Mining of Statistical Dependencies. Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R2. Available from https://proceedings.mlr.press/r2/oates99a.html. Reissued by PMLR on 20 August 2020.
