Learning Staged Trees from Incomplete Data

Jack Storror Carter, Manuele Leonelli, Eva Riccomagno, Gherardo Varando
Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246:231-252, 2024.

Abstract

Staged trees are probabilistic graphical models capable of representing any class of non-symmetric independence via a coloring of their vertices. Several structural learning routines have been defined and implemented to learn staged trees from data, under the frequentist or Bayesian paradigm. They assume a data set has been observed fully and, in practice, observations with missing entries are either dropped or imputed before learning the model. Here, we introduce the first algorithms for staged trees that handle missingness within the learning of the model. To this end, we characterize the likelihood of staged tree models in the presence of missing data and discuss pseudo-likelihoods that approximate it. A structural expectation-maximization algorithm estimating the model directly from the full likelihood is also implemented and evaluated. A computational experiment showcases the performance of the novel learning algorithms, demonstrating that it is feasible to account for different missingness patterns when learning staged trees.

Cite this Paper


BibTeX
@InProceedings{pmlr-v246-carter24a,
  title = {Learning Staged Trees from Incomplete Data},
  author = {Carter, Jack Storror and Leonelli, Manuele and Riccomagno, Eva and Varando, Gherardo},
  booktitle = {Proceedings of The 12th International Conference on Probabilistic Graphical Models},
  pages = {231--252},
  year = {2024},
  editor = {Kwisthout, Johan and Renooij, Silja},
  volume = {246},
  series = {Proceedings of Machine Learning Research},
  month = {11--13 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v246/main/assets/carter24a/carter24a.pdf},
  url = {https://proceedings.mlr.press/v246/carter24a.html},
  abstract = {Staged trees are probabilistic graphical models capable of representing any class of non-symmetric independence via a coloring of their vertices. Several structural learning routines have been defined and implemented to learn staged trees from data, under the frequentist or Bayesian paradigm. They assume a data set has been observed fully and, in practice, observations with missing entries are either dropped or imputed before learning the model. Here, we introduce the first algorithms for staged trees that handle missingness within the learning of the model. To this end, we characterize the likelihood of staged tree models in the presence of missing data and discuss pseudo-likelihoods that approximate it. A structural expectation-maximization algorithm estimating the model directly from the full likelihood is also implemented and evaluated. A computational experiment showcases the performance of the novel learning algorithms, demonstrating that it is feasible to account for different missingness patterns when learning staged trees.}
}
Endnote
%0 Conference Paper
%T Learning Staged Trees from Incomplete Data
%A Jack Storror Carter
%A Manuele Leonelli
%A Eva Riccomagno
%A Gherardo Varando
%B Proceedings of The 12th International Conference on Probabilistic Graphical Models
%C Proceedings of Machine Learning Research
%D 2024
%E Johan Kwisthout
%E Silja Renooij
%F pmlr-v246-carter24a
%I PMLR
%P 231--252
%U https://proceedings.mlr.press/v246/carter24a.html
%V 246
%X Staged trees are probabilistic graphical models capable of representing any class of non-symmetric independence via a coloring of their vertices. Several structural learning routines have been defined and implemented to learn staged trees from data, under the frequentist or Bayesian paradigm. They assume a data set has been observed fully and, in practice, observations with missing entries are either dropped or imputed before learning the model. Here, we introduce the first algorithms for staged trees that handle missingness within the learning of the model. To this end, we characterize the likelihood of staged tree models in the presence of missing data and discuss pseudo-likelihoods that approximate it. A structural expectation-maximization algorithm estimating the model directly from the full likelihood is also implemented and evaluated. A computational experiment showcases the performance of the novel learning algorithms, demonstrating that it is feasible to account for different missingness patterns when learning staged trees.
APA
Carter, J.S., Leonelli, M., Riccomagno, E. & Varando, G. (2024). Learning Staged Trees from Incomplete Data. Proceedings of The 12th International Conference on Probabilistic Graphical Models, in Proceedings of Machine Learning Research 246:231-252. Available from https://proceedings.mlr.press/v246/carter24a.html.
