Beta Diffusion Trees

Creighton Heaukulani; David Knowles; Zoubin Ghahramani

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1809-1817, 2014.

Abstract

We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. The generative process for the tree is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet and Pitman-Yor diffusion trees (Neal, 2003b; Knowles & Ghahramani, 2011), both of which define tree structures over clusters of the particles. With the beta diffusion tree, however, multiple copies of a particle may exist and diffuse to multiple locations in the continuous space, resulting in (a random number of) possibly overlapping clusters of the objects. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression arrays, international development statistics, and intranational socioeconomic measurements.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-heaukulani14,
  title = 	 {Beta Diffusion Trees},
  author = 	 {Heaukulani, Creighton and Knowles, David and Ghahramani, Zoubin},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1809--1817},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/heaukulani14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/heaukulani14.html},
  abstract = 	 {We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. The generative process for the tree is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet and Pitman-Yor diffusion trees (Neal, 2003b; Knowles & Ghahramani, 2011), both of which define tree structures over clusters of the particles. With the beta diffusion tree, however, multiple copies of a particle may exist and diffuse to multiple locations in the continuous space, resulting in (a random number of) possibly overlapping clusters of the objects. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression arrays, international development statistics, and intranational socioeconomic measurements.}
}

Endnote

%0 Conference Paper
%T Beta Diffusion Trees
%A Creighton Heaukulani
%A David Knowles
%A Zoubin Ghahramani
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-heaukulani14
%I PMLR
%P 1809--1817
%U https://proceedings.mlr.press/v32/heaukulani14.html
%V 32
%N 2
%X We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. The generative process for the tree is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet and Pitman-Yor diffusion trees (Neal, 2003b; Knowles & Ghahramani, 2011), both of which define tree structures over clusters of the particles. With the beta diffusion tree, however, multiple copies of a particle may exist and diffuse to multiple locations in the continuous space, resulting in (a random number of) possibly overlapping clusters of the objects. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression arrays, international development statistics, and intranational socioeconomic measurements.

RIS


TY  - CPAPER
TI  - Beta Diffusion Trees
AU  - Creighton Heaukulani
AU  - David Knowles
AU  - Zoubin Ghahramani
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-heaukulani14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1809
EP  - 1817
L1  - http://proceedings.mlr.press/v32/heaukulani14.pdf
UR  - https://proceedings.mlr.press/v32/heaukulani14.html
AB  - We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. The generative process for the tree is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet and Pitman-Yor diffusion trees (Neal, 2003b; Knowles & Ghahramani, 2011), both of which define tree structures over clusters of the particles. With the beta diffusion tree, however, multiple copies of a particle may exist and diffuse to multiple locations in the continuous space, resulting in (a random number of) possibly overlapping clusters of the objects. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression arrays, international development statistics, and intranational socioeconomic measurements.
ER  -

APA


Heaukulani, C., Knowles, D., & Ghahramani, Z. (2014). Beta Diffusion Trees. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1809-1817. Available from https://proceedings.mlr.press/v32/heaukulani14.html.
