Maximum-likelihood learning of cumulative distribution functions on graphs

Jim Huang, Nebojsa Jojic
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:342-349, 2010.

Abstract

For many applications, a probability model can be easily expressed as a cumulative distribution functions (CDF) as compared to the use of probability density or mass functions (PDF/PMFs). Cumulative distribution networks (CDNs) have recently been proposed as a class of graphical models for CDFs. One advantage of CDF models is the simplicity of representing multivariate heavy-tailed distributions. Examples of fields that can benefit from the use of graphical models for CDFs include climatology and epidemiology, where data may follow extreme value statistics and exhibit spatial correlations so that dependencies between model variables must be accounted for. The problem of learning from data in such settings may nevertheless consist of optimizing the log-likelihood function with respect to model parameters where we are required to optimize a log-PDF/PMF and not a log-CDF. We present a message-passing algorithm called the gradient-derivative-product (GDP) algorithm that allows us to learn the model in terms of the log-likelihood function whereby messages correspond to local gradients of the likelihood with respect to model parameters. We will demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data and we will show that CDNs provide a natural choice of parameterizations for the heavy-tailed multivariate distributions that arise in these problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v9-huang10b, title = {Maximum-likelihood learning of cumulative distribution functions on graphs}, author = {Huang, Jim and Jojic, Nebojsa}, booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics}, pages = {342--349}, year = {2010}, editor = {Teh, Yee Whye and Titterington, Mike}, volume = {9}, series = {Proceedings of Machine Learning Research}, address = {Chia Laguna Resort, Sardinia, Italy}, month = {13--15 May}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v9/huang10b/huang10b.pdf}, url = {https://proceedings.mlr.press/v9/huang10b.html}, abstract = {For many applications, a probability model can be easily expressed as a cumulative distribution functions (CDF) as compared to the use of probability density or mass functions (PDF/PMFs). Cumulative distribution networks (CDNs) have recently been proposed as a class of graphical models for CDFs. One advantage of CDF models is the simplicity of representing multivariate heavy-tailed distributions. Examples of fields that can benefit from the use of graphical models for CDFs include climatology and epidemiology, where data may follow extreme value statistics and exhibit spatial correlations so that dependencies between model variables must be accounted for. The problem of learning from data in such settings may nevertheless consist of optimizing the log-likelihood function with respect to model parameters where we are required to optimize a log-PDF/PMF and not a log-CDF. We present a message-passing algorithm called the gradient-derivative-product (GDP) algorithm that allows us to learn the model in terms of the log-likelihood function whereby messages correspond to local gradients of the likelihood with respect to model parameters. We will demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data and we will show that CDNs provide a natural choice of parameterizations for the heavy-tailed multivariate distributions that arise in these problems.} }
Endnote
%0 Conference Paper %T Maximum-likelihood learning of cumulative distribution functions on graphs %A Jim Huang %A Nebojsa Jojic %B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2010 %E Yee Whye Teh %E Mike Titterington %F pmlr-v9-huang10b %I PMLR %P 342--349 %U https://proceedings.mlr.press/v9/huang10b.html %V 9 %X For many applications, a probability model can be easily expressed as a cumulative distribution functions (CDF) as compared to the use of probability density or mass functions (PDF/PMFs). Cumulative distribution networks (CDNs) have recently been proposed as a class of graphical models for CDFs. One advantage of CDF models is the simplicity of representing multivariate heavy-tailed distributions. Examples of fields that can benefit from the use of graphical models for CDFs include climatology and epidemiology, where data may follow extreme value statistics and exhibit spatial correlations so that dependencies between model variables must be accounted for. The problem of learning from data in such settings may nevertheless consist of optimizing the log-likelihood function with respect to model parameters where we are required to optimize a log-PDF/PMF and not a log-CDF. We present a message-passing algorithm called the gradient-derivative-product (GDP) algorithm that allows us to learn the model in terms of the log-likelihood function whereby messages correspond to local gradients of the likelihood with respect to model parameters. We will demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data and we will show that CDNs provide a natural choice of parameterizations for the heavy-tailed multivariate distributions that arise in these problems.
RIS
TY - CPAPER TI - Maximum-likelihood learning of cumulative distribution functions on graphs AU - Jim Huang AU - Nebojsa Jojic BT - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics DA - 2010/03/31 ED - Yee Whye Teh ED - Mike Titterington ID - pmlr-v9-huang10b PB - PMLR DP - Proceedings of Machine Learning Research VL - 9 SP - 342 EP - 349 L1 - http://proceedings.mlr.press/v9/huang10b/huang10b.pdf UR - https://proceedings.mlr.press/v9/huang10b.html AB - For many applications, a probability model can be easily expressed as a cumulative distribution functions (CDF) as compared to the use of probability density or mass functions (PDF/PMFs). Cumulative distribution networks (CDNs) have recently been proposed as a class of graphical models for CDFs. One advantage of CDF models is the simplicity of representing multivariate heavy-tailed distributions. Examples of fields that can benefit from the use of graphical models for CDFs include climatology and epidemiology, where data may follow extreme value statistics and exhibit spatial correlations so that dependencies between model variables must be accounted for. The problem of learning from data in such settings may nevertheless consist of optimizing the log-likelihood function with respect to model parameters where we are required to optimize a log-PDF/PMF and not a log-CDF. We present a message-passing algorithm called the gradient-derivative-product (GDP) algorithm that allows us to learn the model in terms of the log-likelihood function whereby messages correspond to local gradients of the likelihood with respect to model parameters. We will demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data and we will show that CDNs provide a natural choice of parameterizations for the heavy-tailed multivariate distributions that arise in these problems. ER -
APA
Huang, J. & Jojic, N.. (2010). Maximum-likelihood learning of cumulative distribution functions on graphs. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:342-349 Available from https://proceedings.mlr.press/v9/huang10b.html.

Related Material