The Sum-Product Theorem: A Foundation for Learning Tractable Models

Abram Friesen; Pedro Domingos

The Sum-Product Theorem: A Foundation for Learning Tractable Models

Abram Friesen, Pedro Domingos

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1909-1918, 2016.

Abstract

Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-friesen16,
  title = 	 {The Sum-Product Theorem: A Foundation for Learning Tractable Models},
  author = 	 {Friesen, Abram and Domingos, Pedro},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1909--1918},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/friesen16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/friesen16.html},
  abstract = 	 {Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.}
}

Endnote

%0 Conference Paper
%T The Sum-Product Theorem: A Foundation for Learning Tractable Models
%A Abram Friesen
%A Pedro Domingos
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-friesen16
%I PMLR
%P 1909--1918
%U https://proceedings.mlr.press/v48/friesen16.html
%V 48
%X Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.

RIS


TY  - CPAPER
TI  - The Sum-Product Theorem: A Foundation for Learning Tractable Models
AU  - Abram Friesen
AU  - Pedro Domingos
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-friesen16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1909
EP  - 1918
L1  - http://proceedings.mlr.press/v48/friesen16.pdf
UR  - https://proceedings.mlr.press/v48/friesen16.html
AB  - Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.
ER  -

APA


Friesen, A. & Domingos, P.. (2016). The Sum-Product Theorem: A Foundation for Learning Tractable Models. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1909-1918 Available from https://proceedings.mlr.press/v48/friesen16.html.

The Sum-Product Theorem: A Foundation for Learning Tractable Models

Abstract

Cite this Paper

Related Material