Residual Sum-Product Networks
Proceedings of the 10th International Conference on Probabilistic Graphical Models, PMLR 138:545-556, 2020.
Abstract
Tractable yet expressive density estimators are a key
building block of probabilistic machine learning. While sum-product
networks (SPNs) offer attractive inference capabilities, obtaining
structures large enough to fit complex, high-dimensional data has proven
challenging. In this paper, we present a residual learning approach to
ease the learning of SPNs that are deeper and wider than those used
previously. The main trick is to ensemble SPNs by explicitly
reformulating sum nodes as residual functions. This adds references to
substructures across the SPNs at different depths, which in turn helps
to improve training. Our experiments demonstrate that the resulting
residual SPNs (ResSPNs) are easy to optimize, gain performance from
considerably increased depth and width, and are competitive with
state-of-the-art SPN structure learning approaches. To combat overfitting, we
introduce an iterative pruning technique that compacts models and yields
better generalization.
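
The abstract's central idea, reformulating sum nodes as residual functions that reference substructures of shallower SPNs, can be illustrated with a small sketch. The Python code below is not the authors' implementation; it shows one plausible reading, in which a sum node mixes its own children with a weighted shortcut to a node from a shallower SPN while keeping the mixture normalized. All class names, parameters, and the Gaussian leaves are illustrative assumptions.

import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

class GaussianLeaf:
    """Univariate Gaussian leaf over the variable at index `scope`."""
    def __init__(self, scope, mean, std):
        self.scope, self.mean, self.std = scope, mean, std

    def log_density(self, x):
        # x has shape (batch, num_variables)
        return norm.logpdf(x[:, self.scope], self.mean, self.std)

class ResidualSumNode:
    """Sum node that mixes its own children with a shortcut to a
    substructure of a shallower SPN (illustrative sketch only)."""
    def __init__(self, children, shortcut, weights):
        assert len(weights) == len(children) + 1  # last weight: shortcut
        self.nodes = list(children) + [shortcut]
        w = np.asarray(weights, dtype=float)
        self.log_w = np.log(w / w.sum())  # normalize -> valid mixture

    def log_density(self, x):
        # log sum_i w_i p_i(x), computed stably in log-space
        comps = np.stack([n.log_density(x) for n in self.nodes])
        return logsumexp(comps + self.log_w[:, None], axis=0)

# A deeper node keeps a weighted shortcut to a shallower substructure,
# loosely mirroring the identity path of a residual block.
shallow = GaussianLeaf(scope=0, mean=0.0, std=1.0)
deep = ResidualSumNode(
    children=[GaussianLeaf(0, -1.0, 0.5), GaussianLeaf(0, 1.0, 0.5)],
    shortcut=shallow,
    weights=[0.4, 0.4, 0.2],
)
x = np.array([[0.0], [1.5]])
print(deep.log_density(x))  # log-densities of the residual mixture

Because the weights are renormalized, the shortcut enters as just another mixture component, so the node remains a valid distribution and the SPN's tractable inference is preserved.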