End-to-End Learning for Structured Prediction Energy Networks

David Belanger, Bishan Yang, Andrew McCallum
; Proceedings of the 34th International Conference on Machine Learning, PMLR 70:429-439, 2017.

Abstract

Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-belanger17a, title = {End-to-End Learning for Structured Prediction Energy Networks}, author = {David Belanger and Bishan Yang and Andrew McCallum}, pages = {429--439}, year = {2017}, editor = {Doina Precup and Yee Whye Teh}, volume = {70}, series = {Proceedings of Machine Learning Research}, address = {International Convention Centre, Sydney, Australia}, month = {06--11 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v70/belanger17a/belanger17a.pdf}, url = {http://proceedings.mlr.press/v70/belanger17a.html}, abstract = {Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly.} }
Endnote
%0 Conference Paper %T End-to-End Learning for Structured Prediction Energy Networks %A David Belanger %A Bishan Yang %A Andrew McCallum %B Proceedings of the 34th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2017 %E Doina Precup %E Yee Whye Teh %F pmlr-v70-belanger17a %I PMLR %J Proceedings of Machine Learning Research %P 429--439 %U http://proceedings.mlr.press %V 70 %W PMLR %X Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly.
APA
Belanger, D., Yang, B. & McCallum, A.. (2017). End-to-End Learning for Structured Prediction Energy Networks. Proceedings of the 34th International Conference on Machine Learning, in PMLR 70:429-439

Related Material