Implicit Training of Inference Network Models for Structured Prediction

Shiv Shankar
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1889-1899, 2023.

Abstract

Most research in deep learning has focused on the development of new models and training procedures. In contrast, the exploration of training objectives has received considerably less attention, often limited to combinations of standard losses. When dealing with complex structured outputs, the effectiveness of conventional objectives as proxies for the true objective can be questionable. In this study, we propose that existing inference network-based methods for structured prediction [Tu and Gimpel, 2018, Tu et al., 2020a] indirectly learn to optimize a dynamic loss objective parameterized by the energy model. Based on this insight, we propose a method that treats the energy network as a trainable loss function and employs an implicit-gradient-based technique to learn the corresponding dynamic objective. We experiment with multiple tasks, such as multi-label classification and entity recognition, and find significant performance improvements over baseline approaches. Our results demonstrate that implicitly learning a dynamic loss landscape is an effective approach for enhancing model performance in structured prediction tasks.
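
To make the mechanism concrete, below is a minimal sketch (in PyTorch, which the abstract does not prescribe) of the kind of bilevel training loop described above: an inner loop fits an inference network A_phi to the current energy E_theta, treated as a dynamic loss, and an outer step updates the energy with an implicit gradient obtained from the implicit function theorem. The architectures, the toy multi-label data, the Neumann-series approximation of the inverse Hessian-vector product, and all step sizes are illustrative assumptions, not the paper's exact recipe.

# Hedged sketch: the energy network acts as a trainable, dynamic loss for the
# inference network; its parameters are updated with an implicit gradient.
# All concrete choices below are assumptions made for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)
D_IN, N_LABELS = 16, 8                                    # toy multi-label sizes

# A_phi: maps inputs to relaxed structured outputs in [0, 1]^N_LABELS.
infer_net = nn.Sequential(nn.Linear(D_IN, 64), nn.ReLU(),
                          nn.Linear(64, N_LABELS), nn.Sigmoid())
# E_theta: scores (x, y) pairs; plays the role of the learned loss.
energy_net = nn.Sequential(nn.Linear(D_IN + N_LABELS, 64), nn.ReLU(),
                           nn.Linear(64, 1))

phi, theta = list(infer_net.parameters()), list(energy_net.parameters())
opt_phi = torch.optim.SGD(phi, lr=1e-2)
opt_theta = torch.optim.Adam(theta, lr=1e-3)

def inner_loss(x):
    """Dynamic objective for A_phi: the energy assigned to its own predictions."""
    y_hat = infer_net(x)
    return energy_net(torch.cat([x, y_hat], dim=-1)).mean()

def outer_loss(x, y):
    """Task objective that supervises E_theta (here: BCE against gold labels)."""
    return nn.functional.binary_cross_entropy(infer_net(x), y)

def hvp(loss, params, vec):
    """Hessian-vector product (d^2 loss / d params^2) @ vec via double backprop."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params, retain_graph=True)

def approx_ihvp(loss, params, v, steps=5, alpha=0.1):
    """Truncated Neumann-series approximation of (Hessian)^{-1} @ v."""
    p = [vi.clone() for vi in v]
    cur = [vi.clone() for vi in v]
    for _ in range(steps):
        h = hvp(loss, params, cur)
        cur = [c - alpha * hi for c, hi in zip(cur, h)]
        p = [pi + ci for pi, ci in zip(p, cur)]
    return [alpha * pi for pi in p]

x = torch.randn(32, D_IN)                                 # toy batch
y = (torch.rand(32, N_LABELS) > 0.5).float()

# Inner loop: drive A_phi toward (an approximation of) argmin_y E_theta(x, y).
for _ in range(20):
    opt_phi.zero_grad()
    inner_loss(x).backward()
    opt_phi.step()

# Outer step: implicit gradient of the task loss with respect to theta.
L_in = inner_loss(x)
v1 = torch.autograd.grad(outer_loss(x, y), phi)           # dL_out / dphi
v2 = approx_ihvp(L_in, phi, v1)                           # H_phi^{-1} @ v1
g_phi = torch.autograd.grad(L_in, phi, create_graph=True)
dot = sum((g * v).sum() for g, v in zip(g_phi, v2))
mixed = torch.autograd.grad(dot, theta, allow_unused=True)  # (d^2 L_in / dtheta dphi) @ v2

opt_theta.zero_grad()
for p_, g_ in zip(theta, mixed):
    # allow_unused covers parameters (e.g. the final energy bias) that shift the
    # energy by a constant and so cannot move the inner minimizer: zero gradient.
    p_.grad = -g_ if g_ is not None else torch.zeros_like(p_)
opt_theta.step()

The minus sign in the final update comes from the implicit-function-theorem expression dL_out/dtheta = -(d^2 L_in / dtheta dphi) H_phi^{-1} dL_out/dphi; the Neumann truncation depth trades gradient fidelity for compute.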

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-shankar23a,
  title     = {Implicit Training of Inference Network Models for Structured Prediction},
  author    = {Shankar, Shiv},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1889--1899},
  year      = {2023},
  editor    = {Evans, Robin J. and Shpitser, Ilya},
  volume    = {216},
  series    = {Proceedings of Machine Learning Research},
  month     = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v216/shankar23a/shankar23a.pdf},
  url       = {https://proceedings.mlr.press/v216/shankar23a.html},
  abstract  = {Most research in deep learning has focused on the development of new models and training procedures. In contrast, the exploration of training objectives has received considerably less attention, often limited to combinations of standard losses. When dealing with complex structured outputs, the effectiveness of conventional objectives as proxies for the true objective can be questionable. In this study, we propose that existing inference network-based methods for structured prediction [Tu and Gimpel, 2018, Tu et al., 2020a] indirectly learn to optimize a dynamic loss objective parameterized by the energy model. Based on this insight, we propose a method that treats the energy network as a trainable loss function and employs an implicit-gradient-based technique to learn the corresponding dynamic objective. We experiment with multiple tasks, such as multi-label classification and entity recognition, and find significant performance improvements over baseline approaches. Our results demonstrate that implicitly learning a dynamic loss landscape is an effective approach for enhancing model performance in structured prediction tasks.}
}
Endnote
%0 Conference Paper
%T Implicit Training of Inference Network Models for Structured Prediction
%A Shiv Shankar
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-shankar23a
%I PMLR
%P 1889--1899
%U https://proceedings.mlr.press/v216/shankar23a.html
%V 216
%X Most research in deep learning has focused on the development of new models and training procedures. In contrast, the exploration of training objectives has received considerably less attention, often limited to combinations of standard losses. When dealing with complex structured outputs, the effectiveness of conventional objectives as proxies for the true objective can be questionable. In this study, we propose that existing inference network-based methods for structured prediction [Tu and Gimpel, 2018, Tu et al., 2020a] indirectly learn to optimize a dynamic loss objective parameterized by the energy model. Based on this insight, we propose a method that treats the energy network as a trainable loss function and employs an implicit-gradient-based technique to learn the corresponding dynamic objective. We experiment with multiple tasks, such as multi-label classification and entity recognition, and find significant performance improvements over baseline approaches. Our results demonstrate that implicitly learning a dynamic loss landscape is an effective approach for enhancing model performance in structured prediction tasks.
APA
Shankar, S. (2023). Implicit Training of Inference Network Models for Structured Prediction. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1889-1899. Available from https://proceedings.mlr.press/v216/shankar23a.html.
