Training Deep Neural Networks via Direct Loss Minimization

Yang Song, Alexander Schwing, Richard Zemel, Raquel Urtasun
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2169-2177, 2016.

Abstract

Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise.
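The core update behind direct loss minimization can be sketched as follows. This is an illustrative multiclass example with a linear joint score and a 0-1 task loss, assumed for simplicity; the paper's actual setting (deep networks, average precision as the task loss, dynamic-programming loss-augmented inference) is more involved, and names such as `feature` and `direct_loss_gradient` are ours, not the paper's API:

```python
import numpy as np

# Sketch of the direct-loss gradient estimate (positive direction),
# assuming a linear joint score F(x, y; w) = w[y] . x over a small
# discrete label set. The estimate compares standard inference with
# loss-augmented inference and divides by the perturbation eps.

def feature(x, y, n_classes):
    """One-hot joint feature map phi(x, y): x placed in the block of class y."""
    phi = np.zeros((n_classes, x.size))
    phi[y] = x
    return phi

def direct_loss_gradient(w, x, y_true, task_loss, eps=0.1):
    """Finite-difference estimate (phi(x, y_direct) - phi(x, y_hat)) / eps,
    where y_hat maximizes the score and y_direct maximizes the
    loss-augmented score F + eps * L."""
    n_classes = w.shape[0]
    scores = w @ x                                    # F(x, y; w) for every y
    y_hat = int(np.argmax(scores))                    # standard inference
    augmented = scores + eps * np.array(
        [task_loss(y, y_true) for y in range(n_classes)])
    y_direct = int(np.argmax(augmented))              # loss-augmented inference
    return (feature(x, y_direct, n_classes)
            - feature(x, y_hat, n_classes)) / eps

# One SGD step against a 0-1 task loss (3 classes, 4 features).
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 4))
x = rng.normal(size=4)
g = direct_loss_gradient(w, x, y_true=1, task_loss=lambda y, t: float(y != t))
w -= 0.01 * g
```

When the task loss is not decomposable, as with average precision, the loss-augmented argmax is the hard part; the paper's dynamic program plays the role of the brute-force enumeration above.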

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-songb16,
  title = {Training Deep Neural Networks via Direct Loss Minimization},
  author = {Yang Song and Alexander Schwing and Richard Zemel and Raquel Urtasun},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages = {2169--2177},
  year = {2016},
  editor = {Maria Florina Balcan and Kilian Q. Weinberger},
  volume = {48},
  series = {Proceedings of Machine Learning Research},
  address = {New York, New York, USA},
  month = {20--22 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v48/songb16.pdf},
  url = {http://proceedings.mlr.press/v48/songb16.html},
  abstract = {Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise.}
}
Endnote
%0 Conference Paper
%T Training Deep Neural Networks via Direct Loss Minimization
%A Yang Song
%A Alexander Schwing
%A Richard Zemel
%A Raquel Urtasun
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-songb16
%I PMLR
%J Proceedings of Machine Learning Research
%P 2169--2177
%U http://proceedings.mlr.press
%V 48
%W PMLR
%X Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise.
RIS
TY - CPAPER
TI - Training Deep Neural Networks via Direct Loss Minimization
AU - Yang Song
AU - Alexander Schwing
AU - Richard Zemel
AU - Raquel Urtasun
BT - Proceedings of The 33rd International Conference on Machine Learning
PY - 2016/06/11
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-songb16
PB - PMLR
SP - 2169
DP - PMLR
EP - 2177
L1 - http://proceedings.mlr.press/v48/songb16.pdf
UR - http://proceedings.mlr.press/v48/songb16.html
AB - Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise.
ER -
APA
Song, Y., Schwing, A., Zemel, R., & Urtasun, R. (2016). Training Deep Neural Networks via Direct Loss Minimization. Proceedings of The 33rd International Conference on Machine Learning, in PMLR 48:2169-2177.
