Optimizing Non-decomposable Performance Measures: A Tale of Two Classes

Harikrishna Narasimhan; Purushottam Kar; Prateek Jain

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:199-208, 2015.

Abstract

Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms, since their non-additive nature precludes a direct application of well-studied large-scale optimization methods such as stochastic gradient descent. In this paper we show that for two large families of performance measures that can be expressed as functions of true positive and true negative rates (TPR, TNR), it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of TPR and TNR, which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, with which we develop optimization techniques that enable truly point-based stochastic updates. For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods, often by an order of magnitude or more, and give similar or more accurate predictions on test data.
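To make the phrase "functions of TPR, TNR" concrete, the following is a minimal sketch of how the three measures named in the abstract can be written in terms of the true positive rate, the true negative rate, and the fraction of positive examples. It uses the standard textbook definitions of G-mean, H-mean and F-measure; the function names (`rates`, `g_mean`, `h_mean`, `f_measure`) are illustrative assumptions and are not taken from the paper, and the sketch does not implement SPADE or STAMP.

```python
import math


def rates(y_true, y_pred):
    """Return (TPR, TNR, p) for binary labels in {0, 1}.

    p is the fraction of positive examples. Assumes both classes occur.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for y, yhat in zip(y_true, y_pred) if y == 1 and yhat == 1)
    tn = sum(1 for y, yhat in zip(y_true, y_pred) if y == 0 and yhat == 0)
    return tp / pos, tn / neg, pos / len(y_true)


def g_mean(tpr, tnr):
    """Geometric mean of TPR and TNR."""
    return math.sqrt(tpr * tnr)


def h_mean(tpr, tnr):
    """Harmonic mean of TPR and TNR (defined as 0 when both rates are 0)."""
    return 0.0 if tpr + tnr == 0 else 2 * tpr * tnr / (tpr + tnr)


def f_measure(tpr, tnr, p, beta=1.0):
    """F-beta written via TPR, TNR and the positive fraction p.

    Derived from F = (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP)
    by dividing numerator and denominator by the number of positives.
    """
    b2 = beta ** 2
    theta = (1 - p) / p  # ratio of negatives to positives
    return (1 + b2) * tpr / (b2 + tpr + theta * (1 - tnr))


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    y_true = [1, 1, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 1]
    tpr, tnr, p = rates(y_true, y_pred)
    print(g_mean(tpr, tnr), h_mean(tpr, tnr), f_measure(tpr, tnr, p))
```

On the toy data, TPR is 0.5 and TNR is 0.75, giving roughly 0.61, 0.6 and 0.5 for the three measures. None of them is an average of per-example losses: each depends on dataset-level rates, which is the sense in which such measures are non-decomposable.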

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-narasimhana15,
  title = 	 {Optimizing Non-decomposable Performance Measures: A Tale of Two Classes},
  author = 	 {Narasimhan, Harikrishna and Kar, Purushottam and Jain, Prateek},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {199--208},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/narasimhana15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/narasimhana15.html},
  abstract = 	 {Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms since their non-additive nature precludes a direct application of well-studied large scale optimization methods such as stochastic gradient descent. In this paper we reveal that for two large families of performance measures that can be expressed as functions of true positive/negative rates, it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of TPR, TNR which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, using which we develop optimization techniques that enable truly point-based stochastic updates. For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods - often by an order of magnitude or more, and give similar or more accurate predictions on test data.}
}

Endnote

%0 Conference Paper
%T Optimizing Non-decomposable Performance Measures: A Tale of Two Classes
%A Harikrishna Narasimhan
%A Purushottam Kar
%A Prateek Jain
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-narasimhana15
%I PMLR
%P 199--208
%U https://proceedings.mlr.press/v37/narasimhana15.html
%V 37
%X Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms since their non-additive nature precludes a direct application of well-studied large scale optimization methods such as stochastic gradient descent. In this paper we reveal that for two large families of performance measures that can be expressed as functions of true positive/negative rates, it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of TPR, TNR which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, using which we develop optimization techniques that enable truly point-based stochastic updates. For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods - often by an order of magnitude or more, and give similar or more accurate predictions on test data.

RIS


TY  - CPAPER
TI  - Optimizing Non-decomposable Performance Measures: A Tale of Two Classes
AU  - Harikrishna Narasimhan
AU  - Purushottam Kar
AU  - Prateek Jain
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-narasimhana15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 199
EP  - 208
L1  - http://proceedings.mlr.press/v37/narasimhana15.pdf
UR  - https://proceedings.mlr.press/v37/narasimhana15.html
AB  - Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms since their non-additive nature precludes a direct application of well-studied large scale optimization methods such as stochastic gradient descent. In this paper we reveal that for two large families of performance measures that can be expressed as functions of true positive/negative rates, it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of TPR, TNR which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, using which we develop optimization techniques that enable truly point-based stochastic updates. For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods - often by an order of magnitude or more, and give similar or more accurate predictions on test data.
ER  -

APA


Narasimhan, H., Kar, P. & Jain, P. (2015). Optimizing Non-decomposable Performance Measures: A Tale of Two Classes. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:199-208. Available from https://proceedings.mlr.press/v37/narasimhana15.html.
