Optimal Classification with Multivariate Losses

Nagarajan Natarajan; Oluwasanmi Koyejo; Pradeep Ravikumar; Inderjit Dhillon

Optimal Classification with Multivariate Losses

Nagarajan Natarajan, Oluwasanmi Koyejo, Pradeep Ravikumar, Inderjit Dhillon

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1530-1538, 2016.

Abstract

Multivariate loss functions are extensively employed in several prediction tasks arising in Information Retrieval. Often, the goal in the tasks is to minimize expected loss when retrieving relevant items from a presented set of items, where the expectation is with respect to the joint distribution over item sets. Our key result is that for most multivariate losses, the expected loss is provably optimized by sorting the items by the conditional probability of label being positive and then selecting top k items. Such a result was previously known only for the F-measure. Leveraging on the optimality characterization, we give an algorithm for estimating optimal predictions in practice with runtime quadratic in size of item sets for many losses. We provide empirical results on benchmark datasets, comparing the proposed algorithm to state-of-the-art methods for optimizing multivariate losses.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-natarajan16,
  title = 	 {Optimal Classification with Multivariate Losses},
  author = 	 {Natarajan, Nagarajan and Koyejo, Oluwasanmi and Ravikumar, Pradeep and Dhillon, Inderjit},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1530--1538},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/natarajan16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/natarajan16.html},
  abstract = 	 {Multivariate loss functions are extensively employed in several prediction tasks arising in Information Retrieval. Often, the goal in the tasks is to minimize expected loss when retrieving relevant items from a presented set of items, where the expectation is with respect to the joint distribution over item sets. Our key result is that for most multivariate losses, the expected loss is provably optimized by sorting the items by the conditional probability of label being positive and then selecting top k items. Such a result was previously known only for the F-measure. Leveraging on the optimality characterization, we give an algorithm for estimating optimal predictions in practice with runtime quadratic in size of item sets for many losses. We provide empirical results on benchmark datasets, comparing the proposed algorithm to state-of-the-art methods for optimizing multivariate losses.}
}

Endnote

%0 Conference Paper
%T Optimal Classification with Multivariate Losses
%A Nagarajan Natarajan
%A Oluwasanmi Koyejo
%A Pradeep Ravikumar
%A Inderjit Dhillon
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-natarajan16
%I PMLR
%P 1530--1538
%U https://proceedings.mlr.press/v48/natarajan16.html
%V 48
%X Multivariate loss functions are extensively employed in several prediction tasks arising in Information Retrieval. Often, the goal in the tasks is to minimize expected loss when retrieving relevant items from a presented set of items, where the expectation is with respect to the joint distribution over item sets. Our key result is that for most multivariate losses, the expected loss is provably optimized by sorting the items by the conditional probability of label being positive and then selecting top k items. Such a result was previously known only for the F-measure. Leveraging on the optimality characterization, we give an algorithm for estimating optimal predictions in practice with runtime quadratic in size of item sets for many losses. We provide empirical results on benchmark datasets, comparing the proposed algorithm to state-of-the-art methods for optimizing multivariate losses.

RIS


TY  - CPAPER
TI  - Optimal Classification with Multivariate Losses
AU  - Nagarajan Natarajan
AU  - Oluwasanmi Koyejo
AU  - Pradeep Ravikumar
AU  - Inderjit Dhillon
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-natarajan16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1530
EP  - 1538
L1  - http://proceedings.mlr.press/v48/natarajan16.pdf
UR  - https://proceedings.mlr.press/v48/natarajan16.html
AB  - Multivariate loss functions are extensively employed in several prediction tasks arising in Information Retrieval. Often, the goal in the tasks is to minimize expected loss when retrieving relevant items from a presented set of items, where the expectation is with respect to the joint distribution over item sets. Our key result is that for most multivariate losses, the expected loss is provably optimized by sorting the items by the conditional probability of label being positive and then selecting top k items. Such a result was previously known only for the F-measure. Leveraging on the optimality characterization, we give an algorithm for estimating optimal predictions in practice with runtime quadratic in size of item sets for many losses. We provide empirical results on benchmark datasets, comparing the proposed algorithm to state-of-the-art methods for optimizing multivariate losses.
ER  -

APA


Natarajan, N., Koyejo, O., Ravikumar, P. & Dhillon, I.. (2016). Optimal Classification with Multivariate Losses. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1530-1538 Available from https://proceedings.mlr.press/v48/natarajan16.html.

Optimal Classification with Multivariate Losses

Abstract

Cite this Paper

Related Material