On Using Nearly-Independent Feature Families for High Precision and Confidence

Omid Madani; Manfred Georg; David A. Ross

Proceedings of the Asian Conference on Machine Learning, PMLR 25:269-284, 2012.

Abstract

Often we require classification at a very high precision level, such as 99%. We report that when very different sources of evidence such as text, audio, and video features are available, combining the outputs of base classifiers trained on each feature type separately, also known as late fusion, can substantially increase the recall of the combination at high precisions, compared to the performance of a single classifier trained on all the feature types, i.e., early fusion, or compared to the individual base classifiers. We show how the probability of a joint false-positive mistake can be upper bounded by the product of individual probabilities of conditional false-positive mistakes, by identifying a simple key criterion that needs to hold. This provides an explanation for the high precision phenomenon, and motivates referring to such feature families as (nearly) independent. We assess the relevant factors for achieving high precision empirically, and explore combination techniques informed by the analysis. We compare a number of early and late fusion methods, and observe that classifier combination via late fusion can more than double the recall at high precision.
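The product bound mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under a strong simplifying assumption: the base classifiers' false positives are fully independent on negative examples, in which case the joint false-positive rate equals the product of the individual rates. The abstract's actual criterion is not spelled out here and may be weaker. The function names, the false-positive rates, and the use of a plain AND rule for late fusion are all hypothetical choices for illustration, not the authors' method.

```python
import math
import random


def and_fusion(votes):
    """Late fusion by conjunction: positive only if every base classifier says positive."""
    return all(votes)


def product_bound(fp_rates):
    """Product of the individual false-positive rates (the bound from the abstract)."""
    return math.prod(fp_rates)


def simulate_joint_fp_rate(fp_rates, n_negatives=200_000, seed=0):
    """Estimate how often all base classifiers fire together on a negative example.

    ASSUMPTION: each classifier's mistake on a negative is drawn independently.
    """
    rng = random.Random(seed)
    joint_mistakes = 0
    for _ in range(n_negatives):
        # One simulated negative example: each feature family errs with its own rate.
        votes = [rng.random() < p for p in fp_rates]
        if and_fusion(votes):
            joint_mistakes += 1
    return joint_mistakes / n_negatives


# Hypothetical false-positive rates for two feature families (e.g. text and video).
fp_rates = [0.05, 0.04]
estimate = simulate_joint_fp_rate(fp_rates)
bound = product_bound(fp_rates)
print(f"simulated joint FP rate: {estimate:.5f}  product of rates: {bound:.5f}")
```

With these illustrative rates, each classifier alone errs on 4 to 5 percent of negatives, while the conjunction errs on roughly 0.2 percent, which is the intuition behind the precision gain from combining nearly independent feature families.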

Cite this Paper

BibTeX


@InProceedings{pmlr-v25-madani12,
  title = 	 {On Using Nearly-Independent Feature Families for High Precision and Confidence},
  author = 	 {Madani, Omid and Georg, Manfred and Ross, David A.},
  booktitle = 	 {Proceedings of the Asian Conference on Machine Learning},
  pages = 	 {269--284},
  year = 	 {2012},
  editor = 	 {Hoi, Steven C. H. and Buntine, Wray},
  volume = 	 {25},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Singapore Management University, Singapore},
  month = 	 {04--06 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v25/madani12/madani12.pdf},
  url = 	 {https://proceedings.mlr.press/v25/madani12.html},
  abstract = 	 {Often we require classification at a very high precision level, such as 99\%. We report that when very different sources of evidence such as text, audio, and video features are available, combining the outputs of base classifiers trained on each feature type separately, aka late fusion, can substantially increase the recall of the combination at high precisions, compared to the performance of a single classifier trained on all the feature types i.e., early fusion, or compared to the individual base classifiers. We show how the probability of a joint false-positive mistake can be upper bounded by the product of individual probabilities of conditional false-positive mistakes, by identifying a simple key criterion that needs to hold. This provides an explanation for the high precision phenomenon, and motivates referring to such feature families as (nearly) independent. We assess the relevant factors for achieving high precision empirically, and explore combination techniques informed by the analysis. We compare a number of early and late fusion methods, and observe that classifier combination via late fusion can more than double the recall at high precision.}
}

Endnote

%0 Conference Paper
%T On Using Nearly-Independent Feature Families for High Precision and Confidence
%A Omid Madani
%A Manfred Georg
%A David A. Ross
%B Proceedings of the Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2012
%E Steven C. H. Hoi
%E Wray Buntine
%F pmlr-v25-madani12
%I PMLR
%P 269--284
%U https://proceedings.mlr.press/v25/madani12.html
%V 25
%X Often we require classification at a very high precision level, such as 99%. We report that when very different sources of evidence such as text, audio, and video features are available, combining the outputs of base classifiers trained on each feature type separately, aka late fusion, can substantially increase the recall of the combination at high precisions, compared to the performance of a single classifier trained on all the feature types i.e., early fusion, or compared to the individual base classifiers. We show how the probability of a joint false-positive mistake can be upper bounded by the product of individual probabilities of conditional false-positive mistakes, by identifying a simple key criterion that needs to hold. This provides an explanation for the high precision phenomenon, and motivates referring to such feature families as (nearly) independent. We assess the relevant factors for achieving high precision empirically, and explore combination techniques informed by the analysis. We compare a number of early and late fusion methods, and observe that classifier combination via late fusion can more than double the recall at high precision.

RIS


TY  - CPAPER
TI  - On Using Nearly-Independent Feature Families for High Precision and Confidence
AU  - Omid Madani
AU  - Manfred Georg
AU  - David A. Ross
BT  - Proceedings of the Asian Conference on Machine Learning
DA  - 2012/11/17
ED  - Steven C. H. Hoi
ED  - Wray Buntine
ID  - pmlr-v25-madani12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 25
SP  - 269
EP  - 284
L1  - http://proceedings.mlr.press/v25/madani12/madani12.pdf
UR  - https://proceedings.mlr.press/v25/madani12.html
AB  - Often we require classification at a very high precision level, such as 99%. We report that when very different sources of evidence such as text, audio, and video features are available, combining the outputs of base classifiers trained on each feature type separately, aka late fusion, can substantially increase the recall of the combination at high precisions, compared to the performance of a single classifier trained on all the feature types i.e., early fusion, or compared to the individual base classifiers. We show how the probability of a joint false-positive mistake can be upper bounded by the product of individual probabilities of conditional false-positive mistakes, by identifying a simple key criterion that needs to hold. This provides an explanation for the high precision phenomenon, and motivates referring to such feature families as (nearly) independent. We assess the relevant factors for achieving high precision empirically, and explore combination techniques informed by the analysis. We compare a number of early and late fusion methods, and observe that classifier combination via late fusion can more than double the recall at high precision.
ER  -

APA


Madani, O., Georg, M., & Ross, D. A. (2012). On Using Nearly-Independent Feature Families for High Precision and Confidence. Proceedings of the Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 25:269-284. Available from https://proceedings.mlr.press/v25/madani12.html.
