Sampling a Longer Life: Binary versus One-class classification Revisited

Colin Bellinger, Shiven Sharma, Osmar R. Zaïane, Nathalie Japkowicz
Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 74:64-78, 2017.

Abstract

When faced with imbalanced domains, practitioners have one of two choices; if the imbalance is manageable, sampling or other corrective measures can be utilized in conjunction with binary classifiers (BCs). Beyond a certain point, however, the imbalance becomes too extreme and one-class classifiers (OCCs) are required. Whilst the literature offers many advances in terms of algorithms and understanding, there remains a need to connect our theoretical advances to the most practical of decisions. Specifically, given a dataset with some level of complexity and imbalance, which classification approach should be applied? In this paper, we establish a relationship between these facets in order to help guide the decision regarding when to apply OCC versus BC. Our results show that sampling provides an edge over OCCs on complex domains. Alternatively, OCCs are a good choice on less complex domains that exhibit unimodal properties. Class overlap, on the other hand, has a more uniform impact across all methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v74-bellinger17a,
  title     = {Sampling a Longer Life: Binary versus One-class classification Revisited},
  author    = {Bellinger, Colin and Sharma, Shiven and Zaïane, Osmar R. and Japkowicz, Nathalie},
  booktitle = {Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications},
  pages     = {64--78},
  year      = {2017},
  editor    = {Torgo, Luís and Branco, Paula and Moniz, Nuno},
  volume    = {74},
  series    = {Proceedings of Machine Learning Research},
  month     = {22 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v74/bellinger17a/bellinger17a.pdf},
  url       = {https://proceedings.mlr.press/v74/bellinger17a.html},
  abstract  = {When faced with imbalanced domains, practitioners have one of two choices; if the imbalance is manageable, sampling or other corrective measures can be utilized in conjunction with binary classifiers (BCs). Beyond a certain point, however, the imbalance becomes too extreme and one-class classifiers (OCCs) are required. Whilst the literature offers many advances in terms of algorithms and understanding, there remains a need to connect our theoretical advances to the most practical of decisions. Specifically, given a dataset with some level of complexity and imbalance, which classification approach should be applied? In this paper, we establish a relationship between these facets in order to help guide the decision regarding when to apply OCC versus BC. Our results show that sampling provides an edge over OCCs on complex domains. Alternatively, OCCs are a good choice on less complex domains that exhibit unimodal properties. Class overlap, on the other hand, has a more uniform impact across all methods.}
}
Endnote
%0 Conference Paper
%T Sampling a Longer Life: Binary versus One-class classification Revisited
%A Colin Bellinger
%A Shiven Sharma
%A Osmar R. Zaïane
%A Nathalie Japkowicz
%B Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications
%C Proceedings of Machine Learning Research
%D 2017
%E Luís Torgo
%E Paula Branco
%E Nuno Moniz
%F pmlr-v74-bellinger17a
%I PMLR
%P 64--78
%U https://proceedings.mlr.press/v74/bellinger17a.html
%V 74
%X When faced with imbalanced domains, practitioners have one of two choices; if the imbalance is manageable, sampling or other corrective measures can be utilized in conjunction with binary classifiers (BCs). Beyond a certain point, however, the imbalance becomes too extreme and one-class classifiers (OCCs) are required. Whilst the literature offers many advances in terms of algorithms and understanding, there remains a need to connect our theoretical advances to the most practical of decisions. Specifically, given a dataset with some level of complexity and imbalance, which classification approach should be applied? In this paper, we establish a relationship between these facets in order to help guide the decision regarding when to apply OCC versus BC. Our results show that sampling provides an edge over OCCs on complex domains. Alternatively, OCCs are a good choice on less complex domains that exhibit unimodal properties. Class overlap, on the other hand, has a more uniform impact across all methods.
APA
Bellinger, C., Sharma, S., Zaïane, O. R. & Japkowicz, N. (2017). Sampling a Longer Life: Binary versus One-class classification Revisited. Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, in Proceedings of Machine Learning Research 74:64-78. Available from https://proceedings.mlr.press/v74/bellinger17a.html.
