- title: 'Cost-Sensitive Learning: Preface'
volume: 88
URL: https://proceedings.mlr.press/v88/torgo18a.html
PDF: http://proceedings.mlr.press/v88/torgo18a/torgo18a.pdf
edit: https://github.com/mlresearch//v88/edit/gh-pages/_posts/2018-08-01-torgo18a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of The International Workshop on Cost-Sensitive Learning'
publisher: 'PMLR'
author:
- given: Luís
family: Torgo
- given: Stan
family: Matwin
- given: Gary
family: Weiss
- given: Nuno
family: Moniz
- given: Paula
family: Branco
editor:
- given: Luís
family: Torgo
- given: Stan
family: Matwin
- given: Gary
family: Weiss
- given: Nuno
family: Moniz
- given: Paula
family: Branco
page: 1-3
id: torgo18a
issued:
date-parts:
- 2018
- 8
- 1
firstpage: 1
lastpage: 3
published: 2018-08-01 00:00:00 +0000
- title: 'Classifier Performance Estimation with Unbalanced, Partially Labeled Data'
abstract: 'Class imbalance and lack of ground truth are two significant problems in modern machine learning research. These problems are especially pressing in operational contexts where the total number of data points is extremely large and the cost of obtaining labels is very high. In the face of these issues, accurate estimation of the performance of a detection or classification system is crucial to inform decisions based on the observations. This paper presents a framework for estimating the performance of a binary classifier in such a context. We focus on the scenario where each set of measurements has been reduced to a score, and the operator only investigates data when the score exceeds a threshold. The operator is blind to the number of missed detections, so performance estimation targets two quantities: recall and the derivative of precision with respect to recall. Measuring with respect to error in these two metrics, simulations in this context demonstrate that labeling outliers not only outperforms random labeling, but often matches the performance of an adaptive method that attempts to choose the optimal data for labeling. Application to real anomaly detection data confirms the utility of the approach and suggests directions for future work.'
volume: 88
URL: https://proceedings.mlr.press/v88/miller18a.html
PDF: http://proceedings.mlr.press/v88/miller18a/miller18a.pdf
edit: https://github.com/mlresearch//v88/edit/gh-pages/_posts/2018-08-01-miller18a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of The International Workshop on Cost-Sensitive Learning'
publisher: 'PMLR'
author:
- given: Benjamin A.
family: Miller
- given: Jeremy
family: Vila
- given: Malina
family: Kirn
- given: Joseph R.
family: Zipkin
editor:
- given: Luís
family: Torgo
- given: Stan
family: Matwin
- given: Gary
family: Weiss
- given: Nuno
family: Moniz
- given: Paula
family: Branco
page: 4-16
id: miller18a
issued:
date-parts:
- 2018
- 8
- 1
firstpage: 4
lastpage: 16
published: 2018-08-01 00:00:00 +0000
- title: 'Cost-sensitive Classifier Selection when there is Additional Cost Information'
abstract: 'Machine learning models are increasing in popularity in many domains as they are shown to be able to solve difficult problems. However, selecting a model to implement when there are various alternatives is a difficult problem. Receiver operating characteristic (ROC) curves are useful for selecting binary classification models for real-world problems. However, ROC curves only consider the misclassification cost of the classifier. The total cost of a classification system includes various other types of cost, including implementation, computation, and feature costs. To extend the ROC analysis to include this additional cost information, the ROC Convex Hull with Cost (ROCCHC) method is introduced. This method extends the ROC Convex Hull (ROCCH) method, which is used to select potentially optimal classifiers in the ROC space using misclassification cost, by selecting potentially optimal classifiers using this additional cost information. The ROCCHC method is tested using three binary classification data sets, each of which includes real feature costs as the additional cost information. Competing classifiers are created with the CART algorithm by using each combination of features or sensors for each data set. The ROCCHC method reduces the classifier decision space to 4%, 9%, and 0.02% for the three data sets, respectively. These results are compared to the current ROCCH method, which misses 91%, 58%, and 6% of potentially optimal classifiers, respectively, because the method does not include the additional cost information.'
volume: 88
URL: https://proceedings.mlr.press/v88/meekins18a.html
PDF: http://proceedings.mlr.press/v88/meekins18a/meekins18a.pdf
edit: https://github.com/mlresearch//v88/edit/gh-pages/_posts/2018-08-01-meekins18a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of The International Workshop on Cost-Sensitive Learning'
publisher: 'PMLR'
author:
- given: Ryan
family: Meekins
- given: Stephen
family: Adams
- given: Peter A.
family: Beling
- given: Kevin
family: Farinholt
- given: Nathan
family: Hipwell
- given: Ali
family: Chaudhry
- given: Sherwood
family: Polter
- given: Qing
family: Dong
editor:
- given: Luís
family: Torgo
- given: Stan
family: Matwin
- given: Gary
family: Weiss
- given: Nuno
family: Moniz
- given: Paula
family: Branco
page: 17-30
id: meekins18a
issued:
date-parts:
- 2018
- 8
- 1
firstpage: 17
lastpage: 30
published: 2018-08-01 00:00:00 +0000
- title: 'Recognizing Cuneiform Signs Using Graph Based Methods'
abstract: 'The cuneiform script constitutes one of the earliest systems of
writing and is realized by wedge-shaped marks on clay tablets. A
tremendous number of cuneiform tablets have already been discovered
and are being incrementally digitized and made available for automated
processing. As reading cuneiform script is still a manual task, we
address the real-world application of recognizing cuneiform signs by
two graph-based methods with complementary runtime
characteristics. We present a graph model for cuneiform signs
together with a tailored distance measure based on the concept of
the graph edit distance. We propose efficient heuristics for its
computation and demonstrate its effectiveness in classification
tasks experimentally. To this end, the distance measure is used to
implement a nearest neighbor classifier, which incurs a high
computational cost in the prediction phase as the training set
size increases. In order to overcome this issue, we propose to use CNNs
adapted to graphs as an alternative approach shifting the
computational cost to the training phase. We demonstrate the
practicability of both approaches in an experimental comparison
regarding runtime and prediction accuracy. Although currently
available annotated real-world data is still limited, we obtain
high accuracy using CNNs, in particular when the training set is
enriched by augmented examples.
'
volume: 88
URL: https://proceedings.mlr.press/v88/kriege18a.html
PDF: http://proceedings.mlr.press/v88/kriege18a/kriege18a.pdf
edit: https://github.com/mlresearch//v88/edit/gh-pages/_posts/2018-08-01-kriege18a.md
series: 'Proceedings of Machine Learning Research'
container-title: 'Proceedings of The International Workshop on Cost-Sensitive Learning'
publisher: 'PMLR'
author:
- given: Nils M.
family: Kriege
- given: Matthias
family: Fey
- given: Denis
family: Fisseler
- given: Petra
family: Mutzel
- given: Frank
family: Weichert
editor:
- given: Luís
family: Torgo
- given: Stan
family: Matwin
- given: Gary
family: Weiss
- given: Nuno
family: Moniz
- given: Paula
family: Branco
page: 31-44
id: kriege18a
issued:
date-parts:
- 2018
- 8
- 1
firstpage: 31
lastpage: 44
published: 2018-08-01 00:00:00 +0000