- title: '2nd Workshop on Learning with Imbalanced Domains: Preface'
  volume: 94
  URL: https://proceedings.mlr.press/v94/torgo18a.html
  PDF: http://proceedings.mlr.press/v94/torgo18a/torgo18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-torgo18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 1-7
  id: torgo18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 1
  lastpage: 7
  published: 2018-11-05 00:00:00 +0000
- title: 'Learning from Positive and Unlabeled Data under the Selected At Random Assumption'
  abstract: 'For many interesting tasks, such as medical diagnosis and web page classification, a learner only has access to some positively labeled examples and many unlabeled examples. Learning from this type of data requires making assumptions about the true distribution of the classes and/or the mechanism that was used to select the positive examples to be labeled. The commonly made assumptions, separability of the classes and positive examples being selected completely at random, are very strong. This paper proposes a weaker assumption that assumes the positive examples to be selected at random, conditioned on some of the attributes. To learn under this assumption, an EM method is proposed. Experiments show that our method is not only very capable of learning under this assumption, but it also outperforms the state of the art for learning under the selected completely at random assumption.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/bekker18a.html
  PDF: http://proceedings.mlr.press/v94/bekker18a/bekker18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-bekker18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Jessa
    family: Bekker
  - given: Jesse
    family: Davis
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 8-22
  id: bekker18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 8
  lastpage: 22
  published: 2018-11-05 00:00:00 +0000
- title: 'Multi-label kNN Classifier with Self Adjusting Memory for Drifting Data Streams'
  abstract: 'Learning from multi-label data streams is a highly challenging task involving drifts in features and labels. Classifiers must automatically adapt to changes while keeping a competitive accuracy in a real-time dynamic environment where the frequencies of the labelsets are non-stationary and highly imbalanced. This paper presents a multi-label k Nearest Neighbor (kNN) with Self Adjusting Memory (SAM) for drifting data streams (ML-SAM-kNN). It exploits short- and long-term memories to predict the current and evolving states of the data stream. The experimental study compares the proposal with eight other multi-label classifiers for data streams on 23 datasets on six multi-label metrics, evaluation time, and memory consumption. Non-parametric statistical analysis of the results shows the superiority of ML-SAM-kNN, including when compared with ML-kNN.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/roseberry18a.html
  PDF: http://proceedings.mlr.press/v94/roseberry18a/roseberry18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-roseberry18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Martha
    family: Roseberry
  - given: Alberto
    family: Cano
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 23-37
  id: roseberry18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 23
  lastpage: 37
  published: 2018-11-05 00:00:00 +0000
- title: 'Non-Linear Gradient Boosting for Class-Imbalance Learning'
  abstract: 'Gradient boosting relies on linearly combining diverse and weak hypotheses to build a strong classifier. In the class imbalance setting, boosting algorithms often require many hypotheses which tend to be more complex and may increase the risk of overfitting. We propose in this paper to address this issue by adapting the gradient boosting framework to a non-linear setting. In order to learn the idiosyncrasies of the target concept and prevent the algorithm from being biased toward the majority class, we suggest jointly learning different combinations of the same set of very weak classifiers and expanding the expressiveness of the final model by leveraging their non-linear complementarity. We perform an extensive experimental study using decision trees and show that, while requiring far fewer weak learners with a lower complexity (fewer splits per tree), our model outperforms standard linear gradient boosting.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/frery18a.html
  PDF: http://proceedings.mlr.press/v94/frery18a/frery18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-frery18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Jordan
    family: Frery
  - given: Amaury
    family: Habrard
  - given: Marc
    family: Sebban
  - given: Liyun
    family: He-Guelton
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 38-51
  id: frery18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 38
  lastpage: 51
  published: 2018-11-05 00:00:00 +0000
- title: 'Proper Losses for Learning with Example-Dependent Costs'
  abstract: 'We study the design of cost-sensitive learning algorithms with example-dependent costs, when cost matrices for each example are given both during training and test. The approach is based on the empirical risk minimization framework, where we replace the standard loss function by a combination of surrogate losses belonging to the family of proper losses. The actual contribution of each example to the risk is then given by a loss that depends on the cost matrix for the specific example. We then evaluate the use of such example-dependent loss functions in real-world binary and multiclass problems, namely credit risk assessment and musical genre classification. Using different neural network architectures, we show that with the appropriate choice of the example-dependent losses, we can outperform conventional cost-sensitive methods in terms of total cost, making a more efficient use of cost information during training and test as compared to existing discriminative approaches.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/hepburn18a.html
  PDF: http://proceedings.mlr.press/v94/hepburn18a/hepburn18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-hepburn18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Alexander
    family: Hepburn
  - given: Ryan
    family: McConville
  - given: Raúl
    family: Santos-Rodríguez
  - given: Jesús
    family: Cid-Sueiro
  - given: Dario
    family: García-García
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 52-66
  id: hepburn18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 52
  lastpage: 66
  published: 2018-11-05 00:00:00 +0000
- title: 'REBAGG: REsampled BAGGing for Imbalanced Regression'
  abstract: 'The problem of imbalanced domains is important in multiple real-world applications. This problem has been thoroughly studied for classification tasks. In particular, the adaptation of ensembles to tackle imbalanced domains has shown important advantages in a classification context. Still, for imbalanced regression problems only a few solutions exist. Moreover, the capabilities of ensembles for dealing with imbalanced regression tasks are yet to be explored. In this paper we present the REsampled BAGGing (REBAGG) algorithm, a bagging-based ensemble method that incorporates data pre-processing strategies for addressing imbalanced domains in regression tasks. The extensive experimental evaluation conducted shows the advantage of our proposal in a diverse set of domains and learning algorithms.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/branco18a.html
  PDF: http://proceedings.mlr.press/v94/branco18a/branco18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-branco18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Paula
    family: Branco
  - given: Luís
    family: Torgo
  - given: Rita P.
    family: Ribeiro
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 67-81
  id: branco18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 67
  lastpage: 81
  published: 2018-11-05 00:00:00 +0000
- title: 'Undersampled Majority Class Ensemble for highly imbalanced binary classification'
  abstract: 'This work uses an ensemble approach to address the problem of highly imbalanced data classification. The paper proposes umce, a multiple classifier system based on k-fold division of the majority class, which creates a pool of classifiers that breaks one imbalanced problem into many balanced ones while ensuring that all available samples are used in the training procedure. The algorithm, with five proposed fusers and a pruning method based on the statistical dependencies of the classifier responses on the testing set, was evaluated in computer experiments carried out on benchmark datasets with two different base classifiers.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/ksieniewicz18a.html
  PDF: http://proceedings.mlr.press/v94/ksieniewicz18a/ksieniewicz18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-ksieniewicz18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Pawel
    family: Ksieniewicz
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 82-94
  id: ksieniewicz18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 82
  lastpage: 94
  published: 2018-11-05 00:00:00 +0000
- title: 'ImWeights: Classifying Imbalanced Data Using Local and Neighborhood Information'
  abstract: 'Preprocessing methods for imbalanced data transform the training data to a form more suitable for learning classifiers. Most of these methods either focus on local relationships between single training examples or analyze the global characteristics of the data, such as the class imbalance ratio in the dataset. However, they do not sufficiently exploit the combination of both these views. In this paper, we put forward a new data preprocessing method called ImWeights, which weights training examples according to their local difficulty (safety) and the vicinity of larger minority clusters (gravity). Experiments with real-world datasets show that ImWeights is on par with local and global preprocessing methods, while being the least memory intensive. The introduced notion of minority cluster gravity opens new lines of research for specialized preprocessing methods and classifier modifications for imbalanced data.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/lango18a.html
  PDF: http://proceedings.mlr.press/v94/lango18a/lango18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-lango18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: Mateusz
    family: Lango
  - given: Dariusz
    family: Brzezinski
  - given: Jerzy
    family: Stefanowski
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 95-109
  id: lango18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 95
  lastpage: 109
  published: 2018-11-05 00:00:00 +0000
- title: 'On the Need of Class Ratio Insensitive Drift Tests for Data Streams'
  abstract: 'Early approaches to detect concept drifts in data streams without actual class labels aim at minimizing external labeling costs. However, their functionality is dubious when presented with changes in the proportion of the classes over time, as such methods keep reporting concept drifts that would not damage the performance of the current classification model. In this paper, we present an approach that can detect changes in the distribution of the features that is insensitive to changes in the distribution of the classes. The method also provides an estimate of the current class ratio and uses it to adapt the threshold of a classification model trained with balanced data. We show that the classification performance achieved by such a modified classifier is greater than that of a classifier trained with the same class distribution as the current imbalanced data.'
  volume: 94
  URL: https://proceedings.mlr.press/v94/maletzke18a.html
  PDF: http://proceedings.mlr.press/v94/maletzke18a/maletzke18a.pdf
  edit: https://github.com/mlresearch//v94/edit/gh-pages/_posts/2018-11-05-maletzke18a.md
  series: 'Proceedings of Machine Learning Research'
  container-title: 'Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications'
  publisher: 'PMLR'
  author: 
  - given: André
    family: Maletzke
  - given: Denis
    family: Reis
  - given: Everton
    family: Cherman
  - given: Gustavo
    family: Batista
  editor: 
  - given: Luís
    family: Torgo
  - given: Stan
    family: Matwin
  - given: Nathalie
    family: Japkowicz
  - given: Bartosz
    family: Krawczyk
  - given: Nuno
    family: Moniz
  - given: Paula
    family: Branco
  page: 110-124
  id: maletzke18a
  issued:
    date-parts: 
      - 2018
      - 11
      - 5
  firstpage: 110
  lastpage: 124
  published: 2018-11-05 00:00:00 +0000