AUC Maximization in Imbalanced Lifelong Learning

Xiangyu Zhu, Jie Hao, Yunhui Guo, Mingrui Liu
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:2574-2585, 2023.

Abstract

Imbalanced data is ubiquitous in machine learning, for example in medical or fine-grained image datasets. Existing continual learning methods employ various techniques, such as balanced sampling, to improve classification accuracy in this setting. However, classification accuracy is not a suitable metric for imbalanced data, and hence these methods may not obtain a good classifier as measured by other metrics (e.g., Area Under the ROC Curve). In this paper, we propose a solution to enable efficient imbalanced continual learning by designing an algorithm to effectively maximize one widely used metric in the imbalanced data setting: Area Under the ROC Curve (AUC). We find that simply replacing accuracy with AUC causes a gradient interference problem due to the imbalanced data distribution. To address this issue, we propose a new algorithm, DIANA, which performs a novel synthesis of model DecouplIng ANd Alignment. In particular, the algorithm updates two models simultaneously: one focuses on learning the current knowledge while the other concentrates on reviewing previously learned knowledge, and the two models gradually align during training. The results show that DIANA achieves state-of-the-art performance on all the imbalanced datasets compared with several competitive baselines.
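For readers unfamiliar with AUC maximization, the sketch below illustrates the general idea the abstract refers to: instead of optimizing accuracy, one directly optimizes a differentiable surrogate of AUC defined over positive-negative pairs. This is a minimal, generic illustration using a pairwise squared-hinge surrogate and a linear scorer on toy data; it is NOT the paper's DIANA algorithm, and all names and parameters here are illustrative assumptions.

```python
# Illustrative sketch of pairwise AUC surrogate maximization.
# NOT the paper's DIANA algorithm; a generic squared-hinge pairwise loss.
import numpy as np

def auc(scores, labels):
    """Exact AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def pairwise_hinge_grad(w, X, y, margin=1.0):
    """Gradient of mean squared-hinge pairwise surrogate for scores s = X @ w."""
    scores = X @ w
    pos, neg = X[y == 1], X[y == 0]
    sp, sn = scores[y == 1], scores[y == 0]
    # violation of the ranking margin: max(0, margin - (s_pos - s_neg))
    viol = np.maximum(0.0, margin - (sp[:, None] - sn[None, :]))
    grad = np.zeros_like(w)
    for i in range(len(sp)):
        for j in range(len(sn)):
            if viol[i, j] > 0:
                # d/dw of viol^2 for this pair
                grad += 2.0 * viol[i, j] * (neg[j] - pos[i])
    return grad / (len(sp) * len(sn))

# Toy imbalanced data: 5 positives vs. 50 negatives (10:1 imbalance).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (5, 2)), rng.normal(-1.0, 1.0, (50, 2))])
y = np.array([1] * 5 + [0] * 50)

w = np.zeros(2)
for _ in range(200):
    w -= 0.5 * pairwise_hinge_grad(w, X, y)
print(f"train AUC: {auc(X @ w, y):.3f}")
```

Because the loss is defined over positive-negative pairs, every gradient step balances the two classes by construction, which is why pairwise surrogates are a natural fit for imbalanced data.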

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-zhu23a,
  title     = {{AUC} Maximization in Imbalanced Lifelong Learning},
  author    = {Zhu, Xiangyu and Hao, Jie and Guo, Yunhui and Liu, Mingrui},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages     = {2574--2585},
  year      = {2023},
  editor    = {Evans, Robin J. and Shpitser, Ilya},
  volume    = {216},
  series    = {Proceedings of Machine Learning Research},
  month     = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v216/zhu23a/zhu23a.pdf},
  url       = {https://proceedings.mlr.press/v216/zhu23a.html},
  abstract  = {Imbalanced data is ubiquitous in machine learning, such as medical or fine-grained image datasets. The existing continual learning methods employ various techniques such as balanced sampling to improve classification accuracy in this setting. However, classification accuracy is not a suitable metric for imbalanced data, and hence these methods may not obtain a good classifier as measured by other metrics (e.g., Area under the ROC Curve). In this paper, we propose a solution to enable efficient imbalanced continual learning by designing an algorithm to effectively maximize one widely used metric in an imbalanced data setting: Area Under the ROC Curve (AUC). We find that simply replacing accuracy with AUC will cause gradient interference problem due to the imbalanced data distribution. To address this issue, we propose a new algorithm, namely DIANA, which performs a novel synthesis of model DecouplIng ANd Alignment. In particular, the algorithm updates two models simultaneously: one focuses on learning the current knowledge while the other concentrates on reviewing previously-learned knowledge, and the two models gradually align during training. The results show that the proposed DIANA achieves state-of-the-art performance on all the imbalanced datasets compared with several competitive baselines.}
}
Endnote
%0 Conference Paper
%T AUC Maximization in Imbalanced Lifelong Learning
%A Xiangyu Zhu
%A Jie Hao
%A Yunhui Guo
%A Mingrui Liu
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-zhu23a
%I PMLR
%P 2574--2585
%U https://proceedings.mlr.press/v216/zhu23a.html
%V 216
%X Imbalanced data is ubiquitous in machine learning, such as medical or fine-grained image datasets. The existing continual learning methods employ various techniques such as balanced sampling to improve classification accuracy in this setting. However, classification accuracy is not a suitable metric for imbalanced data, and hence these methods may not obtain a good classifier as measured by other metrics (e.g., Area under the ROC Curve). In this paper, we propose a solution to enable efficient imbalanced continual learning by designing an algorithm to effectively maximize one widely used metric in an imbalanced data setting: Area Under the ROC Curve (AUC). We find that simply replacing accuracy with AUC will cause gradient interference problem due to the imbalanced data distribution. To address this issue, we propose a new algorithm, namely DIANA, which performs a novel synthesis of model DecouplIng ANd Alignment. In particular, the algorithm updates two models simultaneously: one focuses on learning the current knowledge while the other concentrates on reviewing previously-learned knowledge, and the two models gradually align during training. The results show that the proposed DIANA achieves state-of-the-art performance on all the imbalanced datasets compared with several competitive baselines.
APA
Zhu, X., Hao, J., Guo, Y., & Liu, M. (2023). AUC Maximization in Imbalanced Lifelong Learning. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:2574-2585. Available from https://proceedings.mlr.press/v216/zhu23a.html.