CurBench: Curriculum Learning Benchmark

Yuwei Zhou, Zirui Pan, Xin Wang, Hong Chen, Haoyang Li, Yanwen Huang, Zhixiao Xiong, Fangzhou Xiong, Peiyang Xu, Shengnan Liu, Wenwu Zhu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:62088-62107, 2024.

Abstract

Curriculum learning is a training paradigm where machine learning models are trained in a meaningful order, inspired by the way humans learn curricula. Due to its capability to improve model generalization and convergence, curriculum learning has gained considerable attention and has been widely applied to various research domains. Nevertheless, as new curriculum learning methods continue to emerge, it remains an open issue to benchmark them fairly. Therefore, we develop CurBench, the first benchmark that supports systematic evaluations for curriculum learning. Specifically, it consists of 15 datasets spanning 3 research domains: computer vision, natural language processing, and graph machine learning, along with 3 settings: standard, noise, and imbalance. To facilitate a comprehensive comparison, we establish the evaluation from 2 dimensions: performance and complexity. CurBench also provides a unified toolkit that plugs automatic curricula into general machine learning processes, enabling the implementation of 15 core curriculum learning methods. On the basis of this benchmark, we conduct comparative experiments and make empirical analyses of existing methods. CurBench is open-source and publicly available at https://github.com/THUMNLab/CurBench.
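The abstract describes curriculum learning as training a model on examples in a meaningful order, typically from easy to hard. The sketch below is a minimal, generic illustration of that idea (a "baby step" style schedule) and is not CurBench's actual API; the difficulty proxy (`difficulty_score`) and stage schedule are illustrative assumptions only.

```python
import random

def difficulty_score(example):
    # Hypothetical difficulty proxy: here, simply the input length.
    # Real curriculum methods use e.g. model loss or teacher confidence.
    return len(example)

def curriculum_batches(dataset, num_stages=3, batch_size=2):
    """Yield batches from easy to hard: at each stage, the pool of
    available training examples grows to include harder ones."""
    ordered = sorted(dataset, key=difficulty_score)
    stage_size = len(ordered) // num_stages
    for stage in range(1, num_stages + 1):
        # Last stage always covers the full dataset.
        limit = len(ordered) if stage == num_stages else stage * stage_size
        pool = ordered[:limit]          # slicing copies the list
        random.shuffle(pool)            # shuffle within the current pool
        for i in range(0, len(pool), batch_size):
            yield pool[i : i + batch_size]
```

In a real training loop, each yielded batch would be fed to one optimizer step; automatic curriculum methods replace the fixed schedule above with learned or loss-driven pacing.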

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-zhou24o,
  title     = {{C}ur{B}ench: Curriculum Learning Benchmark},
  author    = {Zhou, Yuwei and Pan, Zirui and Wang, Xin and Chen, Hong and Li, Haoyang and Huang, Yanwen and Xiong, Zhixiao and Xiong, Fangzhou and Xu, Peiyang and Liu, Shengnan and Zhu, Wenwu},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {62088--62107},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhou24o/zhou24o.pdf},
  url       = {https://proceedings.mlr.press/v235/zhou24o.html},
  abstract  = {Curriculum learning is a training paradigm where machine learning models are trained in a meaningful order, inspired by the way humans learn curricula. Due to its capability to improve model generalization and convergence, curriculum learning has gained considerable attention and has been widely applied to various research domains. Nevertheless, as new curriculum learning methods continue to emerge, it remains an open issue to benchmark them fairly. Therefore, we develop CurBench, the first benchmark that supports systematic evaluations for curriculum learning. Specifically, it consists of 15 datasets spanning 3 research domains: computer vision, natural language processing, and graph machine learning, along with 3 settings: standard, noise, and imbalance. To facilitate a comprehensive comparison, we establish the evaluation from 2 dimensions: performance and complexity. CurBench also provides a unified toolkit that plugs automatic curricula into general machine learning processes, enabling the implementation of 15 core curriculum learning methods. On the basis of this benchmark, we conduct comparative experiments and make empirical analyses of existing methods. CurBench is open-source and publicly available at https://github.com/THUMNLab/CurBench.}
}
Endnote
%0 Conference Paper
%T CurBench: Curriculum Learning Benchmark
%A Yuwei Zhou
%A Zirui Pan
%A Xin Wang
%A Hong Chen
%A Haoyang Li
%A Yanwen Huang
%A Zhixiao Xiong
%A Fangzhou Xiong
%A Peiyang Xu
%A Shengnan Liu
%A Wenwu Zhu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhou24o
%I PMLR
%P 62088--62107
%U https://proceedings.mlr.press/v235/zhou24o.html
%V 235
%X Curriculum learning is a training paradigm where machine learning models are trained in a meaningful order, inspired by the way humans learn curricula. Due to its capability to improve model generalization and convergence, curriculum learning has gained considerable attention and has been widely applied to various research domains. Nevertheless, as new curriculum learning methods continue to emerge, it remains an open issue to benchmark them fairly. Therefore, we develop CurBench, the first benchmark that supports systematic evaluations for curriculum learning. Specifically, it consists of 15 datasets spanning 3 research domains: computer vision, natural language processing, and graph machine learning, along with 3 settings: standard, noise, and imbalance. To facilitate a comprehensive comparison, we establish the evaluation from 2 dimensions: performance and complexity. CurBench also provides a unified toolkit that plugs automatic curricula into general machine learning processes, enabling the implementation of 15 core curriculum learning methods. On the basis of this benchmark, we conduct comparative experiments and make empirical analyses of existing methods. CurBench is open-source and publicly available at https://github.com/THUMNLab/CurBench.
APA
Zhou, Y., Pan, Z., Wang, X., Chen, H., Li, H., Huang, Y., Xiong, Z., Xiong, F., Xu, P., Liu, S., & Zhu, W. (2024). CurBench: Curriculum Learning Benchmark. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:62088-62107. Available from https://proceedings.mlr.press/v235/zhou24o.html.