On the Statistical Benefits of Curriculum Learning

Ziping Xu, Ambuj Tewari
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:24663-24682, 2022.

Abstract

Curriculum learning (CL) is a commonly used machine learning training strategy. However, we still lack a clear theoretical understanding of CL’s benefits. In this paper, we study the benefits of CL in the multitask linear regression problem under both structured and unstructured settings. For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum. Our results reveal that adaptive learning can be fundamentally harder than the oracle learning in the unstructured setting, but it merely introduces a small extra term in the structured setting. To connect theory with practice, we provide justification for a popular empirical method that selects tasks with highest local prediction gain by comparing its guarantees with the minimax rates mentioned above.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-xu22i, title = {On the Statistical Benefits of Curriculum Learning}, author = {Xu, Ziping and Tewari, Ambuj}, booktitle = {Proceedings of the 39th International Conference on Machine Learning}, pages = {24663--24682}, year = {2022}, editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan}, volume = {162}, series = {Proceedings of Machine Learning Research}, month = {17--23 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v162/xu22i/xu22i.pdf}, url = {https://proceedings.mlr.press/v162/xu22i.html}, abstract = {Curriculum learning (CL) is a commonly used machine learning training strategy. However, we still lack a clear theoretical understanding of CL’s benefits. In this paper, we study the benefits of CL in the multitask linear regression problem under both structured and unstructured settings. For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum. Our results reveal that adaptive learning can be fundamentally harder than the oracle learning in the unstructured setting, but it merely introduces a small extra term in the structured setting. To connect theory with practice, we provide justification for a popular empirical method that selects tasks with highest local prediction gain by comparing its guarantees with the minimax rates mentioned above.} }
Endnote
%0 Conference Paper %T On the Statistical Benefits of Curriculum Learning %A Ziping Xu %A Ambuj Tewari %B Proceedings of the 39th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2022 %E Kamalika Chaudhuri %E Stefanie Jegelka %E Le Song %E Csaba Szepesvari %E Gang Niu %E Sivan Sabato %F pmlr-v162-xu22i %I PMLR %P 24663--24682 %U https://proceedings.mlr.press/v162/xu22i.html %V 162 %X Curriculum learning (CL) is a commonly used machine learning training strategy. However, we still lack a clear theoretical understanding of CL’s benefits. In this paper, we study the benefits of CL in the multitask linear regression problem under both structured and unstructured settings. For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum. Our results reveal that adaptive learning can be fundamentally harder than the oracle learning in the unstructured setting, but it merely introduces a small extra term in the structured setting. To connect theory with practice, we provide justification for a popular empirical method that selects tasks with highest local prediction gain by comparing its guarantees with the minimax rates mentioned above.
APA
Xu, Z. & Tewari, A.. (2022). On the Statistical Benefits of Curriculum Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:24663-24682 Available from https://proceedings.mlr.press/v162/xu22i.html.

Related Material