Meta Learning in the Continuous Time Limit

Ruitu Xu, Lin Chen, Amin Karbasi
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3052-3060, 2021.

Abstract

In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process removes the dependence on the manually chosen gradient-descent step size and recovers the existing gradient descent training algorithm as a special case arising from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that reduces the computational burden of existing MAML training methods. Empirical experiments show that our proposed method converges faster than the vanilla MAML algorithm.
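
For concreteness, the following is a minimal sketch (using our own shorthand, not necessarily the paper's exact notation) of the continuous-time view described above: the standard one-step MAML meta-loss, its gradient-flow ODE, and the Euler discretization that recovers the usual outer gradient-descent update.

% Minimal sketch; f_i, \alpha, \beta, and m are illustrative symbols, not the paper's notation.
\[
  F(\theta) \;=\; \frac{1}{m} \sum_{i=1}^{m} f_i\!\left(\theta - \alpha \nabla f_i(\theta)\right)
  \qquad \text{(one-step MAML meta-loss over task losses } f_i\text{)}
\]
\[
  \frac{\mathrm{d}\theta(t)}{\mathrm{d}t} \;=\; -\nabla F\!\left(\theta(t)\right)
  \qquad \text{(continuous-time gradient flow, the ``MAML ODE'' view)}
\]
\[
  \theta_{k+1} \;=\; \theta_k - \beta\, \nabla F(\theta_k)
  \qquad \text{(forward-Euler discretization with step size } \beta\text{, i.e. the vanilla MAML outer update)}
\]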

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-xu21g,
  title     = {Meta Learning in the Continuous Time Limit},
  author    = {Xu, Ruitu and Chen, Lin and Karbasi, Amin},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {3052--3060},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/xu21g/xu21g.pdf},
  url       = {https://proceedings.mlr.press/v130/xu21g.html},
  abstract  = {In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient descent training algorithm as a special case that results from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that reduces the computational burden associated with existing MAML training methods, and empirical experiments are performed to showcase the superiority of our proposed methods in the rate of convergence with respect to the vanilla MAML algorithm.}
}
Endnote
%0 Conference Paper
%T Meta Learning in the Continuous Time Limit
%A Ruitu Xu
%A Lin Chen
%A Amin Karbasi
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-xu21g
%I PMLR
%P 3052--3060
%U https://proceedings.mlr.press/v130/xu21g.html
%V 130
%X In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient descent training algorithm as a special case that results from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that reduces the computational burden associated with existing MAML training methods, and empirical experiments are performed to showcase the superiority of our proposed methods in the rate of convergence with respect to the vanilla MAML algorithm.
APA
Xu, R., Chen, L. & Karbasi, A. (2021). Meta Learning in the Continuous Time Limit. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3052-3060. Available from https://proceedings.mlr.press/v130/xu21g.html.