Improved Bounds for Multi-task Learning with Trace Norm Regularization

Weiwei Liu
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:700-714, 2023.

Abstract

Compared with learning each task independently, multi-task learning (MTL) can learn from few training samples and achieve better prediction performance. Recently, Boursier et al. (2022) studied the estimation error bound for MTL with a trace norm regularizer and a few observations per task. However, their results rely on three assumptions: 1) the features are isotropic; 2) a task diversity assumption is imposed on the parameter matrix; 3) the number of tasks is larger than the feature dimension. Whether it is possible to drop these three assumptions and improve the bounds of Boursier et al. (2022) has remained unknown. This paper provides an affirmative answer to this question. Specifically, we reduce their upper bounds from $\tilde{\mathcal{O}}\big(\sigma \sqrt{\frac{rd^2/m+rT}{m}} + \sqrt{\frac{rd^2/m+rdT/m}{m}}\big)$ to $\mathcal{O}\big( \sigma\sqrt{\frac{r+rd/T}{m}} \big)$ without any of these three assumptions, where $T$ is the number of tasks, $d$ is the dimension of the feature space, $m$ is the number of observations per task, $r$ is the rank of the ground truth matrix, and $\sigma$ is the standard deviation of the noise random variable. Moreover, we provide minimax lower bounds showing that our upper bounds are rate-optimal if $T = \mathcal{O}(d)$.
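For intuition, the estimator studied here is MTL with a trace norm regularizer. The sketch below is a minimal illustration under assumed settings, not the paper's algorithm: it takes a standard trace-norm-regularized least-squares formulation and solves it with proximal gradient descent, where the proximal step is singular value thresholding. The loss normalization, the solver, and the choice of the regularization parameter `lam` in the demo are heuristic placeholders introduced for illustration only.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * (trace norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def mtl_trace_norm(X, Y, lam, n_iter=500):
    """Trace-norm-regularized multi-task least squares via proximal gradient.

    X: (T, m, d) per-task design matrices, Y: (T, m) responses.
    Minimizes (1/(2mT)) * sum_t ||X_t w_t - y_t||^2 + lam * ||W||_tr
    and returns W_hat of shape (d, T), one column per task.
    """
    T, m, d = X.shape
    W = np.zeros((d, T))
    # Step size from the largest per-task Lipschitz constant of the smooth part.
    L = max(np.linalg.norm(X[t], 2) ** 2 for t in range(T)) / (m * T)
    step = 1.0 / L
    for _ in range(n_iter):
        grad = np.stack(
            [X[t].T @ (X[t] @ W[:, t] - Y[t]) / (m * T) for t in range(T)], axis=1
        )
        W = svt(W - step * grad, step * lam)
    return W

# Synthetic demo: rank-r ground truth, m observations per task, Gaussian noise.
rng = np.random.default_rng(0)
d, T, m, r, sigma = 20, 30, 10, 3, 0.1
W_star = rng.standard_normal((d, r)) @ rng.standard_normal((r, T)) / np.sqrt(r)
X = rng.standard_normal((T, m, d))
Y = np.stack([X[t] @ W_star[:, t] + sigma * rng.standard_normal(m) for t in range(T)])

# lam below is a heuristic choice, not a value prescribed by the paper.
W_hat = mtl_trace_norm(X, Y, lam=2 * sigma * np.sqrt((d + T) / (m * T)))
print("avg per-task estimation error:", np.linalg.norm(W_hat - W_star) / np.sqrt(T))
```

In this low-rank regime the abstract's upper bound says the per-task estimation error scales as $\sigma\sqrt{(r + rd/T)/m}$, so the recovered error in the demo should shrink as $m$ grows and as the tasks share a common low-rank structure.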

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-liu23a,
  title = {Improved Bounds for Multi-task Learning with Trace Norm Regularization},
  author = {Liu, Weiwei},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages = {700--714},
  year = {2023},
  editor = {Neu, Gergely and Rosasco, Lorenzo},
  volume = {195},
  series = {Proceedings of Machine Learning Research},
  month = {12--15 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v195/liu23a/liu23a.pdf},
  url = {https://proceedings.mlr.press/v195/liu23a.html},
  abstract = {Compared with learning each task independently, multi-task learning (MTL) can learn from few training samples and achieve better prediction performance. Recently, Boursier et al. (2022) studied the estimation error bound for MTL with a trace norm regularizer and a few observations per task. However, their results rely on three assumptions: 1) the features are isotropic; 2) a task diversity assumption is imposed on the parameter matrix; 3) the number of tasks is larger than the feature dimension. Whether it is possible to drop these three assumptions and improve the bounds of Boursier et al. (2022) has remained unknown. This paper provides an affirmative answer to this question. Specifically, we reduce their upper bounds from $\tilde{\mathcal{O}}(\sigma \sqrt{\frac{rd^2/m+rT}{m}} + \sqrt{\frac{rd^2/m+rdT/m}{m}})$ to $\mathcal{O}( \sigma\sqrt{\frac{r+rd/T}{m}} )$ without any of these three assumptions, where $T$ is the number of tasks, $d$ is the dimension of the feature space, $m$ is the number of observations per task, $r$ is the rank of the ground truth matrix, and $\sigma$ is the standard deviation of the noise random variable. Moreover, we provide minimax lower bounds showing that our upper bounds are rate-optimal if $T = \mathcal{O}(d)$.}
}
Endnote
%0 Conference Paper
%T Improved Bounds for Multi-task Learning with Trace Norm Regularization
%A Weiwei Liu
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-liu23a
%I PMLR
%P 700--714
%U https://proceedings.mlr.press/v195/liu23a.html
%V 195
%X Compared with learning each task independently, multi-task learning (MTL) can learn from few training samples and achieve better prediction performance. Recently, Boursier et al. (2022) studied the estimation error bound for MTL with a trace norm regularizer and a few observations per task. However, their results rely on three assumptions: 1) the features are isotropic; 2) a task diversity assumption is imposed on the parameter matrix; 3) the number of tasks is larger than the feature dimension. Whether it is possible to drop these three assumptions and improve the bounds of Boursier et al. (2022) has remained unknown. This paper provides an affirmative answer to this question. Specifically, we reduce their upper bounds from $\tilde{\mathcal{O}}(\sigma \sqrt{\frac{rd^2/m+rT}{m}} + \sqrt{\frac{rd^2/m+rdT/m}{m}})$ to $\mathcal{O}( \sigma\sqrt{\frac{r+rd/T}{m}} )$ without any of these three assumptions, where $T$ is the number of tasks, $d$ is the dimension of the feature space, $m$ is the number of observations per task, $r$ is the rank of the ground truth matrix, and $\sigma$ is the standard deviation of the noise random variable. Moreover, we provide minimax lower bounds showing that our upper bounds are rate-optimal if $T = \mathcal{O}(d)$.
APA
Liu, W. (2023). Improved Bounds for Multi-task Learning with Trace Norm Regularization. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:700-714. Available from https://proceedings.mlr.press/v195/liu23a.html.