Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization

Matan Schliserman, Tomer Koren
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-19, 2026.

Abstract

We study the problem of learning vector-valued linear predictors: these are prediction rules parameterized by a matrix that maps an $m$-dimensional feature vector to a $k$-dimensional target. We focus on the fundamental case of a convex and Lipschitz loss function, and show several new theoretical results that shed light on the complexity of this problem and its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that $\widetilde{\Omega}(k/\varepsilon^2)$ examples are necessary for ERM to reach $\varepsilon$ excess (population) risk; this provides an exponential improvement over recent results by Magen and Shamir (2024) in terms of the dependence on the target dimension $k$, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box conversion from general $d$-dimensional Stochastic Convex Optimization (SCO) to vector-valued linear prediction, showing that any SCO problem can be embedded as a prediction problem with $k=\Theta(d)$ outputs. These results portray the setting of vector-valued linear prediction as bridging between two extensively studied yet disparate learning models: linear models (corresponding to $k=1$) and general $d$-dimensional SCO (with $k=\Theta(d)$).
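To make the setup concrete, here is a minimal formalization consistent with the abstract; the Frobenius-norm constraint $\|W\|_F \le B$ and the sampling notation are standard illustrative conventions, not details quoted from the paper. A predictor is a matrix $W \in \mathbb{R}^{k \times m}$ applied as $x \mapsto Wx$, and ERM minimizes the empirical risk over a bounded class:

\[
\mathcal{W} = \bigl\{ W \in \mathbb{R}^{k \times m} : \|W\|_F \le B \bigr\}, \qquad
L(W) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\ell(Wx; y)\bigr],
\]
\[
\widehat{L}_S(W) = \frac{1}{n}\sum_{i=1}^{n} \ell(Wx_i; y_i), \qquad
\widehat{W} \in \operatorname*{arg\,min}_{W \in \mathcal{W}} \widehat{L}_S(W),
\]

where $\ell(\cdot\,;y)$ is convex and Lipschitz in its first argument for every $y$. In this notation, the first result states that $n = \widetilde{\Omega}(k/\varepsilon^2)$ samples can be necessary before every ERM solution $\widehat{W}$ satisfies $L(\widehat{W}) - \min_{W \in \mathcal{W}} L(W) \le \varepsilon$, matching the upper bound due to Maurer (2016).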

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-schliserman26a,
  title     = {Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization},
  author    = {Schliserman, Matan and Koren, Tomer},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--19},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/schliserman26a/schliserman26a.pdf},
  url       = {https://proceedings.mlr.press/v313/schliserman26a.html},
  abstract  = {We study the problem of learning vector-valued linear predictors: these are prediction rules parameterized by a matrix that maps an $m$-dimensional feature vector to a $k$-dimensional target. We focus on the fundamental case of a convex and Lipschitz loss function, and show several new theoretical results that shed light on the complexity of this problem and its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that $\widetilde{\Omega}(k/\varepsilon^2)$ examples are necessary for ERM to reach $\varepsilon$ excess (population) risk; this provides an exponential improvement over recent results by Magen and Shamir (2024) in terms of the dependence on the target dimension $k$, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box conversion from general $d$-dimensional Stochastic Convex Optimization (SCO) to vector-valued linear prediction, showing that any SCO problem can be embedded as a prediction problem with $k=\Theta(d)$ outputs. These results portray the setting of vector-valued linear prediction as bridging between two extensively studied yet disparate learning models: linear models (corresponding to $k=1$) and general $d$-dimensional SCO (with $k=\Theta(d)$).}
}
Endnote
%0 Conference Paper
%T Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization
%A Matan Schliserman
%A Tomer Koren
%B Proceedings of The 37th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Matus Telgarsky
%E Jonathan Ullman
%F pmlr-v313-schliserman26a
%I PMLR
%P 1--19
%U https://proceedings.mlr.press/v313/schliserman26a.html
%V 313
%X We study the problem of learning vector-valued linear predictors: these are prediction rules parameterized by a matrix that maps an $m$-dimensional feature vector to a $k$-dimensional target. We focus on the fundamental case of a convex and Lipschitz loss function, and show several new theoretical results that shed light on the complexity of this problem and its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that $\widetilde{\Omega}(k/\varepsilon^2)$ examples are necessary for ERM to reach $\varepsilon$ excess (population) risk; this provides an exponential improvement over recent results by Magen and Shamir (2024) in terms of the dependence on the target dimension $k$, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box conversion from general $d$-dimensional Stochastic Convex Optimization (SCO) to vector-valued linear prediction, showing that any SCO problem can be embedded as a prediction problem with $k=\Theta(d)$ outputs. These results portray the setting of vector-valued linear prediction as bridging between two extensively studied yet disparate learning models: linear models (corresponding to $k=1$) and general $d$-dimensional SCO (with $k=\Theta(d)$).
APA
Schliserman, M. & Koren, T. (2026). Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-19. Available from https://proceedings.mlr.press/v313/schliserman26a.html.
