Online A-Optimal Design and Active Linear Regression

Xavier Fontaine; Pierre Perrault; Michal Valko; Vianney Perchet

Online A-Optimal Design and Active Linear Regression

Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3374-3383, 2021.

Abstract

We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate $\hat{\beta}$ of the hidden parameter $\beta^{\star}$ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of $T$ samples between covariates, as sampling several times a specific one will reduce the variance of the estimated model around it (but at the cost of a possible higher variance elsewhere). By trying to minimize the $\ell^2$-loss $\mathbb{E} [\lVert\hat{\beta}-\beta^{\star}\rVert^2]$ the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design. Combining techniques from bandit and convex optimization we propose a new active sampling algorithm and we compare it with existing ones. We provide theoretical guarantees of this algorithm in different settings, including a $\mathcal{O}(T^{-2})$ regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-fontaine21a,
  title = 	 {Online A-Optimal Design and Active Linear Regression},
  author =       {Fontaine, Xavier and Perrault, Pierre and Valko, Michal and Perchet, Vianney},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {3374--3383},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/fontaine21a/fontaine21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/fontaine21a.html},
  abstract = 	 {We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate $\hat{\beta}$ of the hidden parameter $\beta^{\star}$ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of $T$ samples between covariates, as sampling several times a specific one will reduce the variance of the estimated model around it (but at the cost of a possible higher variance elsewhere). By trying to minimize the $\ell^2$-loss $\mathbb{E} [\lVert\hat{\beta}-\beta^{\star}\rVert^2]$ the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design. Combining techniques from bandit and convex optimization we propose a new active sampling algorithm and we compare it with existing ones. We provide theoretical guarantees of this algorithm in different settings, including a $\mathcal{O}(T^{-2})$ regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.}
}

Endnote

%0 Conference Paper
%T Online A-Optimal Design and Active Linear Regression
%A Xavier Fontaine
%A Pierre Perrault
%A Michal Valko
%A Vianney Perchet
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang	
%F pmlr-v139-fontaine21a
%I PMLR
%P 3374--3383
%U https://proceedings.mlr.press/v139/fontaine21a.html
%V 139
%X We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate $\hat{\beta}$ of the hidden parameter $\beta^{\star}$ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of $T$ samples between covariates, as sampling several times a specific one will reduce the variance of the estimated model around it (but at the cost of a possible higher variance elsewhere). By trying to minimize the $\ell^2$-loss $\mathbb{E} [\lVert\hat{\beta}-\beta^{\star}\rVert^2]$ the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design. Combining techniques from bandit and convex optimization we propose a new active sampling algorithm and we compare it with existing ones. We provide theoretical guarantees of this algorithm in different settings, including a $\mathcal{O}(T^{-2})$ regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.

APA

Fontaine, X., Perrault, P., Valko, M. & Perchet, V.. (2021). Online A-Optimal Design and Active Linear Regression. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3374-3383 Available from https://proceedings.mlr.press/v139/fontaine21a.html.

Online A-Optimal Design and Active Linear Regression

Abstract

Cite this Paper

Related Material