Refined bounds for algorithm configuration: The knife-edge of dual class approximability

Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:580-590, 2020.

Abstract

Automating algorithm configuration is growing increasingly necessary as algorithms come with more and more tunable parameters. It is common to tune parameters using machine learning, optimizing algorithmic performance (runtime or solution quality, for example) using a training set of problem instances from the specific domain at hand. We investigate a fundamental question about these techniques: how large should the training set be to ensure that a parameter’s average empirical performance over the training set is close to its expected, future performance? We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm’s performance as a function of its parameters can be approximated by a “simple” function. We show that if this approximation holds under the $L_\infty$-norm, we can provide strong sample complexity bounds, but if the approximation holds only under the $L_p$-norm for $p < \infty$, it is not possible to provide meaningful sample complexity bounds in the worst case. We empirically evaluate our bounds in the context of integer programming, obtaining sample complexity bounds that are up to 700 times smaller than the previously best-known bounds.
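The $L_\infty$ versus $L_p$ distinction above can be illustrated numerically. The sketch below (not from the paper; the spiked function `u` and smooth approximation `g` are hypothetical stand-ins for a performance function and its "simple" approximation) shows a case where the two functions are close in $L_2$ yet maximally far apart in $L_\infty$, because a narrow spike carries almost no $L_p$ mass:

```python
import numpy as np

# Illustrative sketch (not from the paper): a performance function u(rho)
# with a narrow spike, and a smooth approximation g(rho). The spike makes
# the pair close in every L_p norm with p < infinity, yet far in L_infinity.
dx = 1e-5
rho = np.arange(0.0, 1.0 + dx, dx)                 # parameter grid on [0, 1]
g = np.zeros_like(rho)                             # "simple" approximation
u = np.where(np.abs(rho - 0.5) < 1e-4, 1.0, 0.0)   # spike of width 2e-4

def lp_dist(f1, f2, dx, p):
    """Approximate the L_p distance by a Riemann sum on a uniform grid."""
    return (np.sum(np.abs(f1 - f2) ** p) * dx) ** (1.0 / p)

linf = np.max(np.abs(u - g))    # L_infinity distance: height of the spike
l2 = lp_dist(u, g, dx, p=2)     # L_2 distance: tiny, since the spike is thin
print(linf, l2)                 # linf is 1.0; l2 is on the order of 0.01
```

In this toy setting, `g` is an excellent $L_2$ approximation of `u` but says nothing about `u`'s behavior at the spike, which is exactly the failure mode that rules out meaningful worst-case sample complexity bounds under $L_p$ approximation alone.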

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-balcan20a,
  title     = {Refined bounds for algorithm configuration: The knife-edge of dual class approximability},
  author    = {Balcan, Maria-Florina and Sandholm, Tuomas and Vitercik, Ellen},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {580--590},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/balcan20a/balcan20a.pdf},
  url       = {https://proceedings.mlr.press/v119/balcan20a.html},
  abstract  = {Automating algorithm configuration is growing increasingly necessary as algorithms come with more and more tunable parameters. It is common to tune parameters using machine learning, optimizing algorithmic performance (runtime or solution quality, for example) using a training set of problem instances from the specific domain at hand. We investigate a fundamental question about these techniques: how large should the training set be to ensure that a parameter’s average empirical performance over the training set is close to its expected, future performance? We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm’s performance as a function of its parameters can be approximated by a “simple” function. We show that if this approximation holds under the L$\infty$-norm, we can provide strong sample complexity bounds, but if the approximation holds only under the Lp-norm for p < $\infty$, it is not possible to provide meaningful sample complexity bounds in the worst case. We empirically evaluate our bounds in the context of integer programming, obtaining sample complexity bounds that are up to 700 times smaller than the previously best-known bounds.}
}
Endnote
%0 Conference Paper
%T Refined bounds for algorithm configuration: The knife-edge of dual class approximability
%A Maria-Florina Balcan
%A Tuomas Sandholm
%A Ellen Vitercik
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-balcan20a
%I PMLR
%P 580--590
%U https://proceedings.mlr.press/v119/balcan20a.html
%V 119
%X Automating algorithm configuration is growing increasingly necessary as algorithms come with more and more tunable parameters. It is common to tune parameters using machine learning, optimizing algorithmic performance (runtime or solution quality, for example) using a training set of problem instances from the specific domain at hand. We investigate a fundamental question about these techniques: how large should the training set be to ensure that a parameter’s average empirical performance over the training set is close to its expected, future performance? We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm’s performance as a function of its parameters can be approximated by a “simple” function. We show that if this approximation holds under the L$\infty$-norm, we can provide strong sample complexity bounds, but if the approximation holds only under the Lp-norm for p < $\infty$, it is not possible to provide meaningful sample complexity bounds in the worst case. We empirically evaluate our bounds in the context of integer programming, obtaining sample complexity bounds that are up to 700 times smaller than the previously best-known bounds.
APA
Balcan, M., Sandholm, T. & Vitercik, E. (2020). Refined bounds for algorithm configuration: The knife-edge of dual class approximability. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:580-590. Available from https://proceedings.mlr.press/v119/balcan20a.html.