Open Problem: Do Good Algorithms Necessarily Query Bad Points?
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:3190-3193, 2019.
Abstract
Folklore results in the theory of Stochastic Approximation indicate the (minimax) optimality of Stochastic Gradient Descent (SGD) (Robbins and Monro, 1951) with polynomially decaying stepsizes and iterate averaging (Ruppert, 1988; Polyak and Juditsky, 1992) for classes of stochastic convex optimization problems. Building on these folklore results and some recent developments, this manuscript considers a more subtle question: does any algorithm necessarily (information theoretically) have to query iterates that are suboptimal infinitely often?
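The procedure referenced above — SGD with polynomially decaying stepsizes and Polyak-Ruppert iterate averaging — can be sketched on a toy stochastic convex problem. This is a minimal illustration, not the paper's construction; the function names, the stepsize constant, and the quadratic objective are all illustrative assumptions.

```python
import random

def sgd_polyak_ruppert(grad_oracle, x0, steps, c=0.5):
    """SGD with polynomially decaying stepsizes eta_t = c / sqrt(t)
    and Polyak-Ruppert averaging of the iterates (illustrative sketch)."""
    x = x0
    avg = 0.0
    for t in range(1, steps + 1):
        eta = c / t ** 0.5          # polynomial stepsize decay
        x -= eta * grad_oracle(x)   # stochastic gradient step
        avg += (x - avg) / t        # running average of all iterates
    return x, avg

# Toy problem: minimize f(x) = E[(x - z)^2 / 2] with z ~ Uniform(-1, 1),
# so the minimizer is x* = 0 and (x - z) is an unbiased stochastic gradient.
random.seed(0)
noisy_grad = lambda x: x - random.uniform(-1.0, 1.0)

last, averaged = sgd_polyak_ruppert(noisy_grad, x0=5.0, steps=10_000)
print(abs(last), abs(averaged))
```

The averaged iterate smooths out the gradient noise that the last iterate still carries, which is the mechanism behind the optimality results cited in the abstract; the open problem asks whether the individual queried iterates can nonetheless avoid being suboptimal infinitely often.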