[edit]
Learning from Equivalence Queries, Revisited
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:808-836, 2026.
Abstract
Modern machine learning systems, such as generative models and recommendation systems, often evolve through a cycle of deploying a model, observing user interactions, and updating the model intermittently based on feedback. This mode of learning contrasts with common supervised learning frameworks, which focus on loss or regret minimization over a shared sequence of prediction tasks. Motivated by this deployment-driven learning cycle, we revisit the classical model of learning from equivalence queries, introduced by Angluin, which provides a simple abstraction of such interactions: a learner repeatedly proposes hypotheses and, whenever the deployed hypothesis is inadequate, receives a counterexample tailored to that hypothesis. Under fully adversarial counterexample generation, however, this model exhibits overly pessimistic worst-case behavior. Moreover, most existing work on learning from equivalence queries considers the \emph{full-information} setting, where the learner observes not only a counterexample but also its correct label. This is an assumption that does not always align with natural interactive settings. To address these considerations, we restrict the environment to generate counterexamples in a less adversarial manner by introducing a broad class of counterexample generators, which we call \emph{symmetric}. Informally, such symmetric counterexample generators select counterexamples based only on the symmetric difference between the hypothesis and the target, and encompass natural feedback mechanisms such as random counterexamples, as well as generators that select counterexamples minimizing a prescribed complexity measure over the instance space. Within this framework, we study learning from equivalence queries under both full-information and bandit feedback. We establish tight bounds on the number of learning rounds in both settings and outline directions for future research. Our techniques rely on a game-theoretic perspective on symmetric adversaries and combine adaptive weighting algorithms with minimax arguments.