Deterministic Apple Tasting

Zachary Chase, Idan Mehalel
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:888-923, 2025.

Abstract

In binary ($0/1$) online classification with apple tasting feedback, the learner receives feedback only when predicting $1$. Besides some degenerate learning tasks, all previously known learning algorithms for this model are randomized. Consequently, prior to this work it was unknown whether deterministic apple tasting is generally feasible. In this work, we provide the first widely-applicable deterministic apple tasting learner, and show that in the realizable case, a hypothesis class is learnable if and only if it is deterministically learnable, confirming a conjecture of Raman, Subedi, Raman, and Tewari (2024). Quantitatively, we show that every class $H$ is learnable with mistake bound $O(\sqrt{L(H) T \log T})$ (where $L(H)$ is the Littlestone dimension of $H$), and that this is tight for some classes. This demonstrates a separation between deterministic and randomized learners, since a randomized learner can learn every class with mistake bound $O(\sqrt{L(H) T})$, as shown by Raman et al. (2024). We further study the agnostic case, in which the best hypothesis makes at most $k$ mistakes, and prove a trichotomy stating that every class $H$ must be either easy, hard, or unlearnable. Easy classes have (both randomized and deterministic) mistake bound $\Theta_{H}(k)$. Hard classes have randomized mistake bound $\tilde{\Theta}_{H}(k + \sqrt{T})$ and deterministic mistake bound $\tilde{\Theta}_{H}(\sqrt{k \cdot T})$, where $T$ is the time horizon. Unlearnable classes have (both randomized and deterministic) mistake bound $\Theta(T)$. Our upper bound is based on a deterministic algorithm for learning from expert advice with apple tasting feedback, a problem interesting in its own right. For this problem, we show that the optimal deterministic mistake bound is $\Theta(\sqrt{T(k + \log n)})$ for all $k$ and $T \leq n \leq 2^T$, where $n$ is the number of experts. Our algorithm is a natural variation of the well-known exponential weights forecaster.
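To make the feedback model concrete, the following is a minimal Python sketch of a deterministic exponential-weights forecaster operating under apple tasting feedback. It illustrates the protocol only and is not the paper's exact algorithm: the function name, the learning rate eta, and the majority-vote threshold are assumptions made for the sketch.

    import math

    def apple_tasting_exp_weights(experts, labels, eta=0.5):
        """Minimal sketch of a deterministic exponential-weights forecaster
        under apple tasting feedback. Illustrative only: eta and the
        majority-vote threshold are assumptions, not the paper's tuning.

        experts: experts[i][t] in {0, 1} is expert i's advice at round t.
        labels:  labels[t] in {0, 1}; revealed to the learner only on
                 rounds where it predicts 1.
        """
        n = len(experts)
        weights = [1.0] * n          # one weight per expert
        mistakes = 0
        for t in range(len(labels)):
            advice = [experts[i][t] for i in range(n)]
            # Deterministic prediction: weighted majority vote for label 1.
            vote_for_one = sum(w for w, a in zip(weights, advice) if a == 1)
            pred = 1 if vote_for_one >= sum(weights) / 2 else 0
            if pred == 1:
                # Apple tasting: feedback arrives only after predicting 1.
                y = labels[t]
                if y == 0:
                    mistakes += 1    # false positive, observed by the learner
                # Multiplicatively penalize experts that erred this round.
                weights = [w * math.exp(-eta) if a != y else w
                           for w, a in zip(weights, advice)]
            elif labels[t] == 1:
                mistakes += 1        # false negative, never observed
        return mistakes

The sketch makes the central difficulty visible: the weights can move only on rounds where the learner predicts $1$, so a learner that is too conservative never gathers feedback. Any deterministic algorithm for this setting must balance predicting $1$ (costly when wrong, but informative) against predicting $0$ (free, but blind), and managing this trade-off is what the $\Theta(\sqrt{T(k + \log n)})$ bound quoted above reflects.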

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-chase25a,
  title     = {Deterministic Apple Tasting},
  author    = {Chase, Zachary and Mehalel, Idan},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {888--923},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/chase25a/chase25a.pdf},
  url       = {https://proceedings.mlr.press/v291/chase25a.html}
}
EndNote
%0 Conference Paper
%T Deterministic Apple Tasting
%A Zachary Chase
%A Idan Mehalel
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-chase25a
%I PMLR
%P 888--923
%U https://proceedings.mlr.press/v291/chase25a.html
%V 291
APA
Chase, Z. & Mehalel, I. (2025). Deterministic Apple Tasting. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:888-923. Available from https://proceedings.mlr.press/v291/chase25a.html.
