On Randomized Algorithms in Online Strategic Classification

Chase Hutton, Adam Melrod, Han Shao
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:3635-3665, 2026.

Abstract

Online strategic classification studies settings in which agents strategically modify their features to obtain favorable predictions. For example, given a classifier that determines loan approval based on credit scores, applicants may open or close credit cards and bank accounts to obtain a positive prediction. The learning goal is to achieve low mistake or regret bounds despite such strategic behavior. While randomized algorithms have the potential to offer advantages to the learner in strategic settings, they have been largely underexplored. In the realizable setting, no lower bound is known for randomized algorithms, and existing lower bound constructions for deterministic learners can be circumvented by randomization. In the agnostic setting, the best known regret upper bound is $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ due to Ahmadi et al. (2023), which is far from the standard online learning rate of $O(\sqrt{T\log|\mathcal{H}|})$. In this work, we provide refined upper and lower bounds for online strategic classification in both the realizable and agnostic settings; our bounds depend on the Littlestone dimension $\mathrm{Ldim}(\mathcal{H})$ of the hypothesis class $\mathcal{H}$ and the maximum degree $\Delta$ of the manipulation graph. In the realizable setting, using a new construction, we prove a lower bound that, for $T > \mathrm{Ldim}(\mathcal{H}) \Delta^2$, extends the existing deterministic lower bound of $\Omega(\mathrm{Ldim}(\mathcal{H}) \Delta)$ to all algorithms. This is the first lower bound that applies to randomized algorithms, resolving an open question of Ahmadi et al. (2023). We also give the first randomized algorithm that improves on the known deterministic upper bound of $O(\mathrm{Ldim}(\mathcal{H})\cdot\Delta\log\Delta)$, achieving $O(\sqrt{T\cdot\mathrm{Ldim}(\mathcal{H})\log\Delta})$ expected mistakes, in the regime $T < \mathrm{Ldim}(\mathcal{H})\,\Delta^2\log\Delta$. In the agnostic setting, we give an improper randomized algorithm with expected regret $O(\sqrt{T\log|\mathcal{H}|})$ against adaptive adversaries, improving upon the previous $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ bound and matching the standard online learning rate. We also prove that this optimal rate requires improper learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-hutton26a, title = {On Randomized Algorithms in Online Strategic Classification}, author = {Hutton, Chase and Melrod, Adam and Shao, Han}, booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory}, pages = {3635--3665}, year = {2026}, editor = {Hanneke, Steve and Lattimore, Tor}, volume = {336}, series = {Proceedings of Machine Learning Research}, month = {29 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/hutton26a/hutton26a.pdf}, url = {https://proceedings.mlr.press/v336/hutton26a.html}, abstract = {Online strategic classification studies settings in which agents strategically modify their features to obtain favorable predictions. For example, given a classifier that determines loan approval based on credit scores, applicants may open or close credit cards and bank accounts to obtain a positive prediction. The learning goal is to achieve low mistake or regret bounds despite such strategic behavior. While randomized algorithms have the potential to offer advantages to the learner in strategic settings, they have been largely underexplored. In the realizable setting, no lower bound is known for randomized algorithms, and existing lower bound constructions for deterministic learners can be circumvented by randomization. In the agnostic setting, the best known regret upper bound is $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ due to Ahmadi et al. (2023), which is far from the standard online learning rate of $O(\sqrt{T\log|\mathcal{H}|})$. In this work, we provide refined upper and lower bounds for online strategic classification in both the realizable and agnostic settings; our bounds depend on the Littlestone dimension $\mathrm{Ldim}(\mathcal{H})$ of the hypothesis class $\mathcal{H}$ and the maximum degree $\Delta$ of the manipulation graph. In the realizable setting, using a new construction, we prove a lower bound that, for $T > \mathrm{Ldim}(\mathcal{H}) \Delta^2$, extends the existing deterministic lower bound of $\Omega(\mathrm{Ldim}(\mathcal{H}) \Delta)$ to all algorithms. This is the first lower bound that applies to randomized algorithms, resolving an open question of Ahmadi et al. (2023). We also give the first randomized algorithm that improves on the known deterministic upper bound of $O(\mathrm{Ldim}(\mathcal{H})\cdot\Delta\log\Delta)$, achieving $O(\sqrt{T\cdot\mathrm{Ldim}(\mathcal{H})\log\Delta})$ expected mistakes, in the regime $T < \mathrm{Ldim}(\mathcal{H})\,\Delta^2\log\Delta$. In the agnostic setting, we give an improper randomized algorithm with expected regret $O(\sqrt{T\log|\mathcal{H}|})$ against adaptive adversaries, improving upon the previous $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ bound and matching the standard online learning rate. We also prove that this optimal rate requires improper learning.} }
Endnote
%0 Conference Paper %T On Randomized Algorithms in Online Strategic Classification %A Chase Hutton %A Adam Melrod %A Han Shao %B Proceedings of Thirty Ninth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Steve Hanneke %E Tor Lattimore %F pmlr-v336-hutton26a %I PMLR %P 3635--3665 %U https://proceedings.mlr.press/v336/hutton26a.html %V 336 %X Online strategic classification studies settings in which agents strategically modify their features to obtain favorable predictions. For example, given a classifier that determines loan approval based on credit scores, applicants may open or close credit cards and bank accounts to obtain a positive prediction. The learning goal is to achieve low mistake or regret bounds despite such strategic behavior. While randomized algorithms have the potential to offer advantages to the learner in strategic settings, they have been largely underexplored. In the realizable setting, no lower bound is known for randomized algorithms, and existing lower bound constructions for deterministic learners can be circumvented by randomization. In the agnostic setting, the best known regret upper bound is $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ due to Ahmadi et al. (2023), which is far from the standard online learning rate of $O(\sqrt{T\log|\mathcal{H}|})$. In this work, we provide refined upper and lower bounds for online strategic classification in both the realizable and agnostic settings; our bounds depend on the Littlestone dimension $\mathrm{Ldim}(\mathcal{H})$ of the hypothesis class $\mathcal{H}$ and the maximum degree $\Delta$ of the manipulation graph. In the realizable setting, using a new construction, we prove a lower bound that, for $T > \mathrm{Ldim}(\mathcal{H}) \Delta^2$, extends the existing deterministic lower bound of $\Omega(\mathrm{Ldim}(\mathcal{H}) \Delta)$ to all algorithms. This is the first lower bound that applies to randomized algorithms, resolving an open question of Ahmadi et al. (2023). We also give the first randomized algorithm that improves on the known deterministic upper bound of $O(\mathrm{Ldim}(\mathcal{H})\cdot\Delta\log\Delta)$, achieving $O(\sqrt{T\cdot\mathrm{Ldim}(\mathcal{H})\log\Delta})$ expected mistakes, in the regime $T < \mathrm{Ldim}(\mathcal{H})\,\Delta^2\log\Delta$. In the agnostic setting, we give an improper randomized algorithm with expected regret $O(\sqrt{T\log|\mathcal{H}|})$ against adaptive adversaries, improving upon the previous $O(T^{3/4}\log^{1/4}(T|\mathcal{H}|))$ bound and matching the standard online learning rate. We also prove that this optimal rate requires improper learning.
APA
Hutton, C., Melrod, A. & Shao, H.. (2026). On Randomized Algorithms in Online Strategic Classification. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:3635-3665 Available from https://proceedings.mlr.press/v336/hutton26a.html.

Related Material