No-Regret and Incentive-Compatible Online Learning

Rupert Freeman, David Pennock, Chara Podimata, Jennifer Wortman Vaughan
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3270-3279, 2020.

Abstract

We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm’s predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best-fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that each expert’s best strategy is to report his true beliefs about the realization of each event. To achieve this goal, we build on the literature on wagering mechanisms, a type of multi-agent scoring rule. We provide algorithms that achieve no regret and incentive compatibility for myopic experts for both the full and partial information settings. In experiments on datasets from FiveThirtyEight, our algorithms have regret comparable to classic no-regret algorithms, which are not incentive-compatible. Finally, we identify an incentive-compatible algorithm for forward-looking strategic agents that exhibits diminishing regret in practice.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-freeman20a,
  title = {No-Regret and Incentive-Compatible Online Learning},
  author = {Freeman, Rupert and Pennock, David and Podimata, Chara and Vaughan, Jennifer Wortman},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {3270--3279},
  year = {2020},
  editor = {Hal Daumé III and Aarti Singh},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/freeman20a/freeman20a.pdf},
  url = {http://proceedings.mlr.press/v119/freeman20a.html},
  abstract = {We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm’s predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best-fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that each expert’s best strategy is to report his true beliefs about the realization of each event. To achieve this goal, we build on the literature on wagering mechanisms, a type of multi-agent scoring rule. We provide algorithms that achieve no regret and incentive compatibility for myopic experts for both the full and partial information settings. In experiments on datasets from FiveThirtyEight, our algorithms have regret comparable to classic no-regret algorithms, which are not incentive-compatible. Finally, we identify an incentive-compatible algorithm for forward-looking strategic agents that exhibits diminishing regret in practice.}
}
Endnote
%0 Conference Paper
%T No-Regret and Incentive-Compatible Online Learning
%A Rupert Freeman
%A David Pennock
%A Chara Podimata
%A Jennifer Wortman Vaughan
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-freeman20a
%I PMLR
%P 3270--3279
%U http://proceedings.mlr.press/v119/freeman20a.html
%V 119
%X We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm’s predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best-fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that each expert’s best strategy is to report his true beliefs about the realization of each event. To achieve this goal, we build on the literature on wagering mechanisms, a type of multi-agent scoring rule. We provide algorithms that achieve no regret and incentive compatibility for myopic experts for both the full and partial information settings. In experiments on datasets from FiveThirtyEight, our algorithms have regret comparable to classic no-regret algorithms, which are not incentive-compatible. Finally, we identify an incentive-compatible algorithm for forward-looking strategic agents that exhibits diminishing regret in practice.
APA
Freeman, R., Pennock, D., Podimata, C. & Vaughan, J.W. (2020). No-Regret and Incentive-Compatible Online Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3270-3279. Available from http://proceedings.mlr.press/v119/freeman20a.html
