Regret Minimization with Performative Feedback

Meena Jagadeesan, Tijana Zrnic, Celestine Mendler-Dünner
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:9760-9785, 2022.

Abstract

In performative prediction, the deployment of a predictive model triggers a shift in the data distribution. As these shifts are typically unknown ahead of time, the learner needs to deploy a model to get feedback about the distribution it induces. We study the problem of finding near-optimal models under performativity while maintaining low regret. On the surface, this problem might seem equivalent to a bandit problem. However, it exhibits a fundamentally richer feedback structure that we refer to as performative feedback: after every deployment, the learner receives samples from the shifted distribution rather than bandit feedback about the reward. Our main contribution is regret bounds that scale only with the complexity of the distribution shifts and not that of the reward function. The key algorithmic idea is careful exploration of the distribution shifts that informs a novel construction of confidence bounds on the risk of unexplored models. The construction only relies on smoothness of the shifts and does not assume convexity. More broadly, our work establishes a conceptual approach for leveraging tools from the bandits literature for the purpose of regret minimization with performative feedback.
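
To make the feedback structure concrete, the following is a minimal Python sketch (not the paper's algorithm) of an optimistic deployment loop that uses performative feedback: after each deployment the learner observes samples from the induced distribution and, relying on an assumed smoothness constant of the distribution map and an assumed Lipschitz constant of the loss in the data, forms confidence bounds on the risk of models it has never deployed. The names deploy, loss, eps_map, and lip_loss are placeholders supplied by the user, not quantities defined in the paper.

import numpy as np

def lcb_performative_risk(theta, history, loss, eps_map, lip_loss):
    """Lower confidence bound on the performative risk
    PR(theta) = E_{z ~ D(theta)} loss(theta, z), built only from samples of
    previously induced distributions. For a past deployment (theta_i, S_i),
    smoothness of the map theta -> D(theta) gives (ignoring sampling error)
    |PR(theta) - mean_{z in S_i} loss(theta, z)| <= lip_loss * eps_map * ||theta - theta_i||.
    """
    bounds = []
    for theta_i, samples_i in history:
        est = np.mean([loss(theta, z) for z in samples_i])
        slack = lip_loss * eps_map * np.linalg.norm(theta - theta_i)
        bounds.append(est - slack)
    return max(bounds) if bounds else -np.inf  # no information yet

def optimistic_deployment_loop(candidates, deploy, loss, eps_map, lip_loss,
                               rounds, n_samples=100):
    """Toy loop: repeatedly deploy the candidate model with the smallest lower
    confidence bound on its performative risk, then record the samples it induces."""
    history = []
    for _ in range(rounds):
        scores = [lcb_performative_risk(th, history, loss, eps_map, lip_loss)
                  for th in candidates]
        theta_t = candidates[int(np.argmin(scores))]
        samples = deploy(theta_t, n_samples)  # performative feedback: z ~ D(theta_t)
        history.append((theta_t, samples))
    return history

The point of the sketch is that the confidence bound tightens with the distance between a candidate model and previously deployed ones, so the cost of exploration is governed by the geometry of the distribution shifts rather than by the complexity of the reward function.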

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-jagadeesan22a,
  title     = {Regret Minimization with Performative Feedback},
  author    = {Jagadeesan, Meena and Zrnic, Tijana and Mendler-D{\"u}nner, Celestine},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {9760--9785},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/jagadeesan22a/jagadeesan22a.pdf},
  url       = {https://proceedings.mlr.press/v162/jagadeesan22a.html},
  abstract  = {In performative prediction, the deployment of a predictive model triggers a shift in the data distribution. As these shifts are typically unknown ahead of time, the learner needs to deploy a model to get feedback about the distribution it induces. We study the problem of finding near-optimal models under performativity while maintaining low regret. On the surface, this problem might seem equivalent to a bandit problem. However, it exhibits a fundamentally richer feedback structure that we refer to as performative feedback: after every deployment, the learner receives samples from the shifted distribution rather than bandit feedback about the reward. Our main contribution is regret bounds that scale only with the complexity of the distribution shifts and not that of the reward function. The key algorithmic idea is careful exploration of the distribution shifts that informs a novel construction of confidence bounds on the risk of unexplored models. The construction only relies on smoothness of the shifts and does not assume convexity. More broadly, our work establishes a conceptual approach for leveraging tools from the bandits literature for the purpose of regret minimization with performative feedback.}
}
Endnote
%0 Conference Paper
%T Regret Minimization with Performative Feedback
%A Meena Jagadeesan
%A Tijana Zrnic
%A Celestine Mendler-Dünner
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-jagadeesan22a
%I PMLR
%P 9760--9785
%U https://proceedings.mlr.press/v162/jagadeesan22a.html
%V 162
%X In performative prediction, the deployment of a predictive model triggers a shift in the data distribution. As these shifts are typically unknown ahead of time, the learner needs to deploy a model to get feedback about the distribution it induces. We study the problem of finding near-optimal models under performativity while maintaining low regret. On the surface, this problem might seem equivalent to a bandit problem. However, it exhibits a fundamentally richer feedback structure that we refer to as performative feedback: after every deployment, the learner receives samples from the shifted distribution rather than bandit feedback about the reward. Our main contribution is regret bounds that scale only with the complexity of the distribution shifts and not that of the reward function. The key algorithmic idea is careful exploration of the distribution shifts that informs a novel construction of confidence bounds on the risk of unexplored models. The construction only relies on smoothness of the shifts and does not assume convexity. More broadly, our work establishes a conceptual approach for leveraging tools from the bandits literature for the purpose of regret minimization with performative feedback.
APA
Jagadeesan, M., Zrnic, T., & Mendler-Dünner, C. (2022). Regret Minimization with Performative Feedback. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:9760-9785. Available from https://proceedings.mlr.press/v162/jagadeesan22a.html.
