Decision making with limited feedback

Danielle Ensign; Sorelle Friedler; Scott Neville; Carlos Scheidegger; Suresh Venkatasubramanian

Proceedings of Algorithmic Learning Theory, PMLR 83:359-367, 2018.

Abstract

When models are trained for deployment in decision-making in various real-world settings, they are typically trained in batch mode. Historical data is used to train and validate the models prior to deployment. However, in many settings, feedback changes the nature of the training process. Either the learner does not get full feedback on its actions, or the decisions made by the trained model influence what future training data it will see. In this paper, we focus on the problems of recidivism prediction and predictive policing. We present the first algorithms with provable regret for these problems, by showing that both problems (and others like these) can be abstracted into a general reinforcement learning framework called partial monitoring. We also discuss the policy implications of these solutions.

Cite this Paper

BibTeX


@InProceedings{pmlr-v83-ensign18a,
  title = 	 {Decision making with limited feedback},
  author = 	 {Ensign, Danielle and Friedler, Sorelle and Neville, Scott and Scheidegger, Carlos and Venkatasubramanian, Suresh},
  booktitle = 	 {Proceedings of Algorithmic Learning Theory},
  pages = 	 {359--367},
  year = 	 {2018},
  editor = 	 {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume = 	 {83},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {07--09 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v83/ensign18a/ensign18a.pdf},
  url = 	 {https://proceedings.mlr.press/v83/ensign18a.html},
  abstract = 	 {When models are trained for deployment in decision-making in various real-world settings, they are typically trained in batch mode. Historical data is used to train and validate the models prior to deployment. However, in many settings, \emph{feedback} changes the nature of the training process. Either the learner does not get full feedback on its actions, or the decisions made by the trained model influence what future training data it will see. In this paper, we focus on the problems of recidivism prediction and predictive policing. We present the first algorithms with provable regret for these problems, by showing that both problems (and others like these) can be abstracted into a general reinforcement learning framework called partial monitoring. We also discuss the policy implications of these solutions.}
}

Endnote

%0 Conference Paper
%T Decision making with limited feedback
%A Danielle Ensign
%A Sorelle Friedler
%A Scott Neville
%A Carlos Scheidegger
%A Suresh Venkatasubramanian
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan	
%F pmlr-v83-ensign18a
%I PMLR
%P 359--367
%U https://proceedings.mlr.press/v83/ensign18a.html
%V 83
%X When models are trained for deployment in decision-making in various real-world settings, they are typically trained in batch mode. Historical data is used to train and validate the models prior to deployment. However, in many settings, feedback changes the nature of the training process. Either the learner does not get full feedback on its actions, or the decisions made by the trained model influence what future training data it will see. In this paper, we focus on the problems of recidivism prediction and predictive policing. We present the first algorithms with provable regret for these problems, by showing that both problems (and others like these) can be abstracted into a general reinforcement learning framework called partial monitoring. We also discuss the policy implications of these solutions.

APA


Ensign, D., Friedler, S., Neville, S., Scheidegger, C. & Venkatasubramanian, S. (2018). Decision making with limited feedback. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:359-367. Available from https://proceedings.mlr.press/v83/ensign18a.html.
