Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits

Xinming Liu, Joseph Halpern
Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR 124:1298-1307, 2020.

Abstract

While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.

Cite this Paper


BibTeX
@InProceedings{pmlr-v124-liu20c,
  title = {Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits},
  author = {Liu, Xinming and Halpern, Joseph},
  booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)},
  pages = {1298--1307},
  year = {2020},
  editor = {Peters, Jonas and Sontag, David},
  volume = {124},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v124/liu20c/liu20c.pdf},
  url = {https://proceedings.mlr.press/v124/liu20c.html},
  abstract = {While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.}
}
Endnote
%0 Conference Paper
%T Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits
%A Xinming Liu
%A Joseph Halpern
%B Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
%C Proceedings of Machine Learning Research
%D 2020
%E Jonas Peters
%E David Sontag
%F pmlr-v124-liu20c
%I PMLR
%P 1298--1307
%U https://proceedings.mlr.press/v124/liu20c.html
%V 124
%X While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.
APA
Liu, X. & Halpern, J. (2020). Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits. Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), in Proceedings of Machine Learning Research 124:1298-1307. Available from https://proceedings.mlr.press/v124/liu20c.html.
