Bayesian incentive-compatible bandit exploration

Authors

Yishay Mansour, Aleksandrs Slivkins, Vasilis Syrgkanis

Publication date

2020/7

Journal

Operations Research

Pages

1132-1161

Publisher

INFORMS

Description

As self-interested individuals (“agents”) make decisions over time, they utilize information revealed by other agents in the past and produce information that may help agents in the future. This phenomenon is common in a wide range of scenarios in the Internet economy, as well as in medical decisions. Each agent would like to exploit: select the best action given the current information, but would prefer the previous agents to explore: try out various alternatives to collect information. A social planner, by means of a carefully designed recommendation policy, can incentivize the agents to balance the exploration and exploitation so as to maximize social welfare. We model the planner’s recommendation policy as a multiarm bandit algorithm under incentive-compatibility constraints induced by agents’ Bayesian priors. We design a bandit algorithm which is incentive-compatible and has asymptotically optimal …

Total citations

Cited by 159

Citations per year: 2015: 2, 2016: 4, 2017: 10, 2018: 20, 2019: 15, 2020: 16, 2021: 16, 2022: 25, 2023: 24, 2024: 27

Scholar articles

Bayesian incentive-compatible bandit exploration

Y Mansour, A Slivkins, V Syrgkanis - Proceedings of the Sixteenth ACM Conference on …, 2015

Bayesian incentive-compatible bandit exploration

Y Mansour, A Slivkins, V Syrgkanis - Operations Research, 2020