Authors
Ashwinkumar Badanidiyuru, Robert Kleinberg, Aleksandrs Slivkins
Publication date
2018/3/1
Journal
Journal of the ACM (JACM)
Volume
65
Issue
3
Pages
13
Publisher
ACM
Description
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains, the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called bandits with knapsacks, that combines bandit learning with aspects of stochastic integer programming. In particular, a bandit algorithm needs to solve a stochastic version of the well-known knapsack problem, which is concerned with packing items into a limited-size knapsack. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a …
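The model described in the abstract, a bandit learner that must respect a supply or budget limit in addition to the time horizon, can be illustrated with a small simulation. The sketch below is not the paper's algorithm; it is a simple UCB-style policy on reward-per-unit-cost with a single budget, and all names and parameters (`arms`, `budget`, `horizon`) are illustrative assumptions.

```python
import math
import random

def run_bandit_with_knapsack(arms, budget, horizon, seed=0):
    """Simulate a toy bandits-with-knapsacks instance.

    arms: list of (reward_prob, cost) pairs. Pulling arm i yields a
    Bernoulli(reward_prob) reward and consumes `cost` units of the
    single budget. This is an illustrative policy, not the algorithm
    from the paper: it picks the arm with the highest optimistic
    reward-per-cost index and stops when the budget or horizon runs out.
    """
    rng = random.Random(seed)
    n = len(arms)
    pulls = [0] * n
    reward_sum = [0.0] * n
    total_reward = 0.0
    remaining = budget

    for t in range(1, horizon + 1):
        def index(i):
            if pulls[i] == 0:
                return float("inf")  # force one initial pull of each arm
            mean = reward_sum[i] / pulls[i]
            bonus = math.sqrt(2 * math.log(t) / pulls[i])
            return (mean + bonus) / arms[i][1]  # optimistic reward per unit cost

        i = max(range(n), key=index)
        if arms[i][1] > remaining:
            break  # chosen arm is unaffordable; stop (a real policy would re-plan)
        reward = 1.0 if rng.random() < arms[i][0] else 0.0
        pulls[i] += 1
        reward_sum[i] += reward
        total_reward += reward
        remaining -= arms[i][1]

    return total_reward, remaining
```

The point of the sketch is the distinctive feature the abstract mentions: unlike classical regret minimization, play ends when a resource other than time is exhausted, so the policy must trade off reward against consumption.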
Total citations
[Citations-per-year chart, 2013–2024]
Scholar articles
A Badanidiyuru, R Kleinberg, A Slivkins - Journal of the ACM (JACM), 2018
A Badanidiyuru, R Kleinberg, A Slivkins - The 3rd Workshop on Social Computing and User …, 2013