Authors
David B Brown, James E Smith
Publication date
2013/6
Journal
Operations Research
Volume
61
Issue
3
Pages
644-665
Publisher
INFORMS
Description
This paper was motivated by the problem of developing an optimal policy for exploring an oil and gas field in the North Sea. Where should we drill first? Where do we drill next? In this and many other problems, we face a trade-off between earning (e.g., drilling immediately at the sites with maximal expected values) and learning (e.g., drilling at sites that provide valuable information) that may lead to greater earnings in the future. These “sequential exploration problems” resemble a multiarmed bandit problem, but probabilistic dependence plays a key role: outcomes at drilled sites reveal information about neighboring targets. Good exploration policies will take advantage of this information as it is revealed. We develop heuristic policies for sequential exploration problems and complement these heuristics with upper bounds on the performance of an optimal policy. We begin by grouping the targets into clusters of …
Total citations
[Chart: citations per year, 2013–2024]