Authors
David B Brown, James E Smith, Peng Sun
Publication date
2010/8
Journal
Operations Research
Volume
58
Issue
4-part-1
Pages
785-801
Publisher
INFORMS
Description
We describe a general technique for determining upper bounds on maximal values (or lower bounds on minimal costs) in stochastic dynamic programs. In this approach, we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and impose a “penalty” that punishes violations of nonanticipativity. In applications, the hope is that this relaxed version of the problem will be simpler to solve than the original dynamic program. The upper bounds provided by this dual approach complement lower bounds on values that may be found by simulating with heuristic policies. We describe the theory underlying this dual approach and establish weak duality, strong duality, and complementary slackness results that are analogous to the duality results of linear programming. We also study properties of good penalties. Finally, we demonstrate the use of …
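The bounding idea in the description can be illustrated on a toy problem. The sketch below is not taken from the paper: the three-step coin-flip random walk, the stopping payoff `max(x, 0)`, the threshold heuristic, and all function names are hypothetical choices made for illustration. It enumerates every sample path, so no simulation noise is involved. It compares (a) a lower bound from a heuristic stopping rule, (b) the exact dynamic programming value, (c) an upper bound from the perfect-information relaxation with zero penalty, and (d) an upper bound using a penalty built from martingale differences of the exact value function, which is assumed here as one example of a good penalty.

```python
from itertools import product

T = 3               # horizon of the toy problem (hypothetical choice)
STEPS = (+1, -1)    # equally likely increments of the random walk


def payoff(x):
    """Reward for stopping when the walk is at position x (toy choice)."""
    return max(x, 0)


def value(t, x):
    """Exact DP value at time t and position x under nonanticipative stopping."""
    if t == T:
        return payoff(x)
    cont = sum(value(t + 1, x + s) for s in STEPS) / len(STEPS)
    return max(payoff(x), cont)


def positions(steps):
    """Positions X_0, ..., X_T of the walk for one sequence of increments."""
    xs = [0]
    for s in steps:
        xs.append(xs[-1] + s)
    return xs


# All 2**T equally likely sample paths.
PATHS = [positions(p) for p in product(STEPS, repeat=T)]


def heuristic_lower_bound():
    """Expected reward of a feasible rule: stop the first time X_t >= 1, else at T."""
    total = 0.0
    for xs in PATHS:
        tau = next((t for t, x in enumerate(xs) if x >= 1), T)
        total += payoff(xs[tau])
    return total / len(PATHS)


def dual_upper_bound(use_penalty):
    """Expected value of the relaxed problem: the stopping time is chosen with
    the whole path known, and the reward is reduced by a penalty.

    With use_penalty=False the penalty is zero. With use_penalty=True the
    penalty for stopping at t is the sum over s = 1..t of
    V_s(X_s) - E[V_s(X_s) | X_{s-1}], which has zero mean under any
    nonanticipative stopping rule.
    """
    total = 0.0
    for xs in PATHS:
        best = float("-inf")
        penalty = 0.0
        for t, x in enumerate(xs):
            if use_penalty and t > 0:
                expected = sum(value(t, xs[t - 1] + s) for s in STEPS) / len(STEPS)
                penalty += value(t, x) - expected
            best = max(best, payoff(x) - penalty)
        total += best
    return total / len(PATHS)


if __name__ == "__main__":
    print("heuristic lower bound :", heuristic_lower_bound())
    print("exact DP value        :", value(0, 0))
    print("dual bound, no penalty:", dual_upper_bound(False))
    print("dual bound, V penalty :", dual_upper_bound(True))
```

In this toy setting the heuristic value sits below the exact value and the zero-penalty bound sits above it, which is the weak duality ordering. The value-function penalty closes the gap entirely, mirroring strong duality. In practice the exact value function is unavailable, so a penalty would be built from an approximation and the resulting bound would generally not be tight.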
Total citations
[Bar chart of citations per year, 2010 to 2024]