Authors
Shimon Whiteson, Brian Tanner, Matthew E. Taylor, Peter Stone
Publication date
2011/4/11
Conference
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Pages
120-127
Publisher
IEEE
Description
Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method.
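The following is a minimal sketch of the generalized evaluation idea the abstract describes: agents are compared across many environments sampled from a distribution, and per-environment ranks are averaged so that scores on incommensurate scales can still be summarized. The helper names (sample_environment, evaluate) are hypothetical placeholders under assumed signatures, not the paper's API or code.

import random
from typing import Callable, List, Sequence


def generalized_evaluation(
    agents: Sequence[object],
    sample_environment: Callable[[random.Random], object],
    evaluate: Callable[[object, object], float],
    num_envs: int = 30,
    seed: int = 0,
) -> List[float]:
    """Return each agent's mean rank (1 = best) over sampled environments.

    Evaluating on many sampled environments, rather than one fixed test
    environment, discourages environment overfitting; ranking within each
    environment sidesteps raw scores whose values are not commensurate.
    """
    rng = random.Random(seed)
    rank_sums = [0.0] * len(agents)
    for _ in range(num_envs):
        # One draw from the environment distribution.
        env = sample_environment(rng)
        # evaluate() is assumed to reset env between agents.
        scores = [evaluate(agent, env) for agent in agents]
        # Rank agents within this environment; ranks are comparable
        # across environments even when raw scores are not.
        order = sorted(range(len(agents)), key=lambda i: scores[i], reverse=True)
        for rank, agent_idx in enumerate(order, start=1):
            rank_sums[agent_idx] += rank
    return [total / num_envs for total in rank_sums]

Averaging ranks is just one simple summary choice made for this sketch; the paper considers how to summarize such scores more generally.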