Authors
Jan Gläscher, Nathaniel Daw, Peter Dayan, John P O'Doherty
Publication date
2010/5/27
Journal
Neuron
Volume
66
Issue
4
Pages
585-595
Publisher
Elsevier
Description
Reinforcement learning (RL) uses sequential experience with situations ("states") and outcomes to assess actions. Whereas model-free RL uses this experience directly, in the form of a reward prediction error (RPE), model-based RL uses it indirectly, building a model of the state transition and outcome structure of the environment, and evaluating actions by searching this model. A state prediction error (SPE) plays a central role, reporting discrepancies between the current model and the observed state transitions. Using functional magnetic resonance imaging in humans solving a probabilistic Markov decision task, we found the neural signature of an SPE in the intraparietal sulcus and lateral prefrontal cortex, in addition to the previously well-characterized RPE in the ventral striatum. This finding supports the existence of two unique forms of learning signal in humans, which may form the basis of distinct …
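A minimal sketch (not taken from the paper's methods) of the two learning signals the abstract describes, for a small discrete MDP: a model-free temporal-difference update driven by a reward prediction error (RPE), and a model-based transition-model update driven by a state prediction error (SPE). All variable names, learning rates, and the specific update forms below are illustrative assumptions.

import numpy as np

n_states, n_actions = 3, 2
alpha, eta, gamma = 0.1, 0.1, 0.9   # RPE learning rate, SPE learning rate, discount

Q = np.zeros((n_states, n_actions))                       # model-free action values
T = np.ones((n_states, n_actions, n_states)) / n_states   # learned transition model

def model_free_update(s, a, r, s_next):
    """Temporal-difference update driven by a reward prediction error (RPE)."""
    rpe = r + gamma * Q[s_next].max() - Q[s, a]
    Q[s, a] += alpha * rpe
    return rpe

def model_based_update(s, a, s_next):
    """Transition-model update driven by a state prediction error (SPE):
    the discrepancy between the predicted and the observed successor state."""
    spe = 1.0 - T[s, a, s_next]     # surprise about the state that actually occurred
    T[s, a, :] *= (1.0 - eta)       # scale down predictions for all successors
    T[s, a, s_next] += eta          # shift probability toward the observed successor
    return spe

# Example: one observed transition (state 0, action 1 -> state 2, reward 1)
print(model_free_update(0, 1, 1.0, 2), model_based_update(0, 1, 2))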
Total citations
[Citations-per-year bar chart, 2010–2024; per-year counts not recoverable from text]