Authors
Jan Gläscher, Nathaniel Daw, Peter Dayan, John P O'Doherty
Publication date
2010/5/27
Journal
Neuron
Volume
66
Issue
4
Pages
585-595
Publisher
Elsevier
Description
Reinforcement learning (RL) uses sequential experience with situations ("states") and outcomes to assess actions. Whereas model-free RL uses this experience directly, in the form of a reward prediction error (RPE), model-based RL uses it indirectly, building a model of the state transition and outcome structure of the environment, and evaluating actions by searching this model. A state prediction error (SPE) plays a central role, reporting discrepancies between the current model and the observed state transitions. Using functional magnetic resonance imaging in humans solving a probabilistic Markov decision task, we found the neural signature of an SPE in the intraparietal sulcus and lateral prefrontal cortex, in addition to the previously well-characterized RPE in the ventral striatum. This finding supports the existence of two unique forms of learning signal in humans, which may form the basis of distinct …
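A minimal sketch (not taken from the paper's methods) of the two learning signals the abstract describes, for a small discrete MDP: a model-free temporal-difference update driven by a reward prediction error (RPE), and a model-based transition-model update driven by a state prediction error (SPE). All variable names, learning rates, and the specific update forms below are illustrative assumptions.

import numpy as np

n_states, n_actions = 3, 2
alpha, eta, gamma = 0.1, 0.1, 0.9   # RPE learning rate, SPE learning rate, discount

Q = np.zeros((n_states, n_actions))                       # model-free action values
T = np.ones((n_states, n_actions, n_states)) / n_states   # learned transition model

def model_free_update(s, a, r, s_next):
    """Temporal-difference update driven by a reward prediction error (RPE)."""
    rpe = r + gamma * Q[s_next].max() - Q[s, a]
    Q[s, a] += alpha * rpe
    return rpe

def model_based_update(s, a, s_next):
    """Transition-model update driven by a state prediction error (SPE):
    the discrepancy between the predicted and the observed successor state."""
    spe = 1.0 - T[s, a, s_next]     # surprise about the state that actually occurred
    T[s, a, :] *= (1.0 - eta)       # scale down predictions for all successors
    T[s, a, s_next] += eta          # shift probability toward the observed successor
    return spe

# Example: one observed transition (state 0, action 1 -> state 2, reward 1)
print(model_free_update(0, 1, 1.0, 2), model_based_update(0, 1, 2))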
Total citations
[Citations-per-year bar chart, 2010–2024; per-year counts not recoverable from text]