Authors
François Cinotti, Virginie Fresno, Nassim Aklil, Etienne Coutureau, Benoît Girard, Alain R Marchand, Mehdi Khamassi
Publication date
2019/5/1
Journal
Scientific Reports
Volume
9
Issue
1
Pages
6770
Publisher
Nature Publishing Group UK
Description
In a volatile environment where rewards are uncertain, successful performance requires a delicate balance between exploitation of the best option and exploration of alternative choices. It has theoretically been proposed that dopamine contributes to the control of this exploration-exploitation trade-off, specifically that the higher the level of tonic dopamine, the more exploitation is favored. We demonstrate here that there is a formal relationship between the rescaling of dopamine positive reward prediction errors and the exploration-exploitation trade-off in simple non-stationary multi-armed bandit tasks. We further show in rats performing such a task that systemically antagonizing dopamine receptors greatly increases the number of random choices without affecting learning capacities. Simulations and comparison of a set of different computational models (an extended Q-learning model, a directed exploration model …
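As a minimal sketch of how rescaling the learning signal can be formally tied to the exploration-exploitation trade-off, the Python example below runs a plain delta-rule Q-learner with softmax choice on a three-armed bandit whose best arm switches halfway through. It is an illustration under stated assumptions, not the authors' model: the gain is applied to the reward rather than to positive prediction errors only, Q-values start at zero, and the names `scale`, `demo`, and all parameter values are hypothetical. Under these assumptions, multiplying the reward by `scale` multiplies every Q-value by `scale`, so the choice probabilities match those of an unscaled learner whose inverse temperature is `beta * scale`.

```python
import math
import random


def softmax(values, beta):
    """Softmax choice probabilities with inverse temperature beta."""
    scaled = [beta * v for v in values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def q_update(q, arm, reward, alpha, scale=1.0):
    """Delta-rule update; `scale` is a hypothetical gain on the reward."""
    delta = scale * reward - q[arm]  # reward prediction error
    q[arm] += alpha * delta


def demo(n_trials=200, alpha=0.1, beta=3.0, scale=0.5, seed=0):
    """Return the largest gap between the two sets of choice probabilities."""
    rng = random.Random(seed)
    q_ref = [0.0, 0.0, 0.0]     # learner with unscaled rewards
    q_scaled = [0.0, 0.0, 0.0]  # learner with rewards multiplied by `scale`
    p_reward = [0.8, 0.2, 0.2]
    max_gap = 0.0
    for t in range(n_trials):
        if t == n_trials // 2:
            p_reward = [0.2, 0.8, 0.2]  # the best arm changes (non-stationary task)
        # Both learners are fed the same choices and outcomes.
        arm = rng.choices(range(3), weights=softmax(q_ref, beta))[0]
        reward = 1.0 if rng.random() < p_reward[arm] else 0.0
        q_update(q_ref, arm, reward, alpha)
        q_update(q_scaled, arm, reward, alpha, scale)
        # Scaled rewards at beta versus unscaled rewards at beta * scale.
        p_a = softmax(q_scaled, beta)
        p_b = softmax(q_ref, beta * scale)
        max_gap = max(max_gap, max(abs(a - b) for a, b in zip(p_a, p_b)))
    return max_gap


if __name__ == "__main__":
    print(demo())  # gap is at floating-point rounding level
```

In this simplified setting, a gain below 1 acts like a lower inverse temperature, that is, flatter choice probabilities and more random choices, while the learning rate `alpha` is left unchanged.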
Total citations
[Chart: citations per year, 2017 to 2024]
Scholar articles
F Cinotti, V Fresno, N Aklil, E Coutureau, B Girard… - Scientific Reports, 2019
F Cinotti, V Fresno, N Aklil, E Coutureau, B Girard… - bioRxiv, 2018