Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Authors

Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

Publication date

2016/11/18

Journal

arXiv preprint arXiv:1611.06256

Description

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
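
The "system of queues" mentioned in the description can be illustrated with a minimal sketch. This is not the GA3C code: it assumes a design in which several agent threads send states to one shared prediction queue, a single predictor thread batches whatever requests are waiting and answers each agent through its own reply queue, and agents push experiences to a training queue. All names (`dummy_policy`, `predictor`, `agent`, `run`) are hypothetical, the batched model call is a plain Python stand-in for a GPU forward pass, and the dynamic scheduling strategy is not modeled.

```python
import queue
import threading


def dummy_policy(batch):
    # Hypothetical stand-in for one batched forward pass on the GPU:
    # returns one "action" per state in the batch.
    return [state % 2 for state in batch]


def predictor(prediction_q, reply_qs, max_batch, stop):
    # Collect up to max_batch waiting requests, evaluate them in a
    # single batched call, then route each result back to its agent.
    while not stop.is_set():
        try:
            agent_id, state = prediction_q.get(timeout=0.05)
        except queue.Empty:
            continue
        ids, states = [agent_id], [state]
        while len(ids) < max_batch:
            try:
                agent_id, state = prediction_q.get_nowait()
            except queue.Empty:
                break
            ids.append(agent_id)
            states.append(state)
        for aid, action in zip(ids, dummy_policy(states)):
            reply_qs[aid].put(action)


def agent(agent_id, steps, prediction_q, reply_q, training_q):
    # Toy environment loop: the state is just an integer counter.
    state = agent_id
    for _ in range(steps):
        prediction_q.put((agent_id, state))
        action = reply_q.get()  # block until the predictor answers
        training_q.put((agent_id, state, action))
        state += 1


def run(num_agents=4, steps=5, max_batch=8):
    prediction_q = queue.Queue()
    training_q = queue.Queue()
    reply_qs = [queue.Queue() for _ in range(num_agents)]
    stop = threading.Event()

    pred = threading.Thread(
        target=predictor, args=(prediction_q, reply_qs, max_batch, stop)
    )
    pred.start()
    agents = [
        threading.Thread(
            target=agent, args=(i, steps, prediction_q, reply_qs[i], training_q)
        )
        for i in range(num_agents)
    ]
    for t in agents:
        t.start()
    for t in agents:
        t.join()
    stop.set()
    pred.join()

    # All producers have finished, so the training queue can be drained.
    experiences = []
    while not training_q.empty():
        experiences.append(training_q.get())
    return experiences
```

Calling `run(num_agents=4, steps=5)` returns a list of `(agent_id, state, action)` tuples, one per agent step. In a real system a trainer thread would consume the training queue in batches instead of draining it at the end; the point of the sketch is only that batching requests from many agents lets one model call serve all of them.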

Total citations

Cited by 327

Citations per year: 2017: 12; 2018: 24; 2019: 36; 2020: 40; 2021: 59; 2022: 49; 2023: 68; 2024: 37

Scholar articles

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

M Babaeizadeh, I Frosio, S Tyree, J Clemons, J Kautz - arXiv preprint arXiv:1611.06256, 2016