Full gradient deep reinforcement learning for average-reward criterion

Authors

Tejas Pagare, Vivek Borkar, Konstantin Avrachenkov

Publication date

2023/6/6

Conference

Learning for Dynamics and Control Conference

Pages

235-247

Publisher

PMLR

Description

We extend the provably convergent Full Gradient DQN algorithm for discounted-reward Markov decision processes from Avrachenkov et al. (2021) to average-reward problems. In the neural function approximation setting, we experimentally compare the widely used RVI Q-Learning with the recently proposed Differential Q-Learning, using both Full Gradient DQN and DQN. We also extend this approach to learn Whittle indices for Markovian restless multi-armed bandits. We observe a better convergence rate for the proposed Full Gradient variant across different tasks.
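
For orientation, the two average-reward baselines named in the description can be sketched in tabular form. This is only an illustration of their standard textbook update rules, not the paper's neural Full Gradient method. The function names, the list-of-lists Q-table, and the choice of a fixed reference state-action pair for the RVI offset f(Q) are all assumptions made for this sketch.

```python
def rvi_q_update(Q, s, a, r, s_next, alpha, ref=(0, 0)):
    """One tabular RVI Q-learning step (illustrative sketch).

    The offset f(Q) is assumed here to be the Q-value of a fixed
    reference pair `ref`; other choices of f are possible.
    """
    f_q = Q[ref[0]][ref[1]]
    # Average-reward TD error: no discount factor, offset by f(Q).
    td = r + max(Q[s_next]) - f_q - Q[s][a]
    Q[s][a] += alpha * td
    return td


def differential_q_update(Q, r_bar, s, a, r, s_next, alpha, eta):
    """One tabular Differential Q-learning step (illustrative sketch).

    `r_bar` is a running estimate of the average reward, updated
    from the same TD error with relative step size `eta`.
    """
    td = r - r_bar + max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * td
    r_bar += eta * alpha * td
    return r_bar, td


# Hypothetical two-state, two-action example with one observed transition.
Q_rvi = [[0.0, 0.0], [0.0, 0.0]]
td_rvi = rvi_q_update(Q_rvi, s=0, a=1, r=1.0, s_next=1, alpha=0.5)

Q_diff = [[0.0, 0.0], [0.0, 0.0]]
r_bar, td_diff = differential_q_update(
    Q_diff, r_bar=0.0, s=0, a=1, r=1.0, s_next=1, alpha=0.5, eta=0.1
)
```

The two rules differ only in how the average reward is removed from the TD error: RVI subtracts a function of the current Q-table, while Differential Q-learning tracks a separate scalar estimate.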

Total citations

Cited by 3

Citations per year: 2023: 1; 2024: 2
