Reinforcement Learning With Reward Machines in Stochastic Games

Authors

Jueming Hu, Jean-Raphaël Gaglione, Yanze Wang, Zhe Xu, Ufuk Topcu, Yongming Liu

Publication date

2023

Conference

ECAI 2023

Pages

1068-1075

Description

We investigate multi-agent reinforcement learning for stochastic games with complex tasks, where the reward functions are non-Markovian. We utilize reward machines to incorporate high-level knowledge of complex tasks. We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent. In QRM-SG, we define the Q-function at a Nash equilibrium in augmented state space. The augmented state space integrates the state of the stochastic game and the state of reward machines. Each agent learns the Q-functions of all agents in the system. We prove that Q-functions learned in QRM-SG converge to the Q-functions at a Nash equilibrium if the stage game at each time step during learning has a global optimum point or a saddle point, and the agents update Q-functions based on the best-response strategy at this …
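To make the augmented state space concrete, here is a minimal Python sketch of a reward machine and of pairing a game state with the agents' reward-machine states. It is an illustration under stated assumptions, not the QRM-SG algorithm: the `RewardMachine` class, the `augmented_state` and `run` helpers, the event labels "A" and "B", and the self-loop default are all hypothetical, and the Nash-equilibrium Q-update is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Toy reward machine: a finite automaton over high-level event labels."""
    initial_state: int
    # Maps (rm_state, label) -> (next_rm_state, reward).
    transitions: dict = field(default_factory=dict)

    def step(self, rm_state, label):
        # Assumption: unlisted (state, label) pairs self-loop with zero reward.
        return self.transitions.get((rm_state, label), (rm_state, 0.0))


def augmented_state(game_state, rm_states):
    """Pair the game state with every agent's reward-machine state."""
    return (game_state, tuple(rm_states))


# Hypothetical task: "observe A, then observe B" pays 1.0 on completion.
rm = RewardMachine(
    initial_state=0,
    transitions={
        (0, "A"): (1, 0.0),
        (1, "B"): (2, 1.0),
    },
)


def run(labels):
    """Feed a sequence of labels to the machine; return final state and total reward."""
    u, total = rm.initial_state, 0.0
    for label in labels:
        u, r = rm.step(u, label)
        total += r
    return u, total
```

In this sketch the label "B" earns a reward only after "A" has been seen, so the reward is non-Markovian in the game state alone but becomes Markovian once the reward-machine state is part of the state. A Q-table over the augmented space could then be keyed by `augmented_state(s, rm_states)` together with a joint action.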
