Authors
Zihan Ding, Yanhua Huang, Hang Yuan, Hao Dong
Publication date
2020
Journal
Deep reinforcement learning: fundamentals, research and applications
Pages
47-123
Publisher
Springer Singapore
Description
In this chapter, we introduce the fundamentals of classical reinforcement learning and provide a general overview of deep reinforcement learning. We start with the basic definitions and concepts of reinforcement learning, including the agent, environment, action, and state, as well as the reward function. Then, we describe a classical reinforcement learning problem, the bandit problem, to give readers a basic understanding of the underlying mechanism of traditional reinforcement learning. Next, we introduce the Markov process, together with the Markov reward process and the Markov decision process. These notions are the cornerstones of formulating reinforcement learning tasks. Combining the Markov reward process with value function estimation produces the core results used in most reinforcement learning methods: the Bellman equations. The optimal value functions and optimal policy …