Mega-reward: Achieving human-level play without extrinsic rewards

Authors

Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Shangtong Zhang, Andrzej Wojcicki, Mai Xu

Publication date

2020/4/3

Journal

Proceedings of the AAAI Conference on Artificial Intelligence

Volume

Issue

Pages

5826-5833

Description

Intrinsic rewards were introduced to simulate how human intelligence works; they are usually evaluated by intrinsically-motivated play, i.e., playing games without extrinsic rewards but evaluated with extrinsic rewards. However, none of the existing intrinsic reward approaches can achieve human-level performance under this very challenging setting of intrinsically-motivated play. In this work, we propose a novel megalomania-driven intrinsic reward (called mega-reward), which, to our knowledge, is the first approach that achieves human-level performance in intrinsically-motivated play. Intuitively, mega-reward comes from the observation that infants' intelligence develops when they try to gain more control over entities in an environment; therefore, mega-reward aims to maximize the control capabilities of agents over given entities in a given environment. To formalize mega-reward, a relational transition model is proposed to bridge the gap between direct and latent control. Experimental studies show that mega-reward (i) can greatly outperform all state-of-the-art intrinsic reward approaches, (ii) generally achieves the same level of performance as Ex-PPO and professional human-level scores, and (iii) also achieves superior performance when it is combined with extrinsic rewards.

Total citations

Cited by 14

Citations per year: 2019: 1, 2020: 3, 2021: 5, 2022: 1, 2023: 3, 2024: 1
