Spatio-temporal attention-based LSTM networks for 3D action recognition and detection

Authors

Sijie Song, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jiaying Liu

Publication date

2018/3/22

Journal

IEEE Transactions on Image Processing

Pages

3459-3471

Publisher

IEEE

Description

Human action analytics has attracted attention in computer vision for decades. Extracting discriminative spatio-temporal features is important for modeling the spatial and temporal evolution of different actions. In this paper, we propose a spatial and temporal attention model that explores discriminative spatial and temporal features for human action recognition and detection from skeleton data. We build our networks on recurrent neural networks with long short-term memory (LSTM) units. The learned model can selectively focus on discriminative joints of the skeleton within each input frame and pay different levels of attention to the outputs of different frames. To ensure effective training of the network for action recognition, we propose a regularized cross-entropy loss to drive the learning process and develop a joint training strategy accordingly. Moreover, based on temporal attention, we develop …
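
As a rough illustration of the two attention ideas named in the description (weighting joints within a frame, and weighting the outputs of different frames), here is a minimal Python sketch. It is not the authors' implementation. The function names, the use of softmax normalization, and the toy inputs are all assumptions made for illustration, and the attention scores are passed in by hand, whereas a trained model would produce them with learned subnetworks.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_spatial_attention(joints, joint_scores):
    """Scale each joint's coordinate vector by its attention weight.

    Assumption: one scalar score per joint, normalized with softmax.
    """
    weights = softmax(joint_scores)
    return [[w * c for c in joint] for w, joint in zip(weights, joints)]


def temporal_attention_pool(frame_outputs, frame_scores):
    """Combine per-frame class scores into one vector, weighted per frame.

    Assumption: one scalar score per frame, normalized with softmax.
    """
    weights = softmax(frame_scores)
    num_classes = len(frame_outputs[0])
    return [
        sum(w * out[k] for w, out in zip(weights, frame_outputs))
        for k in range(num_classes)
    ]


# Hypothetical toy input: 3 joints with (x, y, z) coordinates in one frame.
joints = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weighted_joints = apply_spatial_attention(joints, joint_scores=[0.0, 0.0, 0.0])

# Hypothetical per-frame class scores for 2 frames and 2 classes.
frame_outputs = [[1.0, 0.0], [0.0, 1.0]]
pooled = temporal_attention_pool(frame_outputs, frame_scores=[0.0, 0.0])
```

With equal scores, every joint and every frame receives the same weight. Raising one joint's or one frame's score shifts weight toward it, which is the selective focus the description refers to.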

Total citations

Cited by 244

Citations per year: 2018: 6; 2019: 24; 2020: 40; 2021: 45; 2022: 53; 2023: 50; 2024: 24
