A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition

Authors

Xinhua Cheng, Mengxi Jia, Qian Wang, Jian Zhang

Publication date

2022/5/26

Journal

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

Publisher

IEEE

Description

Pedestrian attribute recognition (PAR), which aims to identify attributes of the pedestrians captured in video surveillance, is a challenging task due to the poor quality of images and diverse spatial distribution among attributes. Existing methods usually model PAR as a multi-label classification problem and manually map attributes to an ordered list corresponding to the outputs of classifiers or sequential models. However, the inherent textual information among attribute annotations is largely neglected in these visual-only methods. In this paper, we first alleviate this issue by proposing a novel visual-textual baseline (VTB) for PAR which introduces an additional textual modality to explore the textual semantic correlations from attribute annotations by pre-trained textual encoders instead of human definitions. VTB encodes pedestrian images and attribute annotations into visual and textual features respectively, interacts …
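
The multi-label formulation that the description attributes to existing methods can be sketched as follows. This is a minimal illustration of that generic setup, not the VTB method itself: the attribute names, the logits, and the 0.5 threshold are all hypothetical, and the logits stand in for whatever a trained classifier would output for one pedestrian image.

```python
import math

# Hypothetical ordered attribute list; each PAR dataset defines its own.
ATTRIBUTES = ["female", "backpack", "hat", "long sleeves"]


def sigmoid(x: float) -> float:
    """Squash a raw score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_attributes(logits, threshold=0.5):
    """Return the attributes whose independent probability meets the threshold.

    Position i of `logits` corresponds to ATTRIBUTES[i], which is the manual
    attribute-to-output mapping the description refers to.
    """
    if len(logits) != len(ATTRIBUTES):
        raise ValueError("expected one logit per attribute")
    return [name for name, z in zip(ATTRIBUTES, logits) if sigmoid(z) >= threshold]


def binary_cross_entropy(logits, labels):
    """Mean per-attribute binary cross-entropy, a common multi-label loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)
```

Each attribute is scored independently here, so nothing in the sketch uses the meaning of the attribute names. That is the gap the description says a textual modality is meant to address.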

Total citations

Cited by 29

2022: 2, 2023: 13, 2024: 14
