Authors
Shaochuan Zhao, Tianyang Xu, Xiao-Jun Wu, Josef Kittler
Publication date
2023/11/30
Journal
International Journal of Computer Vision
Pages
1-14
Publisher
Springer US
Description
The robustness of visual object tracking is reflected not only in the accuracy of the target localisation in every single frame, but also in the smoothness of the predicted motion of the tracked object across consecutive frames. From the perspective of appearance modelling, the success of the state-of-the-art Transformer-based trackers derives from their ability to adaptively associate the representations of related spatial regions. However, the absence of attention in the channel dimension hinders the realisation of their potential tracking capacity. To cope with the commonly occurring misalignment of the spatial scale between the template and a search patch, we propose a novel cross-channel correlation mechanism. Accordingly, the relevance of multi-channel features in the channel Transformer is modelled using two different sources of information. The result is a novel spatial-channel Transformer, which integrates …