Bidirectional relationship inferring network for referring image localization and segmentation

Authors

Guang Feng, Zhiwei Hu, Lihe Zhang, Jiayu Sun, Huchuan Lu

Publication date

2021/9/1

Journal

IEEE Transactions on Neural Networks and Learning Systems

Volume

Issue

Pages

2246-2258

Publisher

IEEE

Description

Referring image localization and segmentation has recently attracted widespread interest. However, existing methods lack a clear description of the interdependence between language and vision. To this end, we present a bidirectional relationship inferring network (BRINet) to address these challenging tasks. Specifically, we first employ a vision-guided linguistic attention module to perceive the keywords corresponding to each image region. Then, a language-guided visual attention module uses the learned adaptive language to guide the update of the visual features. Together, they form a bidirectional cross-modal attention module (BCAM) that achieves mutual guidance between language and vision and helps the network better align the cross-modal features. Based on the vanilla language-guided visual attention, we further design an asymmetric language-guided visual attention, which significantly …
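The two attention steps named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the scaled dot-product scoring, the residual update, the absence of learned projections, and all function and variable names are assumptions made for illustration only, and the asymmetric variant mentioned in the description is not covered.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    # Convex combination of equally sized vectors.
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def bidirectional_cross_modal_attention(regions, words):
    """Hypothetical sketch. regions: N visual vectors, words: T word vectors,
    all of the same dimension d (an assumption; no projection layers here)."""
    scale = math.sqrt(len(words[0]))

    # Step 1, vision-guided linguistic attention (assumed dot-product form):
    # each region scores every word and builds its own "adaptive language".
    word_weights = [softmax([dot(r, w) / scale for w in words]) for r in regions]
    adaptive_language = [weighted_sum(ws, words) for ws in word_weights]

    # Step 2, language-guided visual attention (assumed form): the adaptive
    # language of each region attends over all regions, and the gathered
    # context is added back to the region feature as a residual.
    updated = []
    for r, lang in zip(regions, adaptive_language):
        region_weights = softmax([dot(lang, other) / scale for other in regions])
        context = weighted_sum(region_weights, regions)
        updated.append([x + c for x, c in zip(r, context)])
    return updated, word_weights

# Toy input: three 2-D region features and two 2-D word features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
updated, word_weights = bidirectional_cross_modal_attention(regions, words)
```

In this toy input, the first region is aligned with the first word, so its attention weight on that word is the larger of the two; the third region is equally similar to both words and weights them equally.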

Total citations

Cited by 14

2022: 4, 2023: 5, 2024: 5

Scholar articles

Bidirectional relationship inferring network for referring image localization and segmentation

G Feng, Z Hu, L Zhang, J Sun, H Lu - IEEE Transactions on Neural Networks and Learning …, 2021