Authors
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba
Publication date
2016
Conference
Proceedings of the IEEE conference on computer vision and pattern recognition
Pages
2921-2929
Description
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means for regularizing training, we find that it actually builds a generic localizable deep representation that exposes the implicit attention of CNNs on an image. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014 without training on any bounding box annotation. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
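The description above does not spell out the mechanism, so the following is only a minimal sketch of how global average pooling can yield a localization map, assuming the usual class activation mapping formulation: the class score is a weighted sum of spatially averaged feature maps, and the same weights applied per location give a map M(x, y) = sum_k w_k f_k(x, y). The function names, the toy feature maps, and the weights are all hypothetical and chosen for illustration; this is not the authors' code.

```python
def global_average_pool(feature_maps):
    """Average each feature map (H x W nested lists) down to one scalar."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def class_score(feature_maps, class_weights):
    """Linear classifier on pooled features: sum_k w_k * mean(f_k)."""
    pooled = global_average_pool(feature_maps)
    return sum(w * p for w, p in zip(class_weights, pooled))


def class_activation_map(feature_maps, class_weights):
    """Per-location weighted sum: M(x, y) = sum_k w_k * f_k(x, y)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * fmap[y][x] for w, fmap in zip(class_weights, feature_maps))
             for x in range(width)]
            for y in range(height)]


# Toy input (assumed values): two 2x2 feature maps and weights for one class.
feature_maps = [
    [[0.0, 1.0], [0.0, 3.0]],
    [[2.0, 0.0], [0.0, 0.0]],
]
class_weights = [1.0, -0.5]

cam = class_activation_map(feature_maps, class_weights)
score = class_score(feature_maps, class_weights)
```

In this toy case the class score equals the spatial mean of the map, which is why the locations with the largest map values can be read as the regions that contribute most to the image-level prediction.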
Total citations
2016: 43, 2017: 245, 2018: 570, 2019: 1004, 2020: 1525, 2021: 2057, 2022: 2385, 2023: 2476, 2024: 1444
Scholar articles
B Zhou, A Khosla, A Lapedriza, A Oliva, A Torralba - Proceedings of the IEEE conference on computer …, 2016
B Zhou, A Khosla - A., A. Oliva, and A. Torralba. Learning Deep Features …, 2016
B Zhou, A Khosla - Learning deep features for discriminative localization …, 2015