Authors
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
Publication date
2015/5/25
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
38
Issue
1
Pages
142-158
Publisher
IEEE
Description
Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012, achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model …
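The "more than 50 percent relative" claim is a ratio between two mAP values, not a gain of 50 percentage points. The short sketch below works through that arithmetic. The description does not state the previous best mAP, so the code only derives the ceiling implied by the two quoted figures (62.4 percent and a relative gain above 50 percent). The function names are illustrative and do not come from the paper.

```python
def relative_gain(new_map: float, old_map: float) -> float:
    """Relative improvement of new_map over old_map, as a fraction."""
    return (new_map - old_map) / old_map


def implied_baseline_ceiling(new_map: float, min_relative_gain: float) -> float:
    """Largest previous-best mAP consistent with a relative gain of at least
    min_relative_gain. With a strictly larger gain, the true previous best
    lies strictly below this value."""
    return new_map / (1.0 + min_relative_gain)


# Figures quoted in the description: 62.4 percent mAP, relative gain > 50 percent.
ceiling = implied_baseline_ceiling(62.4, 0.50)
print(f"previous best mAP must be below about {ceiling:.1f} percent")

# Sanity check: a baseline exactly at the ceiling gives exactly a 50 percent gain.
print(f"relative gain at the ceiling: {relative_gain(62.4, ceiling):.2f}")
```

Read this way, the description implies a previous best result below roughly 41.6 percent mAP, that is, an absolute gain of more than about 20 points.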
Total citations
2015: 14, 2016: 101, 2017: 247, 2018: 372, 2019: 392, 2020: 414, 2021: 501, 2022: 469, 2023: 462, 2024: 237
Scholar articles
R Girshick, J Donahue, T Darrell, J Malik - IEEE transactions on pattern analysis and machine …, 2015