Authors
Saining Xie, Tianbao Yang, Xiaoyu Wang, Yuanqing Lin
Publication date
2014
Conference
Computer Vision and Pattern Recognition (CVPR), 2015
Description
Deep convolutional neural networks (CNNs) have seen tremendous success in large-scale generic object recognition. Compared with generic object recognition, fine-grained image classification (FGIC) is much more challenging because (i) fine-grained labeled data is much more expensive to acquire (usually requiring domain expertise); and (ii) there is large intra-class and small inter-class variance. Most recent work that applies deep CNNs to image recognition with small training data adopts a simple strategy: pre-train a deep CNN on a large-scale external dataset (e.g., ImageNet) and fine-tune it on the small-scale target data to fit the specific classification task. In this paper, beyond the fine-tuning strategy, we propose a systematic framework for learning a deep CNN that addresses these challenges from two new perspectives: (i) identifying easily annotated hyper-classes inherent in the fine-grained data, acquiring a large number of hyper-class-labeled images from readily available external sources (e.g., image search engines), and formulating the problem as multi-task learning; and (ii) a novel learning model that exploits a regularization between the fine-grained recognition model and the hyper-class recognition model. We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.
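The two ideas in the description, multi-task learning over fine-grained and hyper-class labels plus a regularizer that ties the two models together, can be illustrated with a minimal sketch. It is not the paper's formulation: the linear softmax classifiers, the squared-distance coupling term, the `parent` mapping from fine-grained class to hyper-class, and the weight `lam` are all assumptions chosen for illustration, and the feature vectors stand in for the output of a shared CNN.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(weights, features, label):
    # Negative log-likelihood of `label` under a linear softmax classifier.
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return -math.log(softmax(scores)[label])

def coupling_penalty(fine_weights, hyper_weights, parent):
    # Assumed regularizer: squared distance between each fine-grained
    # class's weights and the weights of its hyper-class.
    return sum(
        (wf - wh) ** 2
        for c, row in enumerate(fine_weights)
        for wf, wh in zip(row, hyper_weights[parent[c]])
    )

def joint_loss(fine_weights, hyper_weights, parent,
               fine_batch, hyper_batch, lam=0.1):
    # Multi-task objective: fine-grained loss on the small labeled set,
    # hyper-class loss on the larger external set, plus the coupling term.
    fine_term = sum(cross_entropy(fine_weights, x, y)
                    for x, y in fine_batch) / len(fine_batch)
    hyper_term = sum(cross_entropy(hyper_weights, x, v)
                     for x, v in hyper_batch) / len(hyper_batch)
    penalty = coupling_penalty(fine_weights, hyper_weights, parent)
    return fine_term + hyper_term + lam * penalty
```

In this sketch the hyper-class batch can be much larger than the fine-grained batch, which mirrors the idea of drawing cheaply labeled images from external sources while the coupling term lets them influence the fine-grained classifier.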
Total citations
[Chart: citations per year, 2014 to 2024]
Scholar articles
S Xie, T Yang, X Wang, Y Lin - Proceedings of the IEEE conference on computer …, 2015