Indexed by:
Abstract:
Fine-grained visual classification (FGVC) aims to identify objects belonging to multiple sub-categories of the same super-category. The key to solving fine-grained classification problems is to learn discriminative visual feature representation with only subtle differences. Although previous work based on refined feature learning has made great progress, however, high-level semantic features often lack key information for fine-grained visual object nuances. How to efficiently integrate semantic information of different granularities from classification networks is a critical. In this paper, we propose Granularity-aware Distillation and Structure Modeling region Proposal Network(GDSMP-Net). Our solution integrates multi-granularity hierarchical information through a multi-granularity fusion learning strategy to enhance feature representation. In view of the inherent challenges of large intra-class differences in FGVC, a cross-layer self-distillation regularization is proposed to to strengthen the connection between high-level semantics and low-level semantics for robust multi-granularity feature learning. On this basis, we use a weakly supervised method to generate local branches, and the collaborative learning of discriminative semantics and structural semantics based on local regions, facilitating model to perceive contextual information to capture structural interactions between local semantics. Comprehensive experiments show that our method achieves state-of-the-art performance on four widely-used challenging datasets.(CUB-200-2011, Stanford Cars, FGVC-Aircraft and NA-birds). © 2023 Elsevier Ltd
Keyword:
Reprint 's Address:
Email:
Source :
Pattern Recognition
ISSN: 0031-3203
Year: 2023
Volume: 137
7 . 5
JCR@2023
7 . 5 0 0
JCR@2023
ESI HC Threshold:35
JCR Journal Grade:1
CAS Journal Grade:1
Cited Count:
SCOPUS Cited Count: 37
ESI Highly Cited Papers on the List: 0 Unfold All
WanFang Cited Count:
Chinese Cited Count:
30 Days PV: 0
Affiliated Colleges: