Author:

Ke, Xiao [1] (Scholars: 柯逍) | Cai, Yuhang [2] | Chen, Baitao [3] | Liu, Hao [4] | Guo, Wenzhong [5]

Indexed by:

SCIE

Abstract:

Fine-grained visual classification (FGVC) is a highly challenging task that aims to learn subtle differences between visually similar objects. Most existing FGVC methods rely on deep convolutional neural networks to mine local fine-grained features and neglect the relationships between global and local semantics. Moreover, the feature encoding stage inevitably constructs complex feature representations, leading to overfitting to specific feature patterns, which is detrimental to fine-grained visual classification. To address these issues, we propose a Transformer-based FGVC model, called the Multi-Granularity Interaction and Feature Recombination Network (MGIFR-Net), which consists of three modules. First, a self-attention guided localization module is designed to locate and amplify discriminative local regions, enabling sufficient learning of local detail information. Second, to enhance the perception of multi-granularity semantic interaction information, we construct a multi-granularity feature interaction learning module to jointly learn local and global feature representations. Finally, a dynamic feature recombination enhancement method is proposed, which explores diverse feature pattern combinations while retaining invariant features, effectively alleviating the overfitting problem caused by complex feature representations. Our method achieves state-of-the-art performance on four benchmark FGVC datasets (CUB-200-2011, Stanford Cars, FGVC-Aircraft, and NABirds), and experimental results demonstrate the superiority of our method on different visual classification benchmarks.
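
The abstract does not specify how the self-attention guided localization module selects a region. As a rough illustration only, one common way to turn a Vision Transformer attention map into a crop is to threshold the class-token attention over the patch grid and take the bounding box of the patches that exceed it. The sketch below assumes that approach; the function names, the mean-value threshold, and the 16-pixel patch size are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]


def attention_bounding_box(cls_attention: List[List[float]]) -> Box:
    """Return (top, left, bottom, right) patch indices, inclusive.

    cls_attention is a 2-D grid of class-token attention weights, one value
    per image patch. Patches above the grid mean are treated as
    discriminative (an assumed heuristic, not the paper's rule).
    """
    values = [v for row in cls_attention for v in row]
    threshold = sum(values) / len(values)

    rows = [r for r, row in enumerate(cls_attention)
            if any(v > threshold for v in row)]
    cols = [c for row in cls_attention
            for c, v in enumerate(row) if v > threshold]

    if not rows:
        # Uniform map: nothing exceeds the mean, so keep the whole grid.
        return 0, 0, len(cls_attention) - 1, len(cls_attention[0]) - 1
    return min(rows), min(cols), max(rows), max(cols)


def patch_box_to_pixels(box: Box, patch_size: int) -> Box:
    """Convert an inclusive patch box to a pixel box (end-exclusive)."""
    top, left, bottom, right = box
    return (top * patch_size, left * patch_size,
            (bottom + 1) * patch_size, (right + 1) * patch_size)


# Toy 4x4 attention grid with a high-attention 2x2 centre.
attention = [
    [0.01, 0.01, 0.01, 0.01],
    [0.01, 0.20, 0.30, 0.01],
    [0.01, 0.25, 0.15, 0.01],
    [0.01, 0.01, 0.01, 0.01],
]
box = attention_bounding_box(attention)
crop = patch_box_to_pixels(box, patch_size=16)
print(box, crop)
```

In a pipeline of this kind, the pixel box would typically be cropped from the input image and upsampled to the network's input resolution, so that a second pass sees the local region at a larger scale.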

Keyword:

Feature recombination; Fine-grained visual classification; Multi-granularity feature interaction; Vision transformer

Affiliations:

  • [ 1 ] [Cai, Yuhang]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 2 ] [Cai, Yuhang]Fuzhou Univ, Engn Res Ctr Big Data Intelligence, Minist Educ, Fuzhou 350116, Peoples R China

Reprint Address:

  • [Cai, Yuhang]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China

Source:

PATTERN RECOGNITION

ISSN: 0031-3203

Year: 2025

Volume: 166

Impact Factor: 7.500 (JCR@2023)
