[Journal article]

Progressive fusion of local and global image features for cross-modal image aesthetic assessment

author:

Niu, Yuzhen [1] (Scholars: 牛玉贞) | Chen, Siling [2] | Chen, Shanshan [3] | Li, Fusheng [4]

Indexed by:

EI Scopus SCIE

Abstract:

Image aesthetic assessment (IAA) has drawn wide attention in recent years. The task aims to predict the aesthetic quality of images by simulating the human aesthetic perception mechanism, thereby helping users select images with higher aesthetic value. For IAA, both the local information and the various kinds of global semantic information contained in an image, such as composition, theme, and emotion, play a crucial role. Existing CNN-based methods attempt to use multi-branch strategies to extract IAA-related local and global semantic information from images. However, these methods can extract only limited, specific global semantic information, and they require additional labeled datasets. Some cross-modal IAA methods that use both images and user comments have also been proposed, but they often fail to fully exploit the valuable information within each modality and the correlations between cross-modal features, which limits cross-modal IAA accuracy. To address these limitations, we propose a cross-modal IAA model that progressively fuses local and global image features. The model consists of a progressive local and global image feature fusion branch, a text feature enhancement branch, and a cross-modal feature fusion module. In the image branch, we introduce an inter-layer feature fusion module (IFFM) and progressively interact and fuse the extracted local and global features to obtain more comprehensive image features. In the text branch, we propose a text feature enhancement module (TFEM) that strengthens the extracted text features in order to mine more effective textual information. In addition, considering the intrinsic correlation between image and text features, we propose a cross-modal feature fusion module (CFFM) that fuses image features with text features for aesthetic assessment. Experimental results on the AVA (Aesthetic Visual Analysis) dataset validate the superiority of our method on the IAA task.
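
The abstract does not describe the internals of IFFM, TFEM, or CFFM, so the following is only a toy sketch of the data flow it outlines: per-stage local and global image features are fused progressively, and the resulting image vector is then combined with text features. Everything here is an assumption made for illustration. The function names `progressive_fuse` and `cross_modal_fuse` are hypothetical, and simple averaging and scaled dot-product attention are stand-ins, not the authors' modules.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def progressive_fuse(local_stages, global_stages):
    """Fuse local and global vectors stage by stage (stand-in for IFFM).

    Each stage averages its local and global vectors, then averages the
    result with the fused state carried over from the previous stages.
    """
    fused = None
    for loc, glo in zip(local_stages, global_stages):
        stage = [(l + g) / 2 for l, g in zip(loc, glo)]
        if fused is None:
            fused = stage
        else:
            fused = [(f + s) / 2 for f, s in zip(fused, stage)]
    return fused


def cross_modal_fuse(image_vec, text_vecs):
    """Attend from the image vector over text vectors (stand-in for CFFM).

    Returns the concatenation [image ; attended text] and the attention
    weights assigned to each text vector.
    """
    scale = math.sqrt(len(image_vec))
    scores = [sum(q * k for q, k in zip(image_vec, t)) / scale for t in text_vecs]
    weights = softmax(scores)
    attended = [
        sum(w * t[i] for w, t in zip(weights, text_vecs))
        for i in range(len(image_vec))
    ]
    return image_vec + attended, weights


# Hypothetical 2-dimensional features for two backbone stages.
local_stages = [[1.0, 0.0], [0.0, 1.0]]
global_stages = [[0.0, 1.0], [1.0, 0.0]]
image_vec = progressive_fuse(local_stages, global_stages)

# Hypothetical comment embeddings: one aligned with the image, one opposed.
text_vecs = [[1.0, 1.0], [-1.0, -1.0]]
fused, weights = cross_modal_fuse(image_vec, text_vecs)
```

In this toy setup the text vector aligned with the image vector receives the larger attention weight, which is the kind of cross-modal correlation a fusion module is meant to exploit. A real model would use learned projections and a regression or classification head on top of the fused vector.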

Keyword:

Cross-modality Feature fusion Image aesthetic assessment Local and global features Textual information

Affiliations:

  • [ 1 ] [Niu, Yuzhen]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350108, Fujian, Peoples R China
  • [ 2 ] [Chen, Siling]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350108, Fujian, Peoples R China
  • [ 3 ] [Chen, Shanshan]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350108, Fujian, Peoples R China
  • [ 4 ] [Li, Fusheng]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350108, Fujian, Peoples R China
  • [ 5 ] [Niu, Yuzhen]Minist Educ, Engn Res Ctr Big Data Intelligence, Fuzhou 350108, Fujian, Peoples R China

Reprint Address:

  • To be verified

    [Li, Fusheng]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350108, Fujian, Peoples R China

Source:

MULTIMEDIA SYSTEMS

ISSN: 0942-4962

Year: 2025

Issue: 2

Volume: 31

Impact factor: 3.500 (JCR@2023)

CAS Journal Grade: 4
