
Authors:

Ke, Xiao [1] (Scholars: 柯逍) | Zou, Jiawei [2] | Niu, Yuzhen [3] (Scholars: 牛玉贞)

Indexed by:

EI Scopus SCIE

Abstract:

Automatic image annotation is a key step in image retrieval and image understanding. In this paper, we present an end-to-end automatic image annotation method based on a deep convolutional neural network (CNN) and multi-label data augmentation. Unlike traditional annotation models, which usually perform feature extraction and annotation as two independent tasks, we propose an end-to-end automatic image annotation model based on a deep CNN (E2E-DCNN). E2E-DCNN transforms the image annotation problem into a multi-label learning problem. It uses a deep CNN structure for adaptive feature learning and then constructs the end-to-end annotation structure, which is trained with multiple cross-entropy loss functions. It is difficult to train a deep CNN model on small-scale datasets, or to scale up multi-label datasets with traditional data augmentation methods; hence, we propose a multi-label data augmentation method based on Wasserstein generative adversarial networks (ML-WGAN). The ML-WGAN generator can approximate the data distribution of a single multi-label image. The images generated by ML-WGAN help reduce over-fitting when training a deep CNN model and enhance the generalization ability of the trained model. We optimize the network structure by using deformable convolution and spatial pyramid pooling. We evaluate the proposed E2E-DCNN model, with data augmentation by the proposed ML-WGAN, on several public datasets. The experimental results demonstrate that the proposed model outperforms state-of-the-art automatic image annotation models.
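
To make the multi-label formulation in the abstract concrete, the following is a minimal sketch in which each candidate tag is treated as an independent binary decision and one binary cross-entropy term is summed per tag. It is an illustration under stated assumptions, not the authors' E2E-DCNN implementation: the sigmoid output, the 0.5 decision threshold, the tag vocabulary, the scores, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_cross_entropy(logits, targets, eps=1e-12):
    """Sum of per-label binary cross-entropy terms for one image.

    logits  -- raw network scores, one per candidate tag (hypothetical)
    targets -- 0/1 ground-truth indicators, one per candidate tag
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    loss = 0.0
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss


def predict_tags(logits, vocabulary, threshold=0.5):
    """Return every tag whose predicted probability reaches the threshold."""
    return [tag for z, tag in zip(logits, vocabulary) if sigmoid(z) >= threshold]


# Illustrative values only; a real model would produce the logits.
vocabulary = ["sky", "tree", "car"]
logits = [2.0, -1.0, -3.0]
targets = [1, 0, 0]

loss = multilabel_cross_entropy(logits, targets)
tags = predict_tags(logits, vocabulary)
print(tags, round(loss, 4))
```

Because every tag has its own loss term, an image can carry any number of tags at once, which is what distinguishes this setup from single-label classification with a softmax output.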

Keyword:

convolutional neural network; data augmentation; deep learning; generative adversarial networks; image annotation

Affiliations:

  • [ 1 ] [Ke, Xiao]Fuzhou Univ, Minist Educ, Coll Math & Comp Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350116, Fujian, Peoples R China
  • [ 2 ] [Niu, Yuzhen]Fuzhou Univ, Minist Educ, Coll Math & Comp Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350116, Fujian, Peoples R China
  • [ 3 ] [Ke, Xiao]Fuzhou Univ, Minist Educ, Key Lab Spatial Data Min & Informat Sharing, Fuzhou 350116, Fujian, Peoples R China
  • [ 4 ] [Niu, Yuzhen]Fuzhou Univ, Minist Educ, Key Lab Spatial Data Min & Informat Sharing, Fuzhou 350116, Fujian, Peoples R China
  • [ 5 ] [Zou, Jiawei]Fuzhou Univ, Coll Math & Comp Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350116, Fujian, Peoples R China

Reprint Address:

  • 牛玉贞

    [Niu, Yuzhen]Fuzhou Univ, Minist Educ, Coll Math & Comp Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350116, Fujian, Peoples R China; [Niu, Yuzhen]Fuzhou Univ, Minist Educ, Key Lab Spatial Data Min & Informat Sharing, Fuzhou 350116, Fujian, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN: 1520-9210

Year: 2019

Issue: 8

Volume: 21

Page: 2093-2106

6.051 (JCR@2019)

8.400 (JCR@2023)

ESI Discipline: COMPUTER SCIENCE;

ESI HC Threshold:162

JCR Journal Grade:1

CAS Journal Grade:1

Cited Count:

WoS CC Cited Count: 72

SCOPUS Cited Count: 87

ESI Highly Cited Papers on the List: 0
