author:

Lin, Rong [1] | Zhang, Chun-Yang [2] (Scholars: 张春阳)

Indexed by:

EI

Abstract:

Visual Question Answering (VQA) is a challenging research area that requires joint understanding of image content and language to answer questions about a given image. Existing VQA models have focused on improving image understanding and have achieved good results. However, these models ignore the relationship between an image and its corresponding question, which are strongly correlated when the data is collected (i.e., the question asked by the staff should be consistent with the current image). To capture the essential features shared by the image and the question, we propose a new VQA model based on contrastive learning that maximizes mutual information. The core idea of our model is to maximize the mutual information between question features and their corresponding image features while minimizing the mutual information between question features and irrelevant image features, thereby improving the model's image understanding and question understanding simultaneously. The experimental results indicate that the feature representation learned by our model is more representative, and that our model outperforms the baseline model on the VQA v1.0 and VQA v2.0 datasets. © 2021 IEEE.
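
The abstract does not say which mutual-information estimator the model uses. As an illustration only, the contrastive objective it describes can be sketched with the InfoNCE loss, a common choice for this kind of training. It treats each question and its own image as a positive pair and every other image in the batch as a negative. The function name `info_nce_loss`, the `temperature` value, and the random stand-in features below are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

def info_nce_loss(question_feats, image_feats, temperature=0.1):
    """InfoNCE loss over a batch of (question, image) feature vectors.

    Assumption: row i of each array is a matched pair, and every other
    row in the batch serves as an irrelevant (negative) image for
    question i.
    """
    # L2-normalise so the dot product is a cosine similarity.
    q = question_feats / np.linalg.norm(question_feats, axis=1, keepdims=True)
    v = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    logits = q @ v.T / temperature                        # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matched pairs sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_probs)))

# Random stand-in features (hypothetical, not real VQA data).
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))
aligned_questions = images + 0.01 * rng.normal(size=(8, 16))
shuffled_questions = aligned_questions[::-1]  # every question gets a wrong image

loss_aligned = info_nce_loss(aligned_questions, images)
loss_shuffled = info_nce_loss(shuffled_questions, images)
print(loss_aligned < loss_shuffled)
```

Minimizing this loss pulls question features toward their own image features and pushes them away from the other images in the batch, which matches the maximize/minimize pairing described in the abstract. The paper's actual encoders, estimator, and hyperparameters may differ.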

Keyword:

Computer vision; Image enhancement; Visual languages

Community:

  • [ 1 ] [Lin, Rong] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
  • [ 2 ] [Zhang, Chun-Yang] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Source:

Year: 2021

Page: 483-488

Language: English
