Counting attention based on classification confidence for visual question answering

Authors:

Chen, M. [1] | Wang, Y. [2] | Chen, S. [3] | Wu, Y. [4]

Indexed by:

Scopus

Abstract:

Multi-object counting in visual question answering (VQA) remains a challenging problem. Existing VQA models mainly use an object detection network to extract image features and combine it with a soft attention mechanism to further improve accuracy. However, the same object may be counted repeatedly when the object detection network extracts image features. In addition, the attention weights that a soft attention mechanism assigns to all objects sum to 1, so the quantity information it carries about the objects is always 1. We propose a new counting attention mechanism based on classification confidence. The main idea is to calculate the initial attention with a sigmoid function and to compute similarity from the object locations generated by the object detection network; we introduce classification confidence to calculate a more accurate similarity and to solve the problem that the quantity information under the existing soft attention mechanism is always 1. The experiments compare the proposed counting attention mechanism with the baseline model and related work on the VQA v2 dataset. The results show that the counting attention mechanism improves counting accuracy by 6.4% over the baseline model and surpasses most VQA models. © 2019 IEEE.
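
The abstract's point about soft attention can be illustrated numerically. The sketch below is not the authors' implementation: the relevance scores are hypothetical, and it leaves out the classification-confidence similarity and the handling of duplicate detections. It only shows that softmax weights always sum to 1, while independent sigmoid weights can sum to roughly the number of relevant objects.

```python
import math

def softmax(scores):
    # Normalized exponentials; the weights always sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(scores):
    # Each weight is computed independently, so the sum is not fixed.
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

# Hypothetical relevance scores for five detected boxes:
# three match the question strongly, two do not.
scores = [6.0, 6.0, 6.0, -6.0, -6.0]

soft_sum = sum(softmax(scores))  # 1 regardless of how many boxes match
sig_sum = sum(sigmoid(scores))   # close to 3, the number of matching boxes

print(f"softmax sum: {soft_sum:.4f}")
print(f"sigmoid sum: {sig_sum:.4f}")
```

With softmax, changing the number of matching boxes redistributes the weights but leaves their total unchanged, which is why the count cannot be read off the attention map.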

Keywords:

Artificial intelligence; Neural network; Visual question answering; Visual reasoning

Affiliations:

[ 1 ] [Chen, M.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
[ 2 ] [Wang, Y.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
[ 3 ] [Chen, S.] School of Electrical Engineering, Chongqing University, Chongqing, China
[ 4 ] [Wu, Y.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Source:

SocialCom 2019

Year: 2019

Pages: 1173-1179

Language: English

Scopus Cited Count: 3
