Indexed by:
Abstract:
While feature extraction employing pre-trained models proves effective and efficient for no-reference video tasks, it falls short of adequately accounting for the intricacies of the Human Visual System (HVS). In this study, we proposed a novel approach to Integration of spatio-temporal Visual Stimuli into Video Quality Assessment (IVS-VQA) for the inaugural time. Exploiting the heightened sensitivity of optic rod cells to edges and motion, along with the capability to track motion via conjugate gaze, our approach affords a distinctive perspective on video quality assessment. To capture significant changes at each timestamp, we incorporate edge information to enhance the feature extraction of the pre-trained model. To tackle pronounced motion across the timeline, we introduce an interactive temporal disparity query employing a dual-branch transformer architecture. This approach adeptly introduces feature biases and extracts comprehensive global attention, culminating in enhanced emphasis on non-continuous segments within the video. Additionally, we integrate low-level color texture information within the temporal domain to comprehensively capture distortions spanning various scales, both higher and lower. Empirical results illustrate that the proposed model attains state-of-the-art performance across all six benchmark databases, along with their corresponding weighted averages. IEEE
Keyword:
Reprint's Address:
Email:
Source:
IEEE Transactions on Broadcasting
ISSN: 0018-9316
Year: 2023
Issue: 1
Volume: 70
Page: 1-15
3.2 (JCR@2023)
3.200 (JCR@2023)
JCR Journal Grade:2
CAS Journal Grade:1
Cited Count:
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
Affiliated Colleges: