Abstract:
Versatile Video Coding (VVC) introduces advanced coding techniques and tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this gain in coding performance comes at the cost of increased coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning to reduce the complexity of QTMT partitioning for VVC inter-prediction. First, the framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Second, a novel rate-distortion-complexity loss function is designed to balance computational complexity against compression performance: by adjusting the relative weights of the rate, distortion, and complexity costs in this loss, we can shift the network's prediction bias and impose constraints on different block partition sizes, enabling flexible complexity adjustment. Compared with the VTM-13.0 anchor, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while preserving coding performance, surpassing both traditional methods and deep learning-based methods. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
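The rate-distortion-complexity trade-off described in the abstract can be sketched as a Lagrangian-style weighted cost. The formulation below (J = D + λ·R + μ·C) and all weight values are illustrative assumptions, not the paper's actual loss function, which is not given in this record:

```python
def rd_complexity_cost(distortion, rate, complexity, lam=1.0, mu=0.5):
    """Hypothetical Lagrangian-style cost combining distortion (D),
    rate (R), and encoding complexity (C).
    lam and mu are illustrative trade-off weights, not values from the paper."""
    return distortion + lam * rate + mu * complexity

# With a large complexity weight mu, a slightly worse but much faster
# partition choice can yield the lower overall cost, biasing decisions
# toward cheaper (early-terminated) CU partitions.
fast = rd_complexity_cost(distortion=2.0, rate=1.0, complexity=0.2, mu=2.0)
slow = rd_complexity_cost(distortion=1.8, rate=1.0, complexity=1.0, mu=2.0)
assert fast < slow
```

Varying μ in this way is one plausible mechanism for the "various distributions of rate-distortion-complexity costs" the abstract mentions: a larger μ trades a small BDBR increase for larger encoding-time savings.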
Source:
Journal of Real-Time Image Processing
ISSN: 1861-8200
Year: 2024
Issue: 6
Volume: 21
Impact Factor: 2.900 (JCR@2023)
ESI Highly Cited Papers on the List: 0