Query:
Scholar name: 兰诚栋 (Lan Chengdong)
Abstract :
The introduction of multiple viewpoints in video scenes inevitably increases the bitrate required for storage and transmission. To reduce bitrates, researchers have developed methods that skip intermediate viewpoints during compression and delivery and reconstruct them afterwards from Side Information (SInfo). Typically, depth maps are used to construct the SInfo; however, these methods suffer from reconstruction inaccuracies and inherently high bitrates. In this paper, we propose a novel multi-view video coding method that leverages the image generation capability of a Generative Adversarial Network (GAN) to improve the reconstruction accuracy of the SInfo. Additionally, we incorporate information from adjacent temporal and spatial viewpoints to further reduce SInfo redundancy. At the encoder, we construct a spatio-temporal Epipolar Plane Image (EPI) and use a convolutional network to extract the latent code of a GAN as the SInfo. At the decoder, we combine the SInfo and adjacent viewpoints to reconstruct intermediate views with the GAN generator. Specifically, we establish a joint encoder constraint on reconstruction cost and SInfo entropy to achieve an optimal trade-off between reconstruction quality and bitrate overhead. Experiments demonstrate a significant improvement in Rate-Distortion (RD) performance over state-of-the-art methods.
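To make the trade-off concrete, here is a minimal sketch, assuming hypothetical `encoder` and `generator` modules and a weight `lmbda`, of a joint rate-distortion objective of the kind the abstract describes: reconstruction cost plus an entropy term on the GAN latent code used as SInfo. The Gaussian rate proxy is an assumption, not the paper's formulation.

```python
# Illustrative sketch (not the paper's code) of a joint RD objective:
# distortion on the reconstructed view plus an entropy term on the latent
# code serving as Side Information. `encoder`, `generator`, `lmbda` are
# hypothetical placeholders.
import torch
import torch.nn.functional as F

def joint_rd_loss(encoder, generator, epi, target, lmbda=0.01):
    z = encoder(epi)              # latent code extracted from the spatio-temporal EPI
    recon = generator(z)          # intermediate view reconstructed by the GAN generator
    distortion = F.mse_loss(recon, target)
    # Differentiable rate proxy: negative log-likelihood of z under a unit
    # Gaussian prior (up to an additive constant), standing in for the
    # entropy coder's bit cost.
    rate = 0.5 * (z ** 2).mean()
    return distortion + lmbda * rate   # quality-vs-bitrate trade-off
```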
Keyword :
Epipolar plane image; Generative adversarial network; Latent code learning; Multi-view video coding
Cite:
GB/T 7714 | Lan, Chengdong, Yan, Hao, Luo, Cheng, et al. GAN-based multi-view video coding with spatio-temporal EPI reconstruction [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, 132.
MLA | Lan, Chengdong, et al. "GAN-based multi-view video coding with spatio-temporal EPI reconstruction." SIGNAL PROCESSING-IMAGE COMMUNICATION 132 (2025).
APA | Lan, Chengdong, Yan, Hao, Luo, Cheng, & Zhao, Tiesong. GAN-based multi-view video coding with spatio-temporal EPI reconstruction. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, 132.
Abstract :
Compressed video quality enhancement is crucial for mitigating artifacts introduced by video coding. Video compression often distributes artifacts unevenly across regions of a frame, leading to significant quality variations. Existing algorithms treat all regions uniformly, ignoring these localized differences, which limits their ability to extract high-quality information from reference frames and to accurately reconstruct residuals. Additionally, larger temporal gaps between reference and target frames can cause alignment errors, which propagate during fusion and degrade performance. To address these challenges, we propose a Multi-scale Compression Artifact Attention (MSCAA) module that captures the artifact distribution, guiding the model to focus on distorted regions. We also introduce a Progressive Fusion Stage that fuses reference frames sequentially by temporal proximity, reducing error propagation and enhancing temporal coherence. Experimental results show that the proposed method improves average ΔPSNR by 5.15% and ΔSSIM by 3.35% over the state-of-the-art method, demonstrating its superior quality-enhancement performance. © 2025 Elsevier Inc.
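The multi-scale artifact-attention idea can be illustrated with a small PyTorch sketch: per-scale spatial attention maps are fused and used to re-weight features so the network attends to heavily distorted regions. Module and layer names here are hypothetical, not the paper's MSCAA definition.

```python
# Minimal sketch of multi-scale artifact attention: attention maps computed
# at several pooled scales are fused and used to re-weight the features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleArtifactAttention(nn.Module):
    def __init__(self, channels, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.att = nn.ModuleList(nn.Conv2d(channels, 1, 3, padding=1) for _ in scales)

    def forward(self, x):
        h, w = x.shape[-2:]
        maps = []
        for s, conv in zip(self.scales, self.att):
            xs = F.avg_pool2d(x, s) if s > 1 else x   # coarser view of the frame
            m = conv(xs)                              # per-scale artifact response
            maps.append(F.interpolate(m, size=(h, w), mode="bilinear",
                                      align_corners=False))
        attention = torch.sigmoid(torch.stack(maps).mean(0))
        return x * attention                          # emphasize distorted regions
```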
Keyword :
Compression artifacts; Deep learning; Video quality enhancement
Cite:
GB/T 7714 | Lan, C., Lin, W., Liang, H., et al. Multi-scale compression artifact attention-driven compressed video quality enhancement [J]. Digital Signal Processing: A Review Journal, 2025, 163.
MLA | Lan, C., et al. "Multi-scale compression artifact attention-driven compressed video quality enhancement." Digital Signal Processing: A Review Journal 163 (2025).
APA | Lan, C., Lin, W., Liang, H., & Kang, Y. Multi-scale compression artifact attention-driven compressed video quality enhancement. Digital Signal Processing: A Review Journal, 2025, 163.
Abstract :
Learning-based point cloud compression has achieved great success in Rate-Distortion (RD) efficiency. Existing methods usually employ a Variational AutoEncoder (VAE) network, which can lead to poor detail reconstruction and high computational complexity. To address these issues, we propose a Scale-adaptive Asymmetric Sparse Variational AutoEncoder (SAS-VAE) in this work. First, we develop an Asymmetric Multiscale Sparse Convolution (AMSC), which exploits multi-resolution branches to aggregate multiscale features at the encoder and excludes symmetric feature fusion branches to control model complexity at the decoder. Second, we design a Scale Adaptive Feature Refinement Structure (SAFRS) that adaptively adjusts the number of Feature Refinement Modules (FRMs), improving RD performance with acceptable computational overhead. Third, we implement our framework with AMSC and SAFRS and train it with an RD loss based on a Fine-grained Weighted Binary Cross-Entropy (FWBCE) function. Experimental results on the 8iVFB, Owlii, and MVUB datasets show that our method outperforms several popular methods, with a 90.0% time reduction and a 51.8% BD-BR saving compared with V-PCC. The code will be available soon at https://github.com/fancj2017/SAS-VAE.
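As an illustration of the FWBCE idea, the sketch below implements a weighted binary cross-entropy for sparse voxel occupancy, where the rare occupied class is up-weighted; the weighting scheme is an assumption, not the paper's exact fine-grained formulation.

```python
# Sketch of a weighted BCE for voxel occupancy: occupied voxels are rare in
# sparse point cloud representations, so per-batch class weights rebalance
# the loss. `pos_weight_scale` is a hypothetical tuning knob.
import torch
import torch.nn.functional as F

def weighted_bce(pred_logits, occupancy, pos_weight_scale=1.0):
    # occupancy: binary ground-truth voxel occupancy, same shape as pred_logits
    pos = occupancy.sum().clamp(min=1.0)
    neg = occupancy.numel() - pos
    pos_weight = (neg / pos) * pos_weight_scale   # upweight the sparse occupied class
    return F.binary_cross_entropy_with_logits(
        pred_logits, occupancy, pos_weight=pos_weight)
```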
Keyword :
asymmetric multiscale sparse convolution; Convolution; Decoding; Feature extraction; Octrees; Point cloud compression; Rate-distortion; scale adaptive feature refinement structure; Three-dimensional displays; variational autoencoder
Cite:
GB/T 7714 | Chen, Jian, Zhu, Yingtao, Huang, Wei, et al. Scale-Adaptive Asymmetric Sparse Variational AutoEncoder for Point Cloud Compression [J]. IEEE TRANSACTIONS ON BROADCASTING, 2024, 70(3): 884-894.
MLA | Chen, Jian, et al. "Scale-Adaptive Asymmetric Sparse Variational AutoEncoder for Point Cloud Compression." IEEE TRANSACTIONS ON BROADCASTING 70.3 (2024): 884-894.
APA | Chen, Jian, Zhu, Yingtao, Huang, Wei, Lan, Chengdong, & Zhao, Tiesong. Scale-Adaptive Asymmetric Sparse Variational AutoEncoder for Point Cloud Compression. IEEE TRANSACTIONS ON BROADCASTING, 2024, 70(3), 884-894.
Abstract :
Convolutional neural networks are constrained in adaptively capturing information due to their fixed-size kernels. Decomposed large kernels provide a wide receptive field and competitive performance with fewer parameters, but they still lack adaptability. We therefore propose the dynamic large kernel network (DLKN) for lightweight image super-resolution. Specifically, we design a basic convolutional block of feature aggregation groups, akin to the transformer architecture, comprising a dynamic large kernel attention block and a local feature enhancement block that adaptively exploit information. In the dynamic large kernel attention block, we decompose the large kernel convolution into kernels with different sizes and dilation rates, then fuse their information for weight selection, dynamically adjusting the proportion of information from different receptive fields. The local feature enhancement block substantially improves local feature extraction with a low parameter count by decomposing the convolution into horizontally and vertically cascaded kernels, encouraging interactions between local spatial features. Experimental results on benchmark datasets demonstrate that the proposed model achieves excellent performance in both lightweight and performance-oriented super-resolution, successfully balancing performance against model complexity. The code is available at https://github.com/LyTinGiu/DLKN_SR.
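The decomposition-plus-selection idea can be sketched as follows: a depthwise convolution and a dilated depthwise convolution approximate a much larger kernel, and a lightweight gate mixes the two receptive fields per pixel. Kernel sizes and the gating form are illustrative assumptions, not DLKN's exact design.

```python
# Sketch of dynamic large-kernel attention: two depthwise branches with
# different receptive fields, mixed by per-pixel softmax weights, then used
# as an attention map over the input.
import torch
import torch.nn as nn

class DynamicLargeKernel(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        # dilation 3 with kernel 7 gives an effective 19x19 receptive field
        self.dilated = nn.Conv2d(channels, channels, 7, padding=9, dilation=3,
                                 groups=channels)
        self.gate = nn.Conv2d(2 * channels, 2, 1)   # dynamic per-pixel branch weights
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        a, b = self.local(x), self.dilated(x)
        w = torch.softmax(self.gate(torch.cat([a, b], dim=1)), dim=1)
        mixed = w[:, :1] * a + w[:, 1:] * b         # receptive-field selection
        return x * self.proj(mixed)                 # attention-style modulation
```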
Keyword :
CNN; Image processing; Large kernel convolution; Super-resolution
Cite:
GB/T 7714 | Liu, YaTing, Lan, ChengDong, Feng, Wanjian. DLKN: enhanced lightweight image super-resolution with dynamic large kernel network [J]. VISUAL COMPUTER, 2024, 41(5): 3627-3644.
MLA | Liu, YaTing, et al. "DLKN: enhanced lightweight image super-resolution with dynamic large kernel network." VISUAL COMPUTER 41.5 (2024): 3627-3644.
APA | Liu, YaTing, Lan, ChengDong, & Feng, Wanjian. DLKN: enhanced lightweight image super-resolution with dynamic large kernel network. VISUAL COMPUTER, 2024, 41(5), 3627-3644.
Abstract :
In underwater object detection, the small scale and frequent blurriness of underwater targets pose major challenges to detection accuracy. To address the low accuracy of general-purpose detection models in underwater environments, an improved YOLOv5s detection model is proposed. Data augmentation with multiple filtering operations is added to the YOLOv5s pipeline, enlarging the underwater sample set and improving data generalization. The classification and regression loss functions are also modified accordingly to better classify and localize underwater targets. Experiments verify that the improved method is well suited to underwater object detection: the improved YOLOv5s algorithm raises average precision by 2.1% with no loss in detection speed.
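A minimal sketch of the kind of filter-based augmentation described above, using standard torchvision transforms as stand-ins for the paper's unspecified filters:

```python
# Illustrative augmentation pipeline for underwater training images: random
# colour, blur and sharpness perturbations enlarge the sample set. The exact
# filters used in the paper are not specified; these are assumptions.
from torchvision import transforms

underwater_aug = transforms.Compose([
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.05),
    transforms.RandomApply(
        [transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0))], p=0.5),
    transforms.RandomAdjustSharpness(sharpness_factor=2.0, p=0.5),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),   # PIL image -> float tensor in [0, 1]
])
```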
Keyword :
YOLOv5s; loss function; data augmentation; underwater object detection
Cite:
GB/T 7714 | 叶志杨, 梁昊霖, 兰诚栋. 应用于水下目标检测的YOLOv5s算法模型 [J]. 电视技术, 2023, 47(02): 39-43.
MLA | 叶志杨, et al. "应用于水下目标检测的YOLOv5s算法模型." 电视技术 47.02 (2023): 39-43.
APA | 叶志杨, 梁昊霖, & 兰诚栋. 应用于水下目标检测的YOLOv5s算法模型. 电视技术, 2023, 47(02), 39-43.
Abstract :
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical to improving visual quality. In this letter, we investigate the influence on video quality of four spatial PEAs (blurring, blocking, bleeding, and ringing) and two temporal PEAs (flickering and floating). For spatial artifacts, we propose a visual saliency model with low computational cost and higher consistency with human visual perception. For temporal artifacts, the self-attention-based TimeSformer is improved to detect them. Based on these six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe SSTAM will be beneficial for optimizing video coding techniques.
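As a toy illustration of the final aggregation step, the sketch below fuses the six per-artifact scores (four spatial, two temporal) into a single quality value with learned weights; the linear form is an assumption, not SSTAM's actual fusion.

```python
# Hypothetical fusion head: six per-artifact scores in, one quality score out.
import torch.nn as nn

class PEAFusion(nn.Module):
    def __init__(self):
        super().__init__()
        # order assumed: blurring, blocking, bleeding, ringing, flickering, floating
        self.fc = nn.Linear(6, 1)

    def forward(self, pea_scores):               # pea_scores: (batch, 6)
        return self.fc(pea_scores).squeeze(-1)   # predicted quality score
```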
Keyword :
compression artifact; Perceivable Encoding Artifacts (PEAs); saliency detection; Video quality assessment
Cite:
GB/T 7714 | Lin, Liqun, Zheng, Yang, Chen, Weiling, et al. Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment [J]. IEEE SIGNAL PROCESSING LETTERS, 2023, 30: 693-697.
MLA | Lin, Liqun, et al. "Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment." IEEE SIGNAL PROCESSING LETTERS 30 (2023): 693-697.
APA | Lin, Liqun, Zheng, Yang, Chen, Weiling, Lan, Chengdong, & Zhao, Tiesong. Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment. IEEE SIGNAL PROCESSING LETTERS, 2023, 30, 693-697.
Abstract :
Video compression introduces compression artifacts that degrade video quality. At the decoder, however, multi-frame fusion lets the frame to be enhanced learn high-quality information from neighboring frames, improving its quality. Building on this, optical-flow-assisted deformable convolution is introduced to achieve better alignment in multi-frame fusion, and the aligned results from both past and future frames are fully exploited to obtain a further enhancement gain.
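A minimal sketch of flow-assisted deformable alignment, using torchvision's DeformConv2d: optical flow supplies a coarse offset for every kernel sample point, and a small convolution refines it. The flow estimator, the flow channel ordering, and the module names are assumptions for illustration.

```python
# Sketch of flow-guided deformable alignment: the flow field is broadcast to
# all k*k kernel positions as a base offset, and a conv predicts a residual
# offset before DeformConv2d samples the reference features.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class FlowGuidedAlign(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        self.k = k
        self.offset_pred = nn.Conv2d(channels * 2 + 2, 2 * k * k, 3, padding=1)
        self.deform = DeformConv2d(channels, channels, k, padding=k // 2)

    def forward(self, ref_feat, tgt_feat, flow):
        # flow: (N, 2, H, W), assumed (dx, dy); flip to the (dy, dx) pair
        # layout expected by DeformConv2d, repeated for each kernel position.
        base = flow.flip(1).repeat(1, self.k * self.k, 1, 1)
        residual = self.offset_pred(torch.cat([ref_feat, tgt_feat, flow], dim=1))
        return self.deform(ref_feat, base + residual)   # aligned reference features
```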
Keyword :
optical flow; compressed video enhancement; deformable convolution
Cite:
GB/T 7714 | 梁昊霖, 叶志杨, 兰诚栋. 基于光流辅助可变形卷积的压缩视频质量增强 [J]. 电视技术, 2023, 47(02): 24-27.
MLA | 梁昊霖, et al. "基于光流辅助可变形卷积的压缩视频质量增强." 电视技术 47.02 (2023): 24-27.
APA | 梁昊霖, 叶志杨, & 兰诚栋. 基于光流辅助可变形卷积的压缩视频质量增强. 电视技术, 2023, 47(02), 24-27.
Abstract :
The Feature Pyramid Network (FPN) is a typical detector component for handling objects at different scales. However, the lateral connections in FPN lose feature information because they reduce the number of feature channels, and the top-down fusion of features carrying different semantic information weakens the feature representation during delivery. In this paper, we propose a feature pyramid network with channel and content adaptive feature enhancement (CCA-FPN), which uses a Channel Adaptive Guided Mechanism (CAGM) and a Multi-scale Content Adaptive Feature Enhancement Module (MCAFEM) to alleviate these problems. We conduct comprehensive experiments on the MS COCO dataset. Replacing FPN with CCA-FPN in ATSS, our models achieve 1.3 percentage points higher Average Precision (AP) with a ResNet50 backbone. Furthermore, our CCA-FPN achieves 0.3 percentage points higher AP than AugFPN, the state-of-the-art FPN-based detector.
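To illustrate the motivation behind the channel-adaptive lateral connection, here is an SE-style sketch that re-weights backbone channels before the 1x1 reduction, so informative channels survive the channel cut; it is not the paper's exact CAGM.

```python
# Sketch of a channel-adaptive lateral connection for an FPN: a squeeze-and-
# excitation gate re-weights channels, then a 1x1 conv reduces them.
import torch.nn as nn

class ChannelAdaptiveLateral(nn.Module):
    def __init__(self, in_channels, out_channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, in_channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // reduction, in_channels, 1),
            nn.Sigmoid(),
        )
        self.reduce = nn.Conv2d(in_channels, out_channels, 1)

    def forward(self, x):
        return self.reduce(x * self.gate(x))   # re-weight, then cut channels
```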
Keyword :
Channel and content adaptive; Feature enhancement module; Feature pyramid network; Object detection
Cite:
GB/T 7714 | Ye, Zhiyang, Lan, Chengdong, Zou, Min, et al. CCA-FPN: Channel and content adaptive object detection [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95.
MLA | Ye, Zhiyang, et al. "CCA-FPN: Channel and content adaptive object detection." JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION 95 (2023).
APA | Ye, Zhiyang, Lan, Chengdong, Zou, Min, Qiu, Xu, & Chen, Jian. CCA-FPN: Channel and content adaptive object detection. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95.
Abstract :
Panoramic video technology has advanced significantly in recent years, providing users with an immersive experience by displaying the entire 360° spherical scene centered on their virtual location. However, its data volume is much larger than that of traditional video, so transmitting high-quality panoramic video requires more bandwidth. Notably, users never see the whole 360° content at once, only the portion within their viewport. To save bandwidth, viewport-based adaptive streaming transmits in high quality only the viewports of interest to the user, which makes viewport-prediction accuracy crucial. Prediction accuracy, however, degrades significantly as the prediction window grows. To address this issue, we propose an effective self-attention viewport prediction model based on a distance constraint. First, by analyzing existing viewport trajectory datasets, we observe both randomness and continuity in viewport trajectories. Second, to handle the randomness, we design a self-attention-based prediction model that exploits the richer trajectory information available in long inputs. Third, to preserve the continuity of the predicted trajectory, we add a distance constraint to the loss function that suppresses abrupt changes in the prediction results. Finally, experiments on real viewport trajectory datasets show that the proposed algorithm achieves higher prediction accuracy and stability than advanced models.
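The distance-constrained loss can be sketched as a standard regression term plus a penalty on the step length between consecutive predicted viewport points; the penalty form and the weight `lam` are assumptions, not the paper's exact formulation.

```python
# Sketch of a distance-constrained trajectory loss: MSE plus a continuity
# penalty on consecutive predicted viewport displacements.
import torch
import torch.nn.functional as F

def distance_constrained_loss(pred, target, lam=0.1):
    # pred, target: (batch, T, 2) viewport coordinates over a prediction window
    regression = F.mse_loss(pred, target)
    steps = pred[:, 1:] - pred[:, :-1]       # consecutive displacements
    continuity = steps.norm(dim=-1).mean()   # penalize large jumps
    return regression + lam * continuity
```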
Keyword :
Distance constraints; Panoramic video; Self-attention; Viewport prediction
Cite:
GB/T 7714 | Lan, ChengDong, Qiu, Xu, Miao, Chenqi, et al. A self-attention model for viewport prediction based on distance constraint [J]. VISUAL COMPUTER, 2023, 40(9): 5997-6014.
MLA | Lan, ChengDong, et al. "A self-attention model for viewport prediction based on distance constraint." VISUAL COMPUTER 40.9 (2023): 5997-6014.
APA | Lan, ChengDong, Qiu, Xu, Miao, Chenqi, & Zheng, MengTing. A self-attention model for viewport prediction based on distance constraint. VISUAL COMPUTER, 2023, 40(9), 5997-6014.
Abstract :
An effective stream adaptation method for stereoscopic panoramic video transmission is currently lacking: applying the traditional panoramic adaptive streaming strategy to binocular stereoscopic panoramic video doubles the transmitted data and requires huge bandwidth. This paper proposes a multi-agent reinforcement-learning-based asymmetric transmission adaptive streaming method for stereoscopic panoramic video to cope with limited and fluctuating network bandwidth in real time. First, because the human eye favors the saliency regions of a video, each tile in the left and right views of a stereoscopic video contributes differently to perceptual quality, so a tile-based method for predicting the viewing probability of the left and right views is proposed. Second, a multi-agent reinforcement learning framework based on policy-value (Actor-Critic) is designed for joint rate control of the left and right views. Finally, a reward function is designed based on the model structure and the principle of binocular suppression. Experimental results show that the proposed method is better suited to tile-based stereoscopic panoramic video transmission than traditional adaptive streaming strategies, offering a novel approach to joint rate control and user Quality of Experience (QoE) improvement under limited bandwidth. © 2022, Science Press. All rights reserved.
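As a toy illustration of a binocular-suppression-aware reward, the sketch below lets perceived quality follow the better eye more than the worse one, minus standard penalties for rebuffering and quality switches; the mixing weights and linear form are assumptions only.

```python
# Hypothetical QoE-style reward for the two rate-control agents. Binocular
# suppression: perception is dominated by the higher-quality view.
def stereo_qoe_reward(q_left, q_right, rebuffer_s, q_switch,
                      alpha=0.7, beta=4.3, gamma=1.0):
    perceived = alpha * max(q_left, q_right) + (1 - alpha) * min(q_left, q_right)
    return perceived - beta * rebuffer_s - gamma * q_switch
```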
Keyword :
Bandwidth; Image communication systems; Multi agent systems; Quality control; Quality of service; Reinforcement learning; Stereo image processing; Video streaming
Cite:
GB/T 7714 | Lan, Chengdong, Rao, Yingjie, Song, Caixia, et al. Adaptive Streaming of Stereoscopic Panoramic Video Based on Reinforcement Learning [J]. Journal of Electronics and Information Technology, 2022, 44(4): 1461-1468.
MLA | Lan, Chengdong, et al. "Adaptive Streaming of Stereoscopic Panoramic Video Based on Reinforcement Learning." Journal of Electronics and Information Technology 44.4 (2022): 1461-1468.
APA | Lan, Chengdong, Rao, Yingjie, Song, Caixia, & Chen, Jian. Adaptive Streaming of Stereoscopic Panoramic Video Based on Reinforcement Learning. Journal of Electronics and Information Technology, 2022, 44(4), 1461-1468.