Query:
Scholar name: 牛玉贞 (Niu Yuzhen)
Abstract :
Enhancing the quality of underwater images is of great significance to the development of underwater operations. Existing underwater image enhancement methods are usually trained on paired underwater images and reference images; however, in practice it is difficult to obtain reference images corresponding to underwater images, whereas unpaired high-quality underwater images or on-land images are relatively easy to acquire. In addition, existing underwater image enhancement methods can hardly handle diverse distortion types simultaneously. To avoid the dependence on paired training data, further reduce the difficulty of acquiring training data, and cope with diverse underwater image distortion types, this paper proposes an unpaired underwater image enhancement method based on a frequency-decomposed generative adversarial network (FD-GAN) and, on this basis, designs a high/low-frequency dual-branch generator to reconstruct high-quality enhanced underwater images. Specifically, a feature-level wavelet transform is introduced to separate features into low-frequency and high-frequency parts, which are then processed separately within a cycle-consistent generative adversarial network. The low-frequency branch adopts an encoder-decoder structure combined with a low-frequency attention mechanism to enhance image color and brightness, while the high-frequency branch applies parallel high-frequency attention mechanisms to enhance each high-frequency component, thereby restoring image details. Experimental results on multiple standard underwater image datasets show that, whether trained with unpaired high-quality underwater images alone or with some on-land images included, the proposed method can effectively generate high-quality enhanced underwater images, and its effectiveness and generalization are superior to current mainstream underwater image enhancement methods.
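To make the feature-level frequency split above concrete, here is a minimal Python/PyTorch sketch of a 2x2 Haar decomposition applied to feature maps. The function name and the plain strided-slicing implementation are illustrative assumptions, not the FD-GAN authors' actual code.

```python
import torch

def haar_split(feat: torch.Tensor):
    """Split a feature map (B, C, H, W) into Haar subbands.

    Returns the low-frequency approximation (LL) and the three
    high-frequency details (LH, HL, HH), each at half resolution.
    A minimal sketch; the paper's actual wavelet layer may differ.
    """
    a = feat[:, :, 0::2, 0::2]  # top-left of each 2x2 block
    b = feat[:, :, 0::2, 1::2]  # top-right
    c = feat[:, :, 1::2, 0::2]  # bottom-left
    d = feat[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2    # low-frequency: color/brightness content
    lh = (a - b + c - d) / 2    # horizontal detail
    hl = (a + b - c - d) / 2    # vertical detail
    hh = (a - b - c + d) / 2    # diagonal detail
    return ll, (lh, hl, hh)

if __name__ == "__main__":
    x = torch.randn(1, 64, 128, 128)
    ll, highs = haar_split(x)
    print(ll.shape, [h.shape for h in highs])  # each subband is (1, 64, 64, 64)
```

The low-frequency subband would feed the color/brightness branch and the three detail subbands the high-frequency branch, mirroring the dual-branch generator described in the abstract.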
Keyword :
Wavelet transform; Underwater image enhancement; Attention mechanism; Generative adversarial network; High/low-frequency dual-branch generator
Cite:
GB/T 7714 | 牛玉贞 , 张凌昕 , 兰杰 et al. 基于分频式生成对抗网络的非成对水下图像增强 [J]. | 电子学报 , 2025 . |
MLA | 牛玉贞 et al. "基于分频式生成对抗网络的非成对水下图像增强" . | 电子学报 (2025) . |
APA | 牛玉贞 , 张凌昕 , 兰杰 , 许瑞 , 柯逍 . 基于分频式生成对抗网络的非成对水下图像增强 . | 电子学报 , 2025 . |
Abstract :
It is a challenging task to obtain high-quality images in low-light scenarios. While existing low-light image enhancement methods learn the mapping from low-light to clear images, such a straightforward approach lacks targeted design for real-world scenarios, hampering its practical utility. As a result, issues such as overexposure and color distortion are likely to arise when processing images in uneven luminance or extreme darkness. To address these issues, we propose an adaptive luminance enhancement and high-fidelity color correction network (LCNet), which adopts a strategy of enhancing luminance first and then correcting color. Specifically, in the adaptive luminance enhancement stage, we design a multi-stage dual attention residual module (MDARM), which incorporates parallel spatial and channel attention mechanisms within residual blocks. This module extracts a luminance prior from the low-light image to adaptively enhance luminance, while suppressing overexposure in areas with sufficient luminance. In the high-fidelity color correction stage, we design a progressive multi-scale feature fusion module (PMFFM) that combines progressive, stage-wise multi-scale feature fusion with long/short skip connections, enabling thorough interaction between features at different scales across stages. This module extracts and fuses color features with varying receptive fields to ensure accurate and consistent color correction. Furthermore, we introduce a multi-color-space loss to effectively constrain the color correction. These two stages together produce high-quality images with appropriate luminance and high-fidelity color. Extensive experiments on both low-level and high-level tasks demonstrate that our LCNet outperforms state-of-the-art methods and achieves superior performance for low-light image enhancement in real-world scenarios.
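The parallel spatial/channel attention inside a residual block described above can be sketched as follows; the layer sizes, the 7x7 spatial-attention kernel, and the additive fusion are assumptions for illustration, not the published MDARM design.

```python
import torch
import torch.nn as nn

class DualAttentionResBlock(nn.Module):
    """Residual block with parallel channel and spatial attention.

    A minimal sketch of the idea behind a dual attention residual
    module; layer sizes and the fusion rule are illustrative
    assumptions, not the paper's exact design.
    """
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Channel attention: squeeze spatially, excite per channel.
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: squeeze channels, excite per pixel.
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, 1, 7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.body(x)
        attended = feat * self.channel_att(feat) + feat * self.spatial_att(feat)
        return x + attended  # residual connection

if __name__ == "__main__":
    block = DualAttentionResBlock(32)
    print(block(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```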
Keyword :
adaptive luminance enhancement; Distortion; Feature extraction; high-fidelity color correction; Histograms; Image color analysis; Image enhancement; Lighting; Low-light image enhancement; luminance prior; Reflectivity; Signal to noise ratio; Switches; Transformers
Cite:
GB/T 7714 | Niu, Yuzhen , Li, Fusheng , Li, Yuezhou et al. Adaptive Luminance Enhancement and High-Fidelity Color Correction for Low-Light Image Enhancement [J]. | IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING , 2025 , 11 : 732-747 . |
MLA | Niu, Yuzhen et al. "Adaptive Luminance Enhancement and High-Fidelity Color Correction for Low-Light Image Enhancement" . | IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING 11 (2025) : 732-747 . |
APA | Niu, Yuzhen , Li, Fusheng , Li, Yuezhou , Chen, Siling , Chen, Yuzhong . Adaptive Luminance Enhancement and High-Fidelity Color Correction for Low-Light Image Enhancement . | IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING , 2025 , 11 , 732-747 . |
Abstract :
Enhancing the quality of underwater images is crucial for advancements in the fields of underwater exploration and underwater rescue. Existing underwater image enhancement methods typically rely on paired underwater images and reference images for training. However, obtaining corresponding reference images for underwater images is challenging in practice. In contrast, acquiring high-quality unpaired underwater images or images captured on land is relatively more straightforward. Furthermore, existing techniques for underwater image enhancement often struggle to address a variety of distortion types simultaneously. To avoid the reliance on paired training data, reduce the difficulty of acquiring training data, and effectively handle diverse types of underwater image distortions, in this paper we propose a novel unpaired underwater image enhancement method based on the frequency-decomposed generative adversarial network (FD-GAN). We design a dual-branch generator based on high and low frequencies to reconstruct high-quality underwater images. Specifically, a feature-level wavelet transform is introduced to separate the features into low-frequency and high-frequency parts. The separated features are then processed by a cycle-consistent generative adversarial network, so as to simultaneously enhance the color and luminance in the low-frequency component and the details in the high-frequency part. More specifically, the low-frequency branch employs an encoder-decoder structure with a low-frequency attention mechanism to enhance the color and brightness of the image. The high-frequency branch utilizes parallel high-frequency attention mechanisms to enhance the various high-frequency components, thereby achieving the restoration of image details. Experimental results on multiple datasets show that the proposed method, trained with either unpaired high-quality underwater images alone or together with on-land images, can effectively generate high-quality enhanced underwater images, and it is superior to state-of-the-art underwater image enhancement methods in terms of effectiveness and generalization. © 2025 Chinese Institute of Electronics. All rights reserved.
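A brief hedged sketch of the cycle-consistency constraint referenced above; the generator names, the L1 reconstruction terms, and the weight `lam` are illustrative assumptions rather than FD-GAN's exact loss.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_uw2clear, G_clear2uw, underwater, clear, lam=10.0):
    """Unpaired cycle loss in the CycleGAN style.

    G_uw2clear / G_clear2uw are the two generators (e.g. the dual-branch
    generators sketched earlier); `underwater` and `clear` are unpaired
    image batches. The names and the weight `lam` are assumptions.
    """
    # Forward cycle: underwater -> enhanced -> reconstructed underwater.
    rec_uw = G_clear2uw(G_uw2clear(underwater))
    # Backward cycle: clear -> degraded -> reconstructed clear.
    rec_clear = G_uw2clear(G_clear2uw(clear))
    return lam * (F.l1_loss(rec_uw, underwater) + F.l1_loss(rec_clear, clear))
```

This term would be added to the usual adversarial losses so that training can proceed without paired reference images.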
Keyword :
Color image processing; Image coding; Image compression; Image enhancement; Photointerpretation; Underwater photography; Wavelet decomposition
Cite:
GB/T 7714 | Niu, Yu-Zhen , Zhang, Ling-Xin , Lan, Jie et al. FD-GAN: Frequency-Decomposed Generative Adversarial Network for Unpaired Underwater Image Enhancement [J]. | Acta Electronica Sinica , 2025 , 53 (2) : 527-544 . |
MLA | Niu, Yu-Zhen et al. "FD-GAN: Frequency-Decomposed Generative Adversarial Network for Unpaired Underwater Image Enhancement" . | Acta Electronica Sinica 53 . 2 (2025) : 527-544 . |
APA | Niu, Yu-Zhen , Zhang, Ling-Xin , Lan, Jie , Xu, Rui , Ke, Xiao . FD-GAN: Frequency-Decomposed Generative Adversarial Network for Unpaired Underwater Image Enhancement . | Acta Electronica Sinica , 2025 , 53 (2) , 527-544 . |
Abstract :
Camouflaged object detection (COD) aims to resolve the tough issue of accurately segmenting objects hidden in their surroundings. However, existing methods suffer from two major problems: the incomplete interior and the inaccurate boundary of the object. To address these difficulties, we propose a three-stage skeleton-boundary-guided network (SBGNet) for the COD task. Specifically, we design a novel skeleton-boundary label that is complementary to the typical pixel-wise mask annotation, emphasizing the interior skeleton and the boundary of the camouflaged object. Furthermore, the proposed feature guidance module (FGM) leverages the skeleton-boundary feature to guide the model to focus on both the interior and the boundary of the camouflaged object. Besides, we design a bidirectional feature flow path with the information interaction module (IIM) to propagate and integrate the semantic and texture information. Finally, we propose the dual feature distillation module (DFDM) to progressively refine the segmentation results in a fine-grained manner. Comprehensive experiments demonstrate that our SBGNet outperforms 20 state-of-the-art methods on three benchmarks in both qualitative and quantitative comparisons. CCS Concepts: • Computing methodologies → Scene understanding.
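The skeleton-boundary label described above could plausibly be derived from a binary mask as in the following sketch; the use of `skimage` skeletonization and a one-pixel boundary is an assumption for illustration, not the SBGNet authors' annotation pipeline.

```python
import numpy as np
from skimage.morphology import skeletonize, binary_erosion

def skeleton_boundary_label(mask: np.ndarray) -> np.ndarray:
    """Derive a skeleton-boundary label from a binary object mask.

    Combines the morphological skeleton (interior structure) with the
    one-pixel boundary (mask minus its erosion). A minimal sketch of the
    label construction suggested by the abstract; the paper's exact
    recipe (e.g. any dilation of the skeleton or boundary) may differ.
    """
    mask = mask.astype(bool)
    skeleton = skeletonize(mask)                 # thin interior structure
    boundary = mask & ~binary_erosion(mask)      # one-pixel object contour
    return (skeleton | boundary).astype(np.uint8)

if __name__ == "__main__":
    toy = np.zeros((64, 64), dtype=np.uint8)
    toy[16:48, 16:48] = 1                        # toy square "object"
    label = skeleton_boundary_label(toy)
    print(label.sum(), "labelled pixels")
```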
Keyword :
Bidirectional feature flow path; Camouflaged object detection; Feature distillation; Skeleton-boundary guidance
Cite:
GB/T 7714 | Niu, Yuzhen , Xu, Yeyuan , Li, Yuezhou et al. Skeleton-Boundary-Guided Network for Camouflaged Object Detection [J]. | ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS , 2025 , 21 (3) . |
MLA | Niu, Yuzhen et al. "Skeleton-Boundary-Guided Network for Camouflaged Object Detection" . | ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 21 . 3 (2025) . |
APA | Niu, Yuzhen , Xu, Yeyuan , Li, Yuezhou , Zhang, Jiabang , Chen, Yuzhong . Skeleton-Boundary-Guided Network for Camouflaged Object Detection . | ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS , 2025 , 21 (3) . |
Abstract :
Image aesthetic assessment (IAA) has drawn wide attention in recent years. This task aims to predict the aesthetic quality of images by simulating the human aesthetic perception mechanism, thereby assisting users in selecting images with higher aesthetic value. For IAA, the local information and the various kinds of global semantic information contained in an image, such as composition, theme, and emotion, all play a crucial role. Existing CNN-based methods attempt to use multi-branch strategies to extract local and global semantic information related to IAA from images. However, these methods can only extract limited and specific global semantic information, and they require additional labeled datasets. Furthermore, some cross-modal IAA methods have been proposed to use both images and user comments, but they often fail to fully explore the valuable information within each modality and the correlations between cross-modal features, affecting cross-modal IAA accuracy. Considering these limitations, in this paper we propose a cross-modal IAA model that progressively fuses local and global image features. The model consists of a progressive local and global image feature fusion branch, a text feature enhancement branch, and a cross-modal feature fusion module. In the image branch, we introduce an inter-layer feature fusion module (IFFM) and adopt a progressive way to interact and fuse the extracted local and global features to obtain more comprehensive image features. In the text branch, we propose a text feature enhancement module (TFEM) to strengthen the extracted text features, so as to mine more effective textual information. Meanwhile, considering the intrinsic correlation between image and text features, we propose a cross-modal feature fusion module (CFFM) to integrate and fuse image features with text features for aesthetic assessment. Experimental results on the AVA (Aesthetic Visual Analysis) dataset validate the superiority of our method for the IAA task.
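A minimal sketch of cross-modal fusion between image and text tokens, in the spirit of the CFFM described above; the single cross-attention layer, the dimensions, and the mean pooling are assumptions, not the published module.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Fuse image and text tokens with cross-attention.

    Image tokens attend to text (user-comment) tokens and the result is
    pooled for aesthetic score regression. Dimensions and the single
    attention layer are illustrative assumptions.
    """
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # predicts an aesthetic score

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor):
        # Image features query the comment (text) features.
        fused, _ = self.cross_attn(img_tokens, txt_tokens, txt_tokens)
        return self.head(fused.mean(dim=1))  # pool tokens, then regress

if __name__ == "__main__":
    model = CrossModalFusion()
    score = model(torch.randn(2, 49, 256), torch.randn(2, 32, 256))
    print(score.shape)  # torch.Size([2, 1])
```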
Keyword :
Cross-modality; Feature fusion; Image aesthetic assessment; Local and global features; Textual information
Cite:
GB/T 7714 | Niu, Yuzhen , Chen, Siling , Chen, Shanshan et al. Progressive fusion of local and global image features for cross-modal image aesthetic assessment [J]. | MULTIMEDIA SYSTEMS , 2025 , 31 (2) . |
MLA | Niu, Yuzhen et al. "Progressive fusion of local and global image features for cross-modal image aesthetic assessment" . | MULTIMEDIA SYSTEMS 31 . 2 (2025) . |
APA | Niu, Yuzhen , Chen, Siling , Chen, Shanshan , Li, Fusheng . Progressive fusion of local and global image features for cross-modal image aesthetic assessment . | MULTIMEDIA SYSTEMS , 2025 , 31 (2) . |
Abstract :
Low-light image enhancement (LLIE) is a challenging task, due to the multiple degradation problems involved, such as low brightness, color distortion, heavy noise, and detail degradation. Existing deep learning-based LLIE methods mainly use encoder-decoder networks or full-resolution networks, which excel at extracting context or detail information, respectively. Since detail and context information are both required for LLIE, existing methods cannot solve all the degradation problems. To solve the above problem, we propose an LLIE method based on collaboratively enhanced and integrated detail-context information (CoEIDC). Specifically, we propose a full-resolution network with two collaborative subnetworks, namely the detail extraction and enhancement subnetwork (DE2-Net) and context extraction and enhancement subnetwork (CE2-Net). CE2-Net extracts context information from the features of DE2-Net at different stages through large receptive field convolutions. Moreover, a collaborative attention module (CAM) and a detail-context integration module are proposed to enhance and integrate detail and context information. CAM is reused to enhance the detail features from multi-receptive fields and the context features from multiple stages. Extensive experimental results demonstrate that our method outperforms the state-of-the-art LLIE methods, and is applicable to other image enhancement tasks, such as underwater image enhancement.
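As a rough illustration of pairing a full-resolution detail path with a large-receptive-field context path, consider the toy branches below; the channel counts, dilation rates, and concatenation-based fusion are assumptions, not the CoEIDC architecture.

```python
import torch
import torch.nn as nn

class DetailContextSketch(nn.Module):
    """Toy detail/context branch pair with a simple fusion.

    A hedged sketch of the idea in the abstract: a full-resolution
    detail path plus a large-receptive-field (dilated) context path,
    integrated by concatenation and a 1x1 convolution. All sizes are
    illustrative assumptions.
    """
    def __init__(self, channels: int = 32):
        super().__init__()
        self.detail = nn.Sequential(   # small receptive field, preserves detail
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.context = nn.Sequential(  # dilated convs widen the receptive field
            nn.Conv2d(channels, channels, 3, padding=4, dilation=4), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=8, dilation=8), nn.ReLU(inplace=True),
        )
        self.fuse = nn.Conv2d(2 * channels, 3, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d = self.detail(x)                        # detail features from the input
        c = self.context(d)                       # context extracted from detail features
        return self.fuse(torch.cat([d, c], dim=1))

if __name__ == "__main__":
    print(DetailContextSketch()(torch.randn(1, 3, 128, 128)).shape)  # (1, 3, 128, 128)
```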
Keyword :
Collaborative enhancement and integration; Color/brightness correction; Detail reconstruction; Low-light image enhancement
Cite:
GB/T 7714 | Niu, Yuzhen , Lin, Xiaofeng , Xu, Huangbiao et al. Collaboratively enhanced and integrated detail-context information for low-light image enhancement [J]. | PATTERN RECOGNITION , 2025 , 162 . |
MLA | Niu, Yuzhen et al. "Collaboratively enhanced and integrated detail-context information for low-light image enhancement" . | PATTERN RECOGNITION 162 (2025) . |
APA | Niu, Yuzhen , Lin, Xiaofeng , Xu, Huangbiao , Xu, Rui , Chen, Yuzhong . Collaboratively enhanced and integrated detail-context information for low-light image enhancement . | PATTERN RECOGNITION , 2025 , 162 . |
Abstract :
Pedestrian attribute recognition (PAR) involves accurately identifying multiple attributes present in pedestrian images. There are two main approaches for PAR: part-based methods and attention-based methods. The former rely on existing segmentation or region detection methods to localize body parts and learn the corresponding attribute-specific features from the corresponding regions, so their performance heavily depends on the accuracy of body region localization. The latter adopt embedded attention modules or transformer attention to exploit detailed features. However, they can focus on certain body regions but often provide coarse attention, failing to capture fine-grained details, and the learned features may also be interfered with by irrelevant information. Meanwhile, these methods overlook global contextual information. This work argues for replacing coarse attention with detailed attention and integrating it with global contextual features from ViT to jointly represent attribute-specific regions. To this end, we propose a High-order Diversity Feature Learning (HDFL) method for PAR based on ViT. We utilize a polynomial predictor to design an Attribute-specific Detailed Feature Exploration (ADFE) module, which constructs high-order statistics and gains more fine-grained features. Our ADFE module is a parameter-friendly method that provides flexibility in deciding its utilization during the inference phase. A Soft-redundancy Perception Loss (SPLoss) is proposed to adaptively measure the redundancy between features of different orders, which promotes diverse characterization of features. Experiments on several PAR datasets show that our method achieves new state-of-the-art (SOTA) performance. On the most challenging PA100K dataset, our method outperforms the previous SOTA by 1.69% and achieves the highest mA of 84.92%.
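The polynomial-predictor idea behind ADFE can be sketched as per-order linear heads applied to successive elementwise powers of the feature, as below; the number of orders and the elementwise-power construction are assumptions for illustration, not the published module.

```python
import torch
import torch.nn as nn

class PolynomialPredictor(nn.Module):
    """Attribute logits from first- and higher-order feature terms.

    Each order gets its own linear head and the logits are summed.
    The order count and the elementwise powers used for higher-order
    terms are illustrative assumptions.
    """
    def __init__(self, dim: int, num_attrs: int, orders: int = 2):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(dim, num_attrs) for _ in range(orders)])

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        logits = 0.0
        term = feat
        for head in self.heads:
            logits = logits + head(term)   # add this order's contribution
            term = term * feat             # next elementwise power of the feature
        return logits

if __name__ == "__main__":
    pred = PolynomialPredictor(dim=768, num_attrs=26)
    print(pred(torch.randn(4, 768)).shape)  # torch.Size([4, 26])
```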
Keyword :
High-order diversity feature learning; Pedestrian attribute recognition; Soft-redundancy perception loss
Cite:
GB/T 7714 | Wu, Junyi , Huang, Yan , Gao, Min et al. High-order diversity feature learning for pedestrian attribute recognition [J]. | NEURAL NETWORKS , 2025 , 188 . |
MLA | Wu, Junyi et al. "High-order diversity feature learning for pedestrian attribute recognition" . | NEURAL NETWORKS 188 (2025) . |
APA | Wu, Junyi , Huang, Yan , Gao, Min , Niu, Yuzhen , Chen, Yuzhong , Wu, Qiang . High-order diversity feature learning for pedestrian attribute recognition . | NEURAL NETWORKS , 2025 , 188 . |
Abstract :
Pedestrian Attribute Recognition (PAR) plays a crucial role in various computer vision applications, demanding precise and reliable identification of attributes from pedestrian images. Traditional PAR methods, though effective in leveraging attention mechanisms, often suffer from the lack of direct supervision on attention, leading to potential overfitting and misallocation. This paper introduces a novel and model-agnostic approach, Attention-Aware Regularization (AAR), which rethinks the attention mechanism by integrating causal reasoning to provide direct supervision of attention maps. AAR employs perturbation techniques and a unique optimization objective to assess and refine attention quality, encouraging the model to prioritize attribute-specific regions. Our method demonstrates significant improvement in PAR performance by mitigating the effects of incorrect attention and fostering a more effective attention mechanism. Experiments on standard datasets showcase the superiority of our approach over existing methods, setting a new benchmark for attention-driven PAR models.
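A toy illustration of perturbation-based supervision of attention, in the general spirit of the approach above; the occlusion scheme, the sigmoid outputs, and the penalty form are assumptions and not the paper's AAR objective.

```python
import torch
import torch.nn.functional as F

def attention_perturbation_penalty(model, images, attn_maps, labels):
    """Toy perturbation check on attention maps.

    Perturb what the model attends to and verify the prediction depends
    on it. `model` maps images to attribute logits and `attn_maps` are
    (B, 1, H, W) maps in [0, 1]; both interfaces are assumptions.
    """
    # Occlude the attended regions by downweighting high-attention pixels.
    occluded = images * (1.0 - attn_maps)
    with torch.no_grad():
        ref = torch.sigmoid(model(images))       # reference confidences
    pert = torch.sigmoid(model(occluded))        # confidences after occlusion
    # If attention covers truly relevant regions, occluding them should
    # reduce confidence on the positive attributes; penalize otherwise.
    drop = (ref - pert) * labels                 # confidence drop on positives
    return F.relu(-drop).mean()                  # penalize non-drops only
```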
Keyword :
Attention-aware regularization; Attention mechanism; Pedestrian attribute recognition
Cite:
GB/T 7714 | Wu, Junyi , Huang, Yan , Gao, Min et al. Rethinking attention mechanism for enhanced pedestrian attribute recognition [J]. | NEUROCOMPUTING , 2025 , 639 . |
MLA | Wu, Junyi et al. "Rethinking attention mechanism for enhanced pedestrian attribute recognition" . | NEUROCOMPUTING 639 (2025) . |
APA | Wu, Junyi , Huang, Yan , Gao, Min , Niu, Yuzhen , Chen, Yuzhong , Wu, Qiang . Rethinking attention mechanism for enhanced pedestrian attribute recognition . | NEUROCOMPUTING , 2025 , 639 . |
Abstract :
Pedestrian Attribute Recognition (PAR) involves identifying the attributes of individuals in person images. Existing PAR methods typically rely on CNNs as the backbone network to extract pedestrian features. However, CNNs process only one adjacent region at a time, leading to the loss of long-range inter-relations between different attribute-specific regions. To address this limitation, we leverage the Vision Transformer (ViT) instead of CNNs as the backbone for PAR, aiming to model long-range relations and extract more robust features. However, PAR suffers from an inherent attribute imbalance issue, causing ViT to naturally focus more on attributes that appear frequently in the training set and ignore pedestrian attributes that appear less frequently. The native features extracted by ViT are not able to tolerate this imbalanced attribute distribution. To tackle this issue, we propose two novel components: the Selective Feature Activation Method (SFAM) and the Orthogonal Feature Activation Loss. SFAM smartly suppresses the more informative attribute-specific features, compelling the PAR model to capture discriminative features from regions that are easily overlooked. The proposed loss enforces an orthogonal constraint on the original features extracted by ViT and the suppressed features from SFAM, promoting the complementarity of features in space. We conduct experiments on several benchmark PAR datasets, including PETA, PA100K, RAPv1, and RAPv2, demonstrating the effectiveness of our method. Specifically, our method outperforms existing state-of-the-art approaches, including GRL, IAA-Caps, ALM, and SSC, in terms of mA on the four datasets. Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
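The two components above can be sketched loosely as channel suppression plus an orthogonality penalty; the mean-activation channel criterion, the suppression ratio, and the cosine-based loss are assumptions for illustration, not the published SFAM or loss.

```python
import torch
import torch.nn.functional as F

def suppress_top_channels(feat: torch.Tensor, ratio: float = 0.3) -> torch.Tensor:
    """Zero out the most strongly activated feature channels.

    Suppressing the dominant channels forces the classifier to use
    easily overlooked ones. The channel criterion (mean absolute
    activation) and the ratio are illustrative assumptions.
    """
    strength = feat.abs().mean(dim=0)            # (D,) per-channel strength
    k = max(1, int(ratio * feat.shape[1]))
    top = strength.topk(k).indices
    suppressed = feat.clone()
    suppressed[:, top] = 0.0
    return suppressed

def orthogonal_activation_loss(original: torch.Tensor, suppressed: torch.Tensor) -> torch.Tensor:
    """Encourage the suppressed features to be orthogonal to the originals."""
    cos = F.cosine_similarity(original, suppressed, dim=1)
    return cos.abs().mean()

if __name__ == "__main__":
    f = torch.randn(8, 768)                      # e.g. ViT [CLS] features (assumed shape)
    s = suppress_top_channels(f)
    print(orthogonal_activation_loss(f, s).item())
```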
Keyword :
Artificial intelligence; Chemical activation
Cite:
GB/T 7714 | Wu, Junyi , Huang, Yan , Gao, Min et al. Selective and Orthogonal Feature Activation for Pedestrian Attribute Recognition [C] . 2024 : 6039-6047 . |
MLA | Wu, Junyi et al. "Selective and Orthogonal Feature Activation for Pedestrian Attribute Recognition" . (2024) : 6039-6047 . |
APA | Wu, Junyi , Huang, Yan , Gao, Min , Niu, Yuzhen , Yang, Mingjing , Gao, Zhipeng et al. Selective and Orthogonal Feature Activation for Pedestrian Attribute Recognition . (2024) : 6039-6047 . |
Abstract :
Super-Resolution (SR) algorithms aim to enhance the resolution of images. Numerous deep-learning-based SR techniques have emerged in recent years. In such cases, a visually appealing output may contain additional details compared with its reference image. Accordingly, fully referenced Image Quality Assessment (IQA) cannot work well; however, reference information remains essential for evaluating the quality of SR images. This poses a challenge to SR-IQA: how to balance the referenced and no-reference scores for user perception? In this paper, we propose a Perception-driven Similarity-Clarity Tradeoff (PSCT) model for SR-IQA. Specifically, we investigate this problem from both referenced and no-reference perspectives, and design two deep-learning-based modules to obtain referenced and no-reference scores. We present a theoretical analysis of their tradeoff based on Human Visual System (HVS) properties and also calculate adaptive weights for them. Experimental results indicate that our PSCT model is superior to the state-of-the-art methods on SR-IQA. In addition, the proposed PSCT model is also capable of evaluating quality scores in other image enhancement scenarios, such as deraining, dehazing, and underwater image enhancement. The source code is available at https://github.com/kekezhang112/PSCT.
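The adaptive weighting between referenced and no-reference scores can be sketched as below; the small weight-prediction MLP and its inputs are assumptions, not the PSCT model's actual design.

```python
import torch
import torch.nn as nn

class SimilarityClarityTradeoff(nn.Module):
    """Blend a referenced score and a no-reference score with a learned weight.

    Two scoring branches produce a similarity (referenced) score and a
    clarity (no-reference) score, and a small network predicts the
    mixing weight from their feature summaries. The weight predictor and
    its inputs are illustrative assumptions.
    """
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Linear(2 * feat_dim, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, ref_score, nr_score, ref_feat, nr_feat):
        w = self.weight_net(torch.cat([ref_feat, nr_feat], dim=1))  # adaptive weight in (0, 1)
        return w * ref_score + (1.0 - w) * nr_score

if __name__ == "__main__":
    m = SimilarityClarityTradeoff()
    q = m(torch.rand(4, 1), torch.rand(4, 1), torch.randn(4, 128), torch.randn(4, 128))
    print(q.shape)  # torch.Size([4, 1])
```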
Keyword :
Adaptation models; Distortion; Feature extraction; Image quality assessment; image super-resolution; Measurement; perception-driven; Quality assessment; similarity-clarity tradeoff; Superresolution; Task analysis
Cite:
GB/T 7714 | Zhang, Keke , Zhao, Tiesong , Chen, Weiling et al. Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment [J]. | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2024 , 34 (7) : 5897-5907 . |
MLA | Zhang, Keke et al. "Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment" . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 34 . 7 (2024) : 5897-5907 . |
APA | Zhang, Keke , Zhao, Tiesong , Chen, Weiling , Niu, Yuzhen , Hu, Jinsong , Lin, Weisi . Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2024 , 34 (7) , 5897-5907 . |