Query:
Scholar name: Liu Wenxi (刘文犀)
Abstract :
Crowd counting has drawn increasing attention across various fields. However, existing crowd counting tasks primarily focus on estimating the overall population, ignoring the behavioral and semantic information of different social groups within the crowd. In this paper, we aim to address a newly proposed research problem, namely fine-grained crowd counting, which involves identifying different categories of individuals and accurately counting them in static images. To fully leverage the categorical information in static crowd images, we propose a two-tier salient feature propagation module designed to sequentially extract semantic information from both the crowd and its surrounding environment. Additionally, we introduce a category difference loss to refine the feature representation by highlighting the differences between various crowd categories. Moreover, our proposed framework can adapt to a novel problem setup called few-example fine-grained crowd counting. Unlike the original fine-grained crowd counting, this setup requires only a few exemplar point annotations instead of dense annotations from predefined categories, making it applicable in a wider range of scenarios. The baseline model for this task can be established by substituting the loss function in our proposed model with a novel hybrid loss that integrates point-oriented cross-entropy loss and category contrastive loss. Through comprehensive experiments, we present results for both the original formulation and the few-example extension of fine-grained crowd counting.
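A minimal PyTorch sketch of one plausible realization of the category difference loss described above. The abstract gives no formulation, so the soft category masks, the hinge margin, and the name `category_difference_loss` are all assumptions, not the paper's actual loss:

```python
import torch
import torch.nn.functional as F

def category_difference_loss(feats, cat_masks, margin=1.0):
    """Hypothetical sketch: push the mask-weighted mean embedding of each
    crowd category away from the others by a hinge on cosine similarity.

    feats:     (B, D, H, W) backbone feature map.
    cat_masks: (B, C, H, W) soft assignment of pixels to C categories.
    """
    B, D, H, W = feats.shape
    C = cat_masks.shape[1]
    f = feats.flatten(2)                                  # (B, D, H*W)
    m = cat_masks.flatten(2)                              # (B, C, H*W)
    # Mask-weighted mean embedding per category: (B, C, D)
    means = torch.einsum('bdn,bcn->bcd', f, m) / (m.sum(-1, keepdim=True) + 1e-6)
    means = F.normalize(means, dim=-1)
    # Pairwise cosine similarity between category prototypes: (B, C, C)
    sim = torch.einsum('bcd,bkd->bck', means, means)
    sim = sim - torch.eye(C, device=sim.device)           # zero out self-similarity
    # Hinge: penalize category pairs whose prototypes are too similar
    return F.relu(sim - (1.0 - margin)).sum() / (B * C * (C - 1) + 1e-6)
```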
Keyword :
Adaptation models; Annotations; Contrastive learning; Crowd counting; Feature extraction; Few-example fine-grained crowd counting; Fine-grained crowd counting; Fuses; Meteorology; Propagation losses; Semantics; Social groups; Visualization
Cite:
GB/T 7714: Zhang, Meijing, Chen, Mengxue, Li, Qi, et al. Category-Contrastive Fine-Grained Crowd Counting and Beyond [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27: 477-488.
MLA: Zhang, Meijing, et al. "Category-Contrastive Fine-Grained Crowd Counting and Beyond." IEEE TRANSACTIONS ON MULTIMEDIA 27 (2025): 477-488.
APA: Zhang, Meijing, Chen, Mengxue, Li, Qi, Chen, Yanchen, Lin, Rui, Li, Xiaolian, et al. Category-Contrastive Fine-Grained Crowd Counting and Beyond. IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27, 477-488.
Abstract :
U-Net is a classic architecture for semantic segmentation. However, it has several limitations, such as difficulty in capturing complex image detail due to its simple U structure, long convergence time arising from fixed network parameters, and suboptimal efficacy in decoding and restoring multi-scale information. To deal with these issues, we propose a Multiple U-shaped network (Multi-UNet), based on the assumption that constructing an appropriate U-shaped structure can achieve better segmentation performance. Firstly, inspired by the concept of connecting multiple similar blocks, our Multi-UNet consists of multiple U-block modules, with each succeeding module directly connected to the previous one to facilitate data transmission between different U structures. We refer to the original bridge connections of U-Net as Intra-U connections and introduce a new type of connection called Inter-U connections. These Inter-U connections aim to retain as much detailed information as possible, enabling effective detection of complex images. Secondly, while maintaining Mean Intersection over Union (Mean-IoU), the up-sampling of each U applies uniformly small channel values to reduce the number of model parameters. Thirdly, a Spatial-Channel Parallel Attention Fusion (SCPAF) module is designed at the initial layer of every subsampling module of the U-block architecture. It enhances feature extraction and alleviates the computational overhead associated with data transmission. Finally, we replace the final up-sampling module with an Atrous Spatial Pyramid Pooling Head (ASPPHead) to accomplish seamless multi-scale feature extraction. Experiments comparing Multi-UNet against advanced models on three public datasets show that it is superior in both universality and accuracy.
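The abstract does not specify the internals of the SCPAF module. Below is a minimal sketch of a parallel spatial/channel attention fusion in the spirit described, assuming a squeeze-and-excitation-style channel branch and a single-channel spatial branch; both branch designs are assumptions:

```python
import torch.nn as nn

class SCPAFSketch(nn.Module):
    """Hypothetical sketch of Spatial-Channel Parallel Attention Fusion:
    a channel-attention branch and a spatial-attention branch run in
    parallel, and their re-weighted features are fused by addition.
    The real SCPAF module may differ in detail."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_att = nn.Sequential(          # SE-style channel gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_att = nn.Sequential(           # per-pixel spatial gate
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Parallel gating, fused by addition
        return x * self.channel_att(x) + x * self.spatial_att(x)
```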
Keyword :
Multiple U-shaped network; Semantic segmentation; U-net
Cite:
GB/T 7714: Zhao, Qiangwei, Cao, Jingjing, Ge, Junjie, et al. Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation [J]. KNOWLEDGE-BASED SYSTEMS, 2025, 309.
MLA: Zhao, Qiangwei, et al. "Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation." KNOWLEDGE-BASED SYSTEMS 309 (2025).
APA: Zhao, Qiangwei, Cao, Jingjing, Ge, Junjie, Zhu, Qi, Chen, Xiaoming, Liu, Wenxi. Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation. KNOWLEDGE-BASED SYSTEMS, 2025, 309.
Abstract :
Diagnostic pathology, historically dependent on visual scrutiny by experts, is essential for disease detection. Advances in digital pathology and developments in computer vision technology have led to the application of artificial intelligence (AI) in this field. Despite these advancements, variability in pathologists' subjective interpretations of diagnostic criteria can lead to inconsistent outcomes. To meet the need for precision in cancer therapies, there is an increasing demand for accurate pathological diagnoses. Consequently, traditional diagnostic pathology is evolving towards "next-generation diagnostic pathology", prioritizing the development of a multi-dimensional, intelligent diagnostic approach. Using nonlinear optical effects arising from the interaction of light with biological tissues, multiphoton microscopy (MPM) enables high-resolution, label-free imaging of multiple intrinsic components across various human pathological tissues. AI-empowered MPM further improves the accuracy and efficiency of diagnosis, holding promise for auxiliary pathology diagnostic methods based on multiphoton diagnostic criteria. In this review, we systematically outline the applications of MPM in pathological diagnosis across various human diseases and summarize common multiphoton diagnostic features. Moreover, we examine the significant role of AI in enhancing multiphoton pathological diagnosis, including image preprocessing, refined differential diagnosis, and prognostication of outcomes. We also discuss the challenges and prospects of integrating MPM and AI, encompassing equipment, datasets, analytical models, and integration into existing clinical pathways. Finally, the review explores the synergy between AI and label-free MPM to forge novel diagnostic frameworks, aiming to accelerate the adoption and implementation of intelligent multiphoton pathology systems in clinical settings. In short, AI-empowered multiphoton microscopy enhances diagnostic accuracy and efficiency for various human diseases, evolving towards next-generation diagnostic pathology with an endogenous, multi-dimensional, and intelligent approach.
Cite:
GB/T 7714: Wang, Shu, Pan, Junlin, Zhang, Xiao, et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy [J]. LIGHT-SCIENCE & APPLICATIONS, 2024, 13 (1).
MLA: Wang, Shu, et al. "Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy." LIGHT-SCIENCE & APPLICATIONS 13.1 (2024).
APA: Wang, Shu, Pan, Junlin, Zhang, Xiao, Li, Yueying, Liu, Wenxi, Lin, Ruolan, et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy. LIGHT-SCIENCE & APPLICATIONS, 2024, 13 (1).
Abstract :
In recent years, the rapid advancement of image generation techniques has resulted in the widespread abuse of manipulated images, leading to a crisis of trust and affecting social equity. Thus, the goal of our work is to detect and localize tampered regions in images. Many deep learning based approaches have been proposed to address this problem, but they can hardly handle tampered regions that are manually fine-tuned to blend into the image background. Observing that the boundaries of tampered regions are critical to separating tampered and non-tampered parts, we present a novel boundary-guided approach to image manipulation detection, which introduces an inherent bias towards exploiting the boundary information of tampered regions. Our model follows an encoder-decoder architecture with multi-scale localization mask prediction, and is guided to utilize prior boundary knowledge through an attention mechanism and contrastive learning. In particular, our model is unique in that 1) we propose a boundary-aware attention module in the network decoder, which predicts the boundary of tampered regions and uses it as a crucial contextual cue to facilitate localization; and 2) we propose a multi-scale contrastive learning scheme with a novel boundary-guided sampling strategy, leading to more discriminative localization features. Our state-of-the-art performance on several public benchmarks demonstrates the superiority of our model over prior works.
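As a rough illustration of the boundary-aware attention idea — predicting a tampered-region boundary map and reusing it to re-weight decoder features — here is a hedged sketch; the module name, layer choices, and fusion rule are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class BoundaryAwareAttentionSketch(nn.Module):
    """Hypothetical sketch: a small head predicts tampered-region
    boundaries; the sigmoid-normalized boundary map then amplifies
    decoder features near predicted boundaries."""
    def __init__(self, channels):
        super().__init__()
        self.boundary_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats):
        boundary_logits = self.boundary_head(feats)   # supervised by boundary GT
        attn = torch.sigmoid(boundary_logits)         # (B, 1, H, W) in [0, 1]
        enhanced = self.fuse(feats * (1.0 + attn))    # emphasize boundary context
        return enhanced, boundary_logits
```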
Keyword :
Contrastive learning; Decoding; Deepfakes; Feature extraction; Image manipulation detection/localization; Location awareness; Task analysis; Visualization
Cite:
GB/T 7714: Liu, Wenxi, Zhang, Hao, Lin, Xinyang, et al. Attentive and Contrastive Image Manipulation Localization With Boundary Guidance [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19: 6764-6778.
MLA: Liu, Wenxi, et al. "Attentive and Contrastive Image Manipulation Localization With Boundary Guidance." IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY 19 (2024): 6764-6778.
APA: Liu, Wenxi, Zhang, Hao, Lin, Xinyang, Zhang, Qing, Li, Qi, Liu, Xiaoxiang, et al. Attentive and Contrastive Image Manipulation Localization With Boundary Guidance. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19, 6764-6778.
Abstract :
HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited by expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map, formed by road layout and vehicle occupancy in the bird's-eye view, given only a front-view monocular image. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen the view transformation and scene understanding. In addition, we apply multi-scale FTVP modules to propagate the rich spatial information of low-level features, mitigating the spatial deviation of predicted object locations. Experiments on public benchmarks show that our method handles road layout estimation, vehicle occupancy estimation, and multi-class semantic estimation at a performance level comparable to the state of the art, while maintaining superior efficiency.
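The cycle-consistency constraint behind the FTVP module can be pictured with a minimal sketch: project front-view feature tokens to the top view and back, and penalize the round-trip error. The linear projections and L1 penalty below are illustrative assumptions, not the paper's actual projection design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CycleViewProjectionSketch(nn.Module):
    """Hypothetical sketch of cross-view cycle consistency: a
    front-to-top transform and a top-to-front transform whose
    composition should approximately reproduce the input."""
    def __init__(self, dim):
        super().__init__()
        self.front_to_top = nn.Linear(dim, dim)
        self.top_to_front = nn.Linear(dim, dim)

    def forward(self, front_feats):
        # front_feats: (B, N, dim) flattened front-view feature tokens
        top = self.front_to_top(front_feats)
        front_rec = self.top_to_front(top)
        cycle_loss = F.l1_loss(front_rec, front_feats)  # round-trip penalty
        return top, cycle_loss
```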
Keyword :
autonomous driving; BEV perception; segmentation
Cite:
GB/T 7714: Liu, Wenxi, Li, Qi, Yang, Weixiang, et al. Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (9): 6109-6125.
MLA: Liu, Wenxi, et al. "Monocular BEV Perception of Road Scenes via Front-to-Top View Projection." IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 46.9 (2024): 6109-6125.
APA: Liu, Wenxi, Li, Qi, Yang, Weixiang, Cai, Jiaxin, Yu, Yuanlong, Ma, Yuexin, et al. Monocular BEV Perception of Road Scenes via Front-to-Top View Projection. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (9), 6109-6125.
Abstract :
Nuclei segmentation and classification play a crucial role in pathology diagnosis, enabling pathologists to analyze cellular characteristics accurately. Overlapping cluster nuclei, misdetection of small-scale nuclei, and pleomorphic-nuclei-induced misclassification have always been major challenges in nuclei segmentation and classification tasks. To this end, we introduce an auxiliary task of nuclei boundary-guided contrastive learning to enhance the representativeness and discriminative power of visual features, particularly for addressing the challenge posed by the unclear contours of adherent nuclei and small nuclei. In addition, misclassifications resulting from pleomorphic nuclei often exhibit low classification confidence, indicating a high level of uncertainty. To mitigate misclassification, we capitalize on the characteristic clustering of similar cells to propose a locality-aware class embedding module, offering a regional perspective to capture category information. Moreover, we address uncertain classification in densely aggregated nuclei by designing a top-k uncertainty attention module that leverages deep features to enhance shallow features, thereby improving the learning of contextual semantic information. We demonstrate that the proposed network outperforms off-the-shelf methods in both nuclei segmentation and classification experiments, achieving state-of-the-art performance.
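The top-k uncertainty attention module is described only functionally; here is a minimal sketch under the assumption that uncertainty is scored by softmax entropy and that deep features are injected at the k most uncertain pixels (both assumptions, not the paper's exact mechanism):

```python
import torch

def topk_uncertainty_enhance(shallow, deep, logits, k=1024):
    """Hypothetical sketch: add deep-feature context to shallow features
    at the k pixels with the highest predictive entropy.

    shallow, deep: (B, D, H, W) feature maps at the same resolution.
    logits:        (B, C, H, W) class predictions used to score uncertainty.
    """
    B, D, H, W = shallow.shape
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # (B, H, W)
    flat_ent = entropy.flatten(1)                                # (B, H*W)
    k = min(k, flat_ent.shape[1])
    idx = flat_ent.topk(k, dim=1).indices                        # (B, k)
    out = shallow.flatten(2).clone()                             # (B, D, H*W)
    gather_idx = idx.unsqueeze(1).expand(-1, D, -1)              # (B, D, k)
    deep_flat = deep.flatten(2)
    # Add deep context only at the most uncertain locations
    out.scatter_add_(2, gather_idx, deep_flat.gather(2, gather_idx))
    return out.view(B, D, H, W)
```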
Keyword :
Classification (of information); Computer aided diagnosis; Deep learning; Image classification; Semantics; Semantic Segmentation
Cite:
GB/T 7714: Liu, Wenxi, Zhang, Qing, Li, Qi, et al. Contrastive and uncertainty-aware nuclei segmentation and classification [J]. Computers in Biology and Medicine, 2024, 178.
MLA: Liu, Wenxi, et al. "Contrastive and uncertainty-aware nuclei segmentation and classification." Computers in Biology and Medicine 178 (2024).
APA: Liu, Wenxi, Zhang, Qing, Li, Qi, Wang, Shu. Contrastive and uncertainty-aware nuclei segmentation and classification. Computers in Biology and Medicine, 2024, 178.
Abstract :
Semantic segmentation is one of the directions in image research. It aims to obtain the contours of objects of interest, facilitating subsequent engineering tasks such as measurement and feature selection. However, existing segmentation methods still lack precision at class edges, particularly in multi-class mixed regions. To this end, we present the Feature Enhancement Network (FE-Net), a novel approach that leverages edge labels and pixel-wise weights to enhance segmentation performance in complex backgrounds. Firstly, we propose a Smart Edge Head (SE-Head) to process shallow-level information from the backbone network. It is combined with the FCN-Head and SepASPP-Head, located at deeper layers, to form a transitional structure in which the loss weights gradually transition from edge labels to semantic labels; a mixed loss is also designed to support this structure. Additionally, we propose a pixel-wise weight evaluation method, a pixel-wise weight block, and a feature enhancement loss to improve training effectiveness in multi-class regions. FE-Net achieves significant performance improvements over baselines on the public datasets Pascal VOC2012, SBD, and ATR, with best mIoU enhancements of 15.19%, 1.42%, and 3.51%, respectively. Furthermore, experiments conducted on the Pole&Hole match dataset from our laboratory environment demonstrate the superior effectiveness of FE-Net in segmenting defined key pixels.
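The abstract describes a mixed loss whose weighting transitions from edge labels to semantic labels across heads. One hedged reading, with an interpolation factor `alpha` and an edge-based pixel-wise weight (both assumptions, not the paper's formulation), might look like:

```python
import torch.nn.functional as F

def mixed_edge_semantic_loss(logits, sem_target, edge_target, alpha):
    """Hypothetical sketch of a transitional mixed loss: shallow heads
    use a larger alpha (edge-weighted term dominates), deeper heads use
    a smaller alpha (plain semantic term dominates).

    logits:      (B, C, H, W) per-head prediction.
    sem_target:  (B, H, W) semantic class indices.
    edge_target: (B, H, W) binary edge map used as a pixel-wise weight.
    """
    sem_loss = F.cross_entropy(logits, sem_target, reduction='none')  # (B, H, W)
    # Up-weight pixels lying on class edges; weight 1 elsewhere
    edge_weight = 1.0 + edge_target.float()
    edge_loss = (sem_loss * edge_weight).mean()
    # alpha=1 -> fully edge-weighted; alpha=0 -> plain semantic loss
    return alpha * edge_loss + (1.0 - alpha) * sem_loss.mean()
```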
Keyword :
Edge label; Key pixels; Multi-class mixed region; Pixel-wise weight; Semantic segmentation
Cite:
GB/T 7714: Zhao, Zhangyan, Chen, Xiaoming, Cao, Jingjing, et al. FE-Net: Feature enhancement segmentation network [J]. NEURAL NETWORKS, 2024, 174.
MLA: Zhao, Zhangyan, et al. "FE-Net: Feature enhancement segmentation network." NEURAL NETWORKS 174 (2024).
APA: Zhao, Zhangyan, Chen, Xiaoming, Cao, Jingjing, Zhao, Qiangwei, Liu, Wenxi. FE-Net: Feature enhancement segmentation network. NEURAL NETWORKS, 2024, 174.
Abstract :
Ultra-high resolution image segmentation poses a formidable challenge for UAVs with limited computation resources. Moreover, with multiple deployed tasks (e.g., mapping, localization, and decision making), the demand for a memory-efficient model becomes more urgent. This letter delves into the intricate problem of achieving efficient and effective segmentation of ultra-high resolution UAV imagery while operating under stringent GPU memory limitations. To address this problem, we propose a GPU memory-efficient and effective framework. Specifically, we introduce a novel and efficient spatial-guided high-resolution query module, which enables our model to effectively infer pixel-wise segmentation results by querying nearest latent embeddings from low-resolution features. Additionally, we present a memory-based interaction scheme with linear complexity to rectify semantic bias beneath the high-resolution spatial guidance by associating cross-image contextual semantics. For evaluation, we perform comprehensive experiments over public benchmarks under both small and large GPU memory usage limitations. Notably, our model gains around a 3% advantage over the state of the art in mIoU while using comparable memory. Furthermore, we show that our model can be deployed on embedded platforms with less than 8 GB of memory, such as the Jetson TX2.
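A minimal sketch of the high-resolution query idea — each high-resolution pixel bilinearly querying latent embeddings from a low-resolution feature map — assuming `grid_sample`-style normalized coordinates; the paper's exact query scheme may differ:

```python
import torch.nn.functional as F

def query_lowres_embeddings(lowres_feats, coords):
    """Hypothetical sketch: sample per-pixel latent embeddings from a
    low-resolution feature map instead of decoding a full-resolution
    feature volume, keeping GPU memory roughly proportional to the
    number of queries.

    lowres_feats: (B, D, h, w) low-resolution features.
    coords:       (B, N, 2) query coordinates in [-1, 1]
                  (grid_sample convention: (x, y)).
    """
    grid = coords.unsqueeze(1)                      # (B, 1, N, 2)
    sampled = F.grid_sample(lowres_feats, grid,
                            mode='bilinear', align_corners=False)
    return sampled.squeeze(2).transpose(1, 2)       # (B, N, D)
```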
Keyword :
Aerial Systems: Perception and Autonomy; Autonomous aerial vehicles; Deep Learning for Visual Perception; Graphics processing units; Image resolution; Memory management; Semantics; Semantic segmentation; Spatial resolution
Cite:
GB/T 7714: Li, Qi, Cai, Jiaxin, Luo, Jiexin, et al. Memory-Constrained Semantic Segmentation for Ultra-High Resolution UAV Imagery [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (2): 1708-1715.
MLA: Li, Qi, et al. "Memory-Constrained Semantic Segmentation for Ultra-High Resolution UAV Imagery." IEEE ROBOTICS AND AUTOMATION LETTERS 9.2 (2024): 1708-1715.
APA: Li, Qi, Cai, Jiaxin, Luo, Jiexin, Yu, Yuanlong, Gu, Jason, Pan, Jia, et al. Memory-Constrained Semantic Segmentation for Ultra-High Resolution UAV Imagery. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (2), 1708-1715.
Abstract :
Ultra-high resolution image segmentation has attracted increasing interest in recent years due to its real-world applications. In this paper, we innovate on the widely used high-resolution image segmentation pipeline, in which an ultra-high resolution image is partitioned into regular patches for local segmentation and the local results are then merged into a high-resolution semantic mask. In particular, we introduce a novel locality-aware context fusion based segmentation model to process local patches, where the relevance between a local patch and its various contexts is jointly and complementarily utilized to handle semantic regions with large variations. Additionally, we present an alternating local enhancement module that restricts the negative impact of redundant information introduced from the contexts, and is thus able to correct the locality-aware features to produce refined results. Furthermore, in comprehensive experiments, we demonstrate that our model outperforms other state-of-the-art methods on public benchmarks and verify the effectiveness of the proposed modules. Our released code will be available at: https://github.com/liqiokkk/FCtL.
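The locality-aware context fusion can be pictured as cross-attention from local-patch tokens to context tokens; the sketch below uses a standard multi-head attention layer as a stand-in (an assumption for illustration, not the paper's exact design):

```python
import torch.nn as nn

class LocalContextFusionSketch(nn.Module):
    """Hypothetical sketch: tokens of a local patch attend to tokens of
    a larger, downsampled context crop so that the patch prediction can
    draw on surrounding semantics."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, local_tokens, context_tokens):
        # local_tokens:   (B, N, dim) tokens of the local patch (queries)
        # context_tokens: (B, M, dim) tokens of the surrounding context
        fused, _ = self.attn(local_tokens, context_tokens, context_tokens)
        return self.norm(local_tokens + fused)      # residual fusion
```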
Keyword :
Attention mechanism; Context-guided vision model; Geo-spatial image segmentation; Ultra-high resolution image segmentation
Cite:
GB/T 7714: Liu, Wenxi, Li, Qi, Lin, Xindai, et al. Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (11): 5030-5047.
MLA: Liu, Wenxi, et al. "Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement." INTERNATIONAL JOURNAL OF COMPUTER VISION 132.11 (2024): 5030-5047.
APA: Liu, Wenxi, Li, Qi, Lin, Xindai, Yang, Weixiang, He, Shengfeng, Yu, Yuanlong. Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (11), 5030-5047.
Abstract :
Domain Generalization (DG) aims to generalize a model trained on multiple source domains to an unseen target domain. The source domains always require precise annotations, which can be cumbersome or even infeasible to obtain in practice due to the vast amount of data involved. Web data, namely web-crawled images, offers access to large amounts of unlabeled images with rich style information, which can be leveraged to improve DG. From this perspective, we introduce a novel paradigm of DG, termed Semi-Supervised Domain Generalization (SSDG), to explore how labeled and unlabeled source domains can interact, and establish two settings: close-set and open-set SSDG. The close-set SSDG is based on existing public DG datasets, while the open-set SSDG, built on newly collected web-crawled datasets, presents a novel yet realistic challenge that pushes the limits of current technologies. A natural approach to SSDG is to transfer knowledge from labeled to unlabeled data via pseudo labeling, and to train the model on both labeled and pseudo-labeled data for generalization. Since there are conflicting goals between domain-oriented pseudo labeling and out-of-domain generalization, we develop a pseudo labeling phase and a generalization phase independently for SSDG. Unfortunately, due to the large domain gap, the pseudo labels provided in the pseudo labeling phase inevitably contain noise, which has a negative effect on the subsequent generalization phase. Therefore, to improve the quality of pseudo labels and further enhance generalizability, we propose a cyclic learning framework to encourage positive feedback between these two phases, utilizing an evolving intermediate domain that bridges the labeled and unlabeled domains in a curriculum learning manner. Extensive experiments are conducted to validate the effectiveness of our method. It is worth highlighting that web-crawled images can promote domain generalization, as demonstrated by the experimental results.
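One step of the pseudo labeling phase can be sketched as confidence-thresholded self-labeling on web-crawled images; the threshold value and filtering rule below are illustrative assumptions and do not capture the paper's full cyclic framework:

```python
import torch

def confident_pseudo_labels(model, unlabeled_batch, threshold=0.9):
    """Hypothetical sketch: keep only predictions on unlabeled images
    whose softmax confidence exceeds a threshold, then return them as
    training targets for the next generalization phase.

    model:           classifier returning (B, C) logits.
    unlabeled_batch: (B, ...) tensor of unlabeled images.
    """
    model.eval()
    with torch.no_grad():
        probs = model(unlabeled_batch).softmax(dim=1)
    conf, pseudo = probs.max(dim=1)
    keep = conf >= threshold   # filter labels made noisy by the domain gap
    return unlabeled_batch[keep], pseudo[keep]
```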
Keyword :
Domain generalization; Semi-supervised learning; Transfer learning; Unsupervised domain adaptation
Cite:
GB/T 7714: Lin, Luojun, Xie, Han, Sun, Zhishu, et al. Semi-supervised domain generalization with evolving intermediate domain [J]. PATTERN RECOGNITION, 2024, 149.
MLA: Lin, Luojun, et al. "Semi-supervised domain generalization with evolving intermediate domain." PATTERN RECOGNITION 149 (2024).
APA: Lin, Luojun, Xie, Han, Sun, Zhishu, Chen, Weijie, Liu, Wenxi, Yu, Yuanlong, et al. Semi-supervised domain generalization with evolving intermediate domain. PATTERN RECOGNITION, 2024, 149.