Query:
Scholar name: Liu Wenxi (刘文犀)
Abstract :
U-Net is a classic architecture for semantic segmentation. However, it has several limitations, such as difficulty in capturing complex image details due to its simple U-shaped structure, long convergence time arising from fixed network parameters, and suboptimal efficacy in decoding and restoring multi-scale information. To address these issues, we propose a Multiple U-shaped network (Multi-UNet), based on the premise that constructing an appropriate U-shaped structure can achieve better segmentation performance. Firstly, inspired by the concept of connecting multiple similar blocks, our Multi-UNet consists of multiple U-block modules, with each succeeding module directly connected to the previous one to facilitate data transmission between different U structures. We refer to the original bridge connections of U-Net as Intra-U connections and introduce a new type of connection called Inter-U connections. These Inter-U connections aim to retain as much detailed information as possible, enabling effective detection in complex images. Secondly, while maintaining Mean Intersection over Union (Mean-IoU), the up-sampling of each U applies uniformly small channel values to reduce the number of model parameters. Thirdly, a Spatial-Channel Parallel Attention Fusion (SCPAF) module is designed at the initial layer of every subsampling module of the U-block architecture. It enhances feature extraction and alleviates the computational overhead associated with data transmission. Finally, we replace the final up-sampling module with an Atrous Spatial Pyramid Pooling Head (ASPPHead) to accomplish seamless multi-scale feature extraction. Experiments comparing Multi-UNet with advanced models on three public datasets show that it is superior in both universality and accuracy.
Keyword :
Multiple U-shaped network; Semantic segmentation; U-net
Cite:
GB/T 7714: Zhao, Qiangwei, Cao, Jingjing, Ge, Junjie, et al. Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation [J]. Knowledge-Based Systems, 2025, 309.
MLA: Zhao, Qiangwei, et al. "Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation." Knowledge-Based Systems 309 (2025).
APA: Zhao, Qiangwei, Cao, Jingjing, Ge, Junjie, Zhu, Qi, Chen, Xiaoming, & Liu, Wenxi. Multi-UNet: An effective Multi-U convolutional networks for semantic segmentation. Knowledge-Based Systems, 2025, 309.
Abstract :
Diagnostic pathology, historically dependent on visual scrutiny by experts, is essential for disease detection. Advances in digital pathology and developments in computer vision technology have led to the application of artificial intelligence (AI) in this field. Despite these advancements, the variability in pathologists' subjective interpretations of diagnostic criteria can lead to inconsistent outcomes. To meet the need for precision in cancer therapies, there is an increasing demand for accurate pathological diagnoses. Consequently, traditional diagnostic pathology is evolving towards "next-generation diagnostic pathology", prioritizing the development of a multi-dimensional, intelligent diagnostic approach. Using nonlinear optical effects arising from the interaction of light with biological tissues, multiphoton microscopy (MPM) enables high-resolution label-free imaging of multiple intrinsic components across various human pathological tissues. AI-empowered MPM further improves the accuracy and efficiency of diagnosis, holding promise for providing auxiliary pathology diagnostic methods based on multiphoton diagnostic criteria. In this review, we systematically outline the applications of MPM in pathological diagnosis across various human diseases, and summarize common multiphoton diagnostic features. Moreover, we examine the significant role of AI in enhancing multiphoton pathological diagnosis, including aspects such as image preprocessing, refined differential diagnosis, and the prognostication of outcomes. We also discuss the challenges and perspectives faced by the integration of MPM and AI, encompassing equipment, datasets, analytical models, and integration into the existing clinical pathways. Finally, the review explores the synergy between AI and label-free MPM to forge novel diagnostic frameworks, aiming to accelerate the adoption and implementation of intelligent multiphoton pathology systems in clinical settings. © The Author(s) 2024.
Cite:
GB/T 7714: Wang, S., Pan, J., Zhang, X., et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy [J]. Light: Science and Applications, 2024, 13(1).
MLA: Wang, S., et al. "Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy." Light: Science and Applications 13.1 (2024).
APA: Wang, S., Pan, J., Zhang, X., Li, Y., Liu, W., Lin, R., et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy. Light: Science and Applications, 2024, 13(1).
Abstract :
Nighttime semantic segmentation is an important but challenging research problem for autonomous driving. The major challenges lie in small objects or regions in under-/over-exposed areas, as well as motion blur caused by cameras deployed on moving vehicles. To resolve this, we propose a novel hard-class-aware module that bridges the main network for full-class segmentation and the hard-class network for segmenting the aforementioned hard-class objects. Specifically, it exploits the shared focus of hard-class objects from the dual-stream network, enabling the contextual information flow to guide the model to concentrate on the pixels that are hard to classify. Finally, the estimated hard-class segmentation results are utilized to infer the final results via an adaptive probabilistic fusion refinement scheme. Moreover, to overcome over-smoothing and noise caused by extreme exposures, our model is modulated by a carefully crafted pretext task of constructing an exposure-aware semantic gradient map, which guides the model to faithfully perceive the structural and semantic information of hard-class objects while mitigating the negative impact of noise and uneven exposure. In experiments, we demonstrate that our unique network design leads to superior segmentation performance over existing methods, featuring a strong ability to perceive hard-class objects under adverse conditions. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Keyword :
Classification (of information); Semantics; Semantic segmentation
Cite:
GB/T 7714: Liu, Wenxi, Cai, Jiaxin, Li, Qi, et al. Learning Nighttime Semantic Segmentation the Hard Way [J]. ACM Transactions on Multimedia Computing, Communications and Applications, 2024, 20(7).
MLA: Liu, Wenxi, et al. "Learning Nighttime Semantic Segmentation the Hard Way." ACM Transactions on Multimedia Computing, Communications and Applications 20.7 (2024).
APA: Liu, Wenxi, Cai, Jiaxin, Li, Qi, Liao, Chenyang, Cao, Jingjing, He, Shengfeng, et al. Learning Nighttime Semantic Segmentation the Hard Way. ACM Transactions on Multimedia Computing, Communications and Applications, 2024, 20(7).
Abstract :
Ultra-high resolution image segmentation has attracted increasing interest in recent years due to its realistic applications. In this paper, we innovate on the widely used high-resolution image segmentation pipeline, in which an ultra-high resolution image is partitioned into regular patches for local segmentation and the local results are then merged into a high-resolution semantic mask. In particular, we introduce a novel locality-aware context fusion based segmentation model to process local patches, where the relevance between a local patch and its various contexts is jointly and complementarily utilized to handle semantic regions with large variations. Additionally, we present an alternating local enhancement module that restricts the negative impact of redundant information introduced from the contexts, and is thus endowed with the ability to refine the locality-aware features and produce better results. Furthermore, in comprehensive experiments, we demonstrate that our model outperforms other state-of-the-art methods on public benchmarks and verify the effectiveness of the proposed modules. Our released code is available at: https://github.com/liqiokkk/FCtL. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
Keyword :
Attention mechanism; Context-guided vision model; Geo-spatial image segmentation; Ultra-high resolution image segmentation
Cite:
GB/T 7714: Liu, W., Li, Q., Lin, X., et al. Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement [J]. International Journal of Computer Vision, 2024, 132(11): 5030-5047.
MLA: Liu, W., et al. "Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement." International Journal of Computer Vision 132.11 (2024): 5030-5047.
APA: Liu, W., Li, Q., Lin, X., Yang, W., He, S., & Yu, Y. Ultra-High Resolution Image Segmentation via Locality-Aware Context Fusion and Alternating Local Enhancement. International Journal of Computer Vision, 2024, 132(11), 5030-5047.
Abstract :
HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map, formed by the road layout and vehicle occupancy in the bird's-eye view, given only a front-view monocular image. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen the view transformation and scene understanding. In addition, we apply multi-scale FTVP modules to propagate the rich spatial information of low-level features to mitigate spatial deviation of the predicted object location. Experiments on public benchmarks show that our method handles various tasks, including road layout estimation, vehicle occupancy estimation, and multi-class semantic estimation, at a performance level comparable to the state of the art, while maintaining superior efficiency.
Keyword :
Autonomous driving; BEV perception; Estimation; Feature extraction; Layout; Roads; Segmentation; Task analysis; Three-dimensional displays; Transformers
Cite:
GB/T 7714: Liu, W., Li, Q., Yang, W., et al. Monocular BEV Perception of Road Scenes Via Front-to-Top View Projection [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(9): 1-17.
MLA: Liu, W., et al. "Monocular BEV Perception of Road Scenes Via Front-to-Top View Projection." IEEE Transactions on Pattern Analysis and Machine Intelligence 46.9 (2024): 1-17.
APA: Liu, W., Li, Q., Yang, W., Cai, J., Yu, Y., Ma, Y., et al. Monocular BEV Perception of Road Scenes Via Front-to-Top View Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(9), 1-17.
Abstract :
The problem of video demoireing is a new challenge in video restoration. Unlike image demoireing, which involves removing static and uniform patterns, video demoireing requires tackling dynamic and varied moire patterns while maintaining video details, colors, and temporal consistency. It is particularly challenging to model moire patterns for videos with camera or object motions, where separating moire from the original video content across frames is extremely difficult. Nonetheless, we observe that the spatial distribution of moire patterns is often sparse on each frame, and their long-range temporal correlation is not significant. To fully leverage this phenomenon, a sparsity-constrained spatial self-attention scheme is proposed to concentrate on removing sparse moire efficiently from each frame without being distracted by dynamic video content. The frame-wise spatial features are then correlated and aggregated via a local temporal cross-frame-attention module to produce temporally consistent, high-quality moire-free videos. The above decoupled spatial and temporal transformers constitute the Spatio-Temporal Decomposition Network, dubbed STD-Net. For evaluation, we present a large-scale video demoireing benchmark featuring various real-life scenes, camera motions, and object motions. We demonstrate that our proposed model can effectively and efficiently achieve superior performance on video demoireing and single image demoireing tasks. The proposed dataset is released at https://github.com/FZU-N/LVDM.
Keyword :
Image restoration; Sparse transformer; Spatio-temporal network; Video demoireing; Video restoration
Cite:
GB/T 7714: Niu, Yuzhen, Xu, Rui, Lin, Zhihua, et al. STD-Net: Spatio-Temporal Decomposition Network for Video Demoiring With Sparse Transformers [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(9): 8562-8575.
MLA: Niu, Yuzhen, et al. "STD-Net: Spatio-Temporal Decomposition Network for Video Demoiring With Sparse Transformers." IEEE Transactions on Circuits and Systems for Video Technology 34.9 (2024): 8562-8575.
APA: Niu, Yuzhen, Xu, Rui, Lin, Zhihua, & Liu, Wenxi. STD-Net: Spatio-Temporal Decomposition Network for Video Demoiring With Sparse Transformers. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(9), 8562-8575.
Cite:
GB/T 7714: Shu Wang, Junlin Pan, Xiao Zhang, et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy [J]. Light: Science and Applications (English edition), 2024, 13(12): 2887-2911.
MLA: Shu Wang, et al. "Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy." Light: Science and Applications (English edition) 13.12 (2024): 2887-2911.
APA: Shu Wang, Junlin Pan, Xiao Zhang, Yueying Li, Wenxi Liu, Ruolan Lin, et al. Towards next-generation diagnostic pathology: AI-empowered label-free multiphoton microscopy. Light: Science and Applications (English edition), 2024, 13(12), 2887-2911.
Abstract :
Generic multiple object tracking aims to recover the trajectories of generic moving objects of the same category. This task relies on the ability to effectively extract representative features of the target objects. To this end, we propose a novel prototype learning based model, PLGMOT, that can explore the template features of an exemplar object and extend to more objects to acquire their prototype. The prototype features can be continuously updated during the video, in favor of generalization to all the target objects with different appearances. More importantly, on the public benchmark GMOT-40, our method achieves a more than 14% advantage over state-of-the-art methods, with less than 0.5% of the training data, which is not even completely annotated in the form of bounding boxes, thanks to our proposed point-to-box label refinement training algorithm and hierarchical motion-aware association algorithm.
Keyword :
Deep learning; Generic multiple object tracking; Multiple object tracking; Object detection; Prototype learning
Cite:
GB/T 7714: Liu, Wenxi, Lin, Yuhao, Li, Qi, et al. Prototype learning based generic multiple object tracking via point-to-box supervision [J]. Pattern Recognition, 2024, 154.
MLA: Liu, Wenxi, et al. "Prototype learning based generic multiple object tracking via point-to-box supervision." Pattern Recognition 154 (2024).
APA: Liu, Wenxi, Lin, Yuhao, Li, Qi, She, Yinhua, Yu, Yuanlong, Pan, Jia, et al. Prototype learning based generic multiple object tracking via point-to-box supervision. Pattern Recognition, 2024, 154.
Abstract :
Domain Generalization (DG) aims to generalize a model trained on multiple source domains to an unseen target domain. The source domains always require precise annotations, which can be cumbersome or even infeasible to obtain in practice due to the vast amount of data involved. Web data, namely web-crawled images, offers an opportunity to access large amounts of unlabeled images with rich style information, which can be leveraged to improve DG. From this perspective, we introduce a novel paradigm of DG, termed Semi-Supervised Domain Generalization (SSDG), to explore how the labeled and unlabeled source domains can interact, and establish two settings, including close-set and open-set SSDG. The close-set SSDG is based on existing public DG datasets, while the open-set SSDG, built on newly-collected web-crawled datasets, presents a novel yet realistic challenge that pushes the limits of current technologies. A natural approach to SSDG is to transfer knowledge from labeled data to unlabeled data via pseudo labeling, and train the model on both labeled and pseudo-labeled data for generalization. Since there are conflicting goals between domain-oriented pseudo labeling and out-of-domain generalization, we develop a pseudo labeling phase and a generalization phase independently for SSDG. Unfortunately, due to the large domain gap, the pseudo labels provided in the pseudo labeling phase inevitably contain noise, which has a negative effect on the subsequent generalization phase. Therefore, to improve the quality of pseudo labels and further enhance generalizability, we propose a cyclic learning framework to encourage a positive feedback between these two phases, utilizing an evolving intermediate domain that bridges the labeled and unlabeled domains in a curriculum learning manner. Extensive experiments are conducted to validate the effectiveness of our method. It is worth highlighting that web-crawled images can promote domain generalization, as demonstrated by the experimental results.
Keyword :
Domain generalization; Semi-supervised learning; Transfer learning; Unsupervised domain adaptation
Cite:
GB/T 7714: Lin, Luojun, Xie, Han, Sun, Zhishu, et al. Semi-supervised domain generalization with evolving intermediate domain [J]. Pattern Recognition, 2024, 149.
MLA: Lin, Luojun, et al. "Semi-supervised domain generalization with evolving intermediate domain." Pattern Recognition 149 (2024).
APA: Lin, Luojun, Xie, Han, Sun, Zhishu, Chen, Weijie, Liu, Wenxi, Yu, Yuanlong, et al. Semi-supervised domain generalization with evolving intermediate domain. Pattern Recognition, 2024, 149.
Abstract :
Semantic segmentation is one of the directions in image research. It aims to obtain the contours of objects of interest, facilitating subsequent engineering tasks such as measurement and feature selection. However, existing segmentation methods still lack precision at class edges, particularly in multi-class mixed regions. To this end, we present the Feature Enhancement Network (FE-Net), a novel approach that leverages edge labels and pixel-wise weights to enhance segmentation performance in complex backgrounds. Firstly, we propose a Smart Edge Head (SE-Head) to process shallow-level information from the backbone network. It is combined with the FCN-Head and SepASPP-Head, located at deeper layers, to form a transitional structure where the loss weights gradually transition from edge labels to semantic labels; a mixed loss is also designed to support this structure. Additionally, we propose a pixel-wise weight evaluation method, a pixel-wise weight block, and a feature enhancement loss to improve training effectiveness in multi-class regions. FE-Net achieves significant performance improvements over baselines on the public datasets Pascal VOC2012, SBD, and ATR, with best mIoU enhancements of 15.19%, 1.42%, and 3.51%, respectively. Furthermore, experiments conducted on the Pole&Hole match dataset from our laboratory environment demonstrate the superior effectiveness of FE-Net in segmenting defined key pixels.
Keyword :
Edge label; Key pixels; Multi-class mixed region; Pixel-wise weight; Semantic segmentation
Cite:
GB/T 7714: Zhao, Zhangyan, Chen, Xiaoming, Cao, Jingjing, et al. FE-Net: Feature enhancement segmentation network [J]. Neural Networks, 2024, 174.
MLA: Zhao, Zhangyan, et al. "FE-Net: Feature enhancement segmentation network." Neural Networks 174 (2024).
APA: Zhao, Zhangyan, Chen, Xiaoming, Cao, Jingjing, Zhao, Qiangwei, & Liu, Wenxi. FE-Net: Feature enhancement segmentation network. Neural Networks, 2024, 174.