Query:
Scholar name: Yang Wenjie
Abstract:
A key challenge in reinforcement learning is how to guide agents to explore sparse-reward environments efficiently. To overcome this challenge, state-of-the-art methods introduce additional intrinsic rewards based on state-related information, such as the novelty of states. Unfortunately, these methods frequently fail in procedurally-generated tasks, where a different environment is generated in each episode, so the agent is unlikely to visit the same state more than once. Recently, some exploration methods designed specifically for procedurally-generated tasks have been proposed. However, they still consider only state-related information, which leads to relatively inefficient exploration. In this work, we propose a novel exploration method that utilizes cross-episode policy-related information and intra-episode state-related information to jointly encourage exploration in procedurally-generated tasks. In terms of policy-related information, we first use an imitator-based unbalanced policy diversity to measure the difference between the agent's current policy and its previous policies, and then encourage the agent to maximize this difference. In terms of state-related information, we encourage the agent to maximize the state diversity within an episode, thereby visiting as many different states as possible. We show that our method significantly improves sample efficiency over state-of-the-art methods on three challenging benchmarks: MiniGrid, MiniWorld, and the sparse-reward version of Procgen.
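To make the joint diversity bonus concrete, the following is a minimal PyTorch-style sketch of how the two terms could be combined into a single intrinsic reward. It is an illustration, not the authors' implementation: the imitator network, the embedding shapes, and the weighting coefficient beta are all assumptions.

import torch
import torch.nn.functional as F

def intrinsic_reward(cur_logits, imitator_logits, state_emb, episode_embs, beta=0.5):
    # Policy-diversity term: KL divergence of the current policy from the
    # imitator's prediction. The imitator is assumed to be trained on past
    # policies, so a large KL means the current policy differs from previous
    # ones, which is rewarded.
    policy_div = F.kl_div(
        F.log_softmax(imitator_logits, dim=-1),  # imitator's log-probs
        F.softmax(cur_logits, dim=-1),           # current policy's probs
        reduction="batchmean",
    )
    # State-diversity term: distance from the current state embedding to its
    # nearest neighbor among embeddings already visited in this episode.
    if len(episode_embs) == 0:
        state_div = torch.tensor(1.0)
    else:
        dists = torch.cdist(state_emb.unsqueeze(0), torch.stack(episode_embs))
        state_div = dists.min()
    return policy_div + beta * state_div

The episode buffer is cleared at each reset, so the nearest-neighbor distance matches the intra-episode scope of the state-diversity term, while the imitator carries information across episodes.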
Keyword:
Benchmark testing; Current measurement; Cybernetics; Deep reinforcement learning; Diversity reception; exploration; Faces; Optimization; procedurally-generated task; Q-learning; sparse reward; Three-dimensional displays; Training
Cite:
GB/T 7714: Xu, Pei, Chen, Hao, Yang, Wenjie, et al. Exploration via Embracing Diversity in Reinforcement Learning for Sparse-Reward Procedurally-Generated Tasks [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2025, 55(9): 5776-5789.
MLA: Xu, Pei, et al. "Exploration via Embracing Diversity in Reinforcement Learning for Sparse-Reward Procedurally-Generated Tasks." IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS 55.9 (2025): 5776-5789.
APA: Xu, Pei, Chen, Hao, Yang, Wenjie, Huang, Kaiqi. Exploration via Embracing Diversity in Reinforcement Learning for Sparse-Reward Procedurally-Generated Tasks. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2025, 55(9), 5776-5789.
Abstract:
In medical intelligence applications, labeling medical data is crucial and expensive, so it is urgent to explore label-efficient ways to train models. Semi-supervised techniques for medical image segmentation have demonstrated potential, effectively training models using scarce labeled data alongside a wealth of unlabeled data. Semi-supervised medical image segmentation is therefore a key issue in engineering applications of medical intelligence. Consistency constraints based on prototype alignment provide an intuitively sensible way to discover valuable insights from unlabeled data that can improve segmentation performance. In this work, we propose a Dual Prototypes Contrastive Network that improves semi-supervised medical segmentation accuracy by imposing image-level global prototype and pixel-level local prototype constraints. First, we introduce a Background-Separation Global Prototype Contrastive Learning technique that utilizes the natural mutual exclusivity of foreground and background to enlarge inter-class distances and encourage the segmentation network to produce segmentation results that are more complete and free of background regions. Second, we design a Cross-Consistent Local Prototype Contrastive Learning technique that extends the perturbation consistency of the two networks to the prototypes' localized responses to the feature map, thereby shaping a more stable intra-class prototype space and producing accurate and robust pixel-level predictions. Finally, we comprehensively evaluate our method on mainstream semi-supervised medical image segmentation benchmarks and settings, and experimental results show that our proposed method outperforms current state-of-the-art methods. Specifically, our method achieves a Dice Coefficient score of 91.8 on the Automatic Cardiac Diagnosis Challenge dataset using only 10% labeled data for training, 1.1% ahead of the second-best method. Code is available at https://github.com/yuelily2024/DPC.
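As an illustration of the background-separation global prototype idea, here is a minimal PyTorch sketch. It is not the paper's implementation: the two perturbed views, the masked average pooling, and the temperature tau are illustrative assumptions.

import torch
import torch.nn.functional as F

def masked_prototype(feats, mask):
    # feats: (B, C, H, W) features; mask: (B, 1, H, W) soft mask in [0, 1].
    # Masked average pooling yields one prototype vector per image.
    proto = (feats * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)
    return F.normalize(proto, dim=1)  # (B, C), L2-normalized

def global_prototype_contrast(feats_a, feats_b, fg_mask, tau=0.1):
    # Foreground prototypes from two perturbed views; background from one view.
    fg_a = masked_prototype(feats_a, fg_mask)
    fg_b = masked_prototype(feats_b, fg_mask)
    bg = masked_prototype(feats_a, 1.0 - fg_mask)
    # InfoNCE-style loss: the matching foreground view is the positive
    # (index 0), and background prototypes act as negatives, exploiting
    # the mutual exclusivity of foreground and background.
    pos = (fg_a * fg_b).sum(dim=1, keepdim=True) / tau  # (B, 1)
    neg = fg_a @ bg.t() / tau                           # (B, B)
    logits = torch.cat([pos, neg], dim=1)
    labels = torch.zeros(fg_a.size(0), dtype=torch.long, device=feats_a.device)
    return F.cross_entropy(logits, labels)

Pushing foreground prototypes away from all background prototypes in the batch is one way to realize the "enlarge inter-class distances" goal described above.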
Keyword:
Global prototype contrastive learning; Label-efficient learning for medical application; Local prototype contrastive learning; Medical image segmentation; Semi-supervised segmentation
Cite:
GB/T 7714: Yue, Tianai, Xu, Rongtao, Wu, Jingqian, et al. Dual prototypes contrastive learning based semi-supervised segmentation method for intelligent medical applications [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 154.
MLA: Yue, Tianai, et al. "Dual prototypes contrastive learning based semi-supervised segmentation method for intelligent medical applications." ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE 154 (2025).
APA: Yue, Tianai, Xu, Rongtao, Wu, Jingqian, Yang, Wenjie, Du, Shide, Wang, Changwei. Dual prototypes contrastive learning based semi-supervised segmentation method for intelligent medical applications. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 154.
Abstract:
Locating diverse body parts and perceiving part visibility are essential to person re-identification (re-ID). Most existing methods employ an extra model, e.g., pose estimation or human parsing, to locate parts, or generate pseudo labels to train a part locator incorporated with the re-ID model. In this paper, we aim at learning diverse horizontal stripes with foreground refinement to pursue pixel-level part alignment using only person identity labels. Specifically, we propose a Gumbel-Softmax based Differentiable Categorical Region (DCR) learning method and make two contributions. (1) A stripe-wise regularization. Given an image, the part locator produces part probability maps. The continuous values in the probability maps are discretized into zero or the arg max value within each horizontal stripe by the Gumbel-Softmax. Gumbel-Softmax allows us to use the discrete arg max value for part diversity regularization in the forward pass while still estimating gradients in the backward pass. (2) A self-refinement method to suppress the background noise in the stripes. We employ a lightweight foreground perception head to produce a foreground probability map supervised only by person identity labels. Benefiting from the discretization of the categorical stripes, we can conveniently obtain the part pseudo label by element-wise multiplying the categorical stripes with the foreground probability map. Finally, DCR can locate body parts at the pixel level and extract part-aligned representations. Experimental results on both holistic and occluded re-ID datasets confirm that our approach significantly improves the learned representation and that the achieved performance is on par with state-of-the-art methods. The code is available at https://github.com/deepalchemist/differentiable-categorical-region
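The straight-through behavior of Gumbel-Softmax is the core trick here, and PyTorch exposes it directly. The sketch below is illustrative only: tensor shapes, the foreground threshold, and the per-pixel (rather than stripe-grouped) sampling are assumptions, not the released code.

import torch
import torch.nn.functional as F

def categorical_stripes(part_logits, tau=1.0):
    # part_logits: (B, K, H, W), per-pixel logits over K part categories.
    # hard=True returns one-hot samples in the forward pass, while the
    # backward pass uses the soft Gumbel-Softmax relaxation, so the part
    # locator stays trainable despite the discrete output.
    return F.gumbel_softmax(part_logits, tau=tau, hard=True, dim=1)

def part_pseudo_labels(part_logits, fg_prob, thresh=0.5):
    # Keep discrete part assignments only where the (hypothetical)
    # foreground perception head is confident: element-wise product of
    # the one-hot stripes with a thresholded foreground map.
    stripes = categorical_stripes(part_logits)      # (B, K, H, W), one-hot
    return stripes * (fg_prob > thresh).float()     # fg_prob: (B, 1, H, W)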
Keyword:
Feature alignment; Feature learning; Person re-identification
Cite:
GB/T 7714: Yang, Wenjie, Xu, Pei. Learning differentiable categorical regions with Gumbel-Softmax for person re-identification [J]. NEUROCOMPUTING, 2024, 613.
MLA: Yang, Wenjie, and Pei Xu. "Learning differentiable categorical regions with Gumbel-Softmax for person re-identification." NEUROCOMPUTING 613 (2024).
APA: Yang, Wenjie, Xu, Pei. Learning differentiable categorical regions with Gumbel-Softmax for person re-identification. NEUROCOMPUTING, 2024, 613.
Abstract:
Unsupervised domain adaptation (UDA) is critical for remote sensing object detection in real applications, aiming to address the significant performance degradation caused by the domain gap between the source and target domains. It achieves cross-domain alignment by leveraging unlabeled target-domain data, thus avoiding expensive annotation costs. However, existing works mainly address convolutional neural network (CNN)-based object detectors, which rely on complex adversarial learning architectures and fail to accurately align features in remote sensing images with sparsely allocated objects and inevitable background noise. Compared to CNN-based methods, the detection transformer (DETR) largely simplifies the object detection pipeline and shows great potential through its intrinsic capability of modeling global relations between arbitrary pixels. On this basis, we propose the first strong DETR-based baseline, Remote Sensing Teacher, for UDA in remote sensing object detection. Specifically, Remote Sensing Teacher introduces an innovative learnable frequency-enhanced feature alignment (LFA) module. Within this module, we first transform the features into frequency space to simplify the attention solver and effectively capture domain-specific information. The module then significantly enhances the global feature representations of sparsely allocated objects using a lightweight attention mechanism, and incorporates learnable filters with a gated mechanism, enabling selective alignment of features in noisy backgrounds. In addition, Remote Sensing Teacher employs a self-adaptive pseudo-label assigner (SPA) that automatically adjusts the class-wise confidence threshold according to the model's learning status, enabling the generation of high-quality pseudo-labels in scenarios with a long-tailed distribution. Leveraging these pseudo-labels further mitigates the domain bias of the detector by establishing alignment at the label level. Extensive experimental results demonstrate the superior performance and generalization capability of our proposed Remote Sensing Teacher in multiple remote sensing adaptation scenarios. The code is released at https://github.com/h751410234/RemoteSensingTeacher.
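To sketch the frequency-domain filtering idea behind the LFA module, here is a rough PyTorch illustration, not the released code: the learnable complex filter, the gating head, and the fixed spatial size are assumptions for the sake of a self-contained example.

import torch
import torch.nn as nn

class FrequencyFilter(nn.Module):
    # Toy learnable frequency-domain filtering with selective gating:
    # transform features to frequency space, reweight the spectrum with a
    # learnable complex filter, then gate the result against the input.
    def __init__(self, channels, h, w):
        super().__init__()
        # Learnable complex filter stored as (real, imag) pairs over the
        # half-spectrum produced by rfft2.
        self.filt = nn.Parameter(torch.randn(channels, h, w // 2 + 1, 2) * 0.02)
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):  # x: (B, C, H, W), with H == h and W == w
        spec = torch.fft.rfft2(x, norm="ortho")         # (B, C, H, W//2+1)
        spec = spec * torch.view_as_complex(self.filt)  # element-wise filtering
        y = torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")
        g = self.gate(x)                                # per-pixel gate in (0, 1)
        return g * y + (1 - g) * x                      # selective alignment

The gate lets the module pass filtered features only where they help, which is one plausible reading of "selective alignment of features in noisy backgrounds".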
Keyword:
Adaptation models; Detectors; Feature extraction; Object detection; Remote sensing; remote sensing imagery; Training; Transformers; unsupervised domain adaptation (UDA)
Cite:
GB/T 7714: Han, Jianhong, Yang, Wenjie, Wang, Yupei, et al. Remote Sensing Teacher: Cross-Domain Detection Transformer With Learnable Frequency-Enhanced Feature Alignment in Remote Sensing Imagery [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.
MLA: Han, Jianhong, et al. "Remote Sensing Teacher: Cross-Domain Detection Transformer With Learnable Frequency-Enhanced Feature Alignment in Remote Sensing Imagery." IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 62 (2024).
APA: Han, Jianhong, Yang, Wenjie, Wang, Yupei, Chen, Liang, Luo, Zhaoyi. Remote Sensing Teacher: Cross-Domain Detection Transformer With Learnable Frequency-Enhanced Feature Alignment in Remote Sensing Imagery. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.