Query: Scholar name: Wang Shiping (王石平)
Abstract :
Due to the heterogeneity gap in multi-view data, researchers have been attempting to learn a co-latent representation from these data to bridge this gap. However, multi-view representation learning still confronts two challenges: (1) it is hard to simultaneously consider the performance of downstream tasks and the interpretability and transparency of the network; (2) existing methods fail to learn representations that accurately describe the class boundaries of downstream tasks. To overcome these limitations, we propose an interpretable representation learning framework, named the interpretable multi-view proximity representation learning network. On the one hand, the proposed network is customized by an explicitly designed optimization objective that enables it to learn semantic co-latent representations while maintaining interpretability and transparency at the design level. On the other hand, the designed multi-view proximity representation learning objective function encourages the learned co-latent representations to form intuitive class boundaries by increasing the inter-class distance and decreasing the intra-class distance. Driven by a flexible downstream task loss, the learned co-latent representation can adapt to various multi-view scenarios and has been shown to be effective in experiments. As a result, this work provides a feasible solution for a generalized multi-view representation learning framework and is expected to accelerate research and exploration in this field.
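The core of the proximity objective described above is to pull same-class samples together and push different classes apart in the co-latent space. Below is a minimal sketch of one way such an objective could be written, assuming PyTorch; the function name proximity_loss, the centroid-based formulation, and the margin hinge are illustrative choices, not the authors' exact loss.

```python
import torch

def proximity_loss(z, y, margin=1.0):
    """Toy proximity objective: pull samples toward their class centroid
    (intra-class compactness) and push centroids apart (inter-class separation)."""
    classes = y.unique()
    centroids = torch.stack([z[y == c].mean(dim=0) for c in classes])  # (C, d)
    # Intra-class term: mean squared distance of each sample to its class centroid.
    intra = torch.stack([((z[y == c] - centroids[i]) ** 2).sum(dim=1).mean()
                         for i, c in enumerate(classes)]).mean()
    # Inter-class term: hinge on pairwise centroid distances so they exceed a margin.
    dists = torch.cdist(centroids, centroids)                          # (C, C)
    off_diag = dists[~torch.eye(len(classes), dtype=torch.bool)]
    inter = torch.clamp(margin - off_diag, min=0).mean()
    return intra + inter

# Usage on random data with 3 classes (placeholder embeddings and labels):
z = torch.randn(60, 16, requires_grad=True)
y = torch.randint(0, 3, (60,))
loss = proximity_loss(z, y)
loss.backward()
```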
Keyword :
Deep learning; Multi-view learning; Proximity learning; Representation learning
Cite:
GB/T 7714: Lan, S., Fang, Z., Du, S., et al. IMPRL-Net: interpretable multi-view proximity representation learning network [J]. Neural Computing and Applications, 2024, 36(24): 15027-15044.
MLA: Lan, S., et al. "IMPRL-Net: interpretable multi-view proximity representation learning network." Neural Computing and Applications 36.24 (2024): 15027-15044.
APA: Lan, S., Fang, Z., Du, S., Cai, Z., & Wang, S. IMPRL-Net: interpretable multi-view proximity representation learning network. Neural Computing and Applications, 2024, 36(24), 15027-15044.
Abstract :
Most existing correspondence pruning methods concentrate on gathering as much context information as possible while neglecting effective ways to utilize such information. To tackle this dilemma, in this paper we propose the Graph Context Transformation Network (GCT-Net), which enhances context information to conduct consensus guidance for progressive correspondence pruning. Specifically, we design the Graph Context Enhance Transformer, which first generates the graph network and then transforms it into multi-branch graph contexts. Moreover, it employs self-attention and cross-attention to magnify the characteristics of each graph context, emphasizing the unique as well as the shared essential information. To further apply the recalibrated graph contexts to the global domain, we propose the Graph Context Guidance Transformer. This module adopts a confidence-based sampling strategy to temporarily screen high-confidence vertices for guiding accurate classification by searching for global consensus between the screened vertices and the remaining ones. Extensive experimental results on outlier removal and relative pose estimation clearly demonstrate the superior performance of GCT-Net compared to state-of-the-art methods across outdoor and indoor datasets.
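To illustrate the confidence-based screening and consensus-guidance idea described above, here is a hedged PyTorch sketch: the k most confident correspondences act as keys and values in a cross-attention over all correspondences. The function name, the use of nn.MultiheadAttention, and the sampling rule are assumptions for illustration, not GCT-Net's actual modules.

```python
import torch
import torch.nn as nn

def consensus_guidance(features, confidences, k=128, num_heads=4):
    """Toy confidence-based screening + cross-attention guidance: select the k
    most confident correspondences and let all correspondences attend to them
    to gather global consensus."""
    n, d = features.shape
    k = min(k, n)
    top_idx = torch.topk(confidences, k).indices        # high-confidence vertices
    anchors = features[top_idx].unsqueeze(0)             # (1, k, d)
    queries = features.unsqueeze(0)                      # (1, n, d)
    attn = nn.MultiheadAttention(embed_dim=d, num_heads=num_heads, batch_first=True)
    guided, _ = attn(queries, anchors, anchors)           # cross-attention
    return guided.squeeze(0)                              # (n, d) guided features

feats = torch.randn(500, 64)   # per-correspondence features (placeholder)
conf = torch.rand(500)         # per-correspondence confidence scores (placeholder)
out = consensus_guidance(feats, conf)
```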
Keyword :
Artificial intelligence
Cite:
GB/T 7714: Guo, Junwen, Xiao, Guobao, Wang, Shiping, et al. Graph Context Transformation Learning for Progressive Correspondence Pruning [C]. 2024: 1968-1975.
MLA: Guo, Junwen, et al. "Graph Context Transformation Learning for Progressive Correspondence Pruning." (2024): 1968-1975.
APA: Guo, Junwen, Xiao, Guobao, Wang, Shiping, & Yu, Jun. Graph Context Transformation Learning for Progressive Correspondence Pruning. (2024): 1968-1975.
Abstract :
Deep learning-based clustering methods, especially those incorporating deep generative models, have recently shown noticeable improvement on many multimedia benchmark datasets. However, existing generative models still suffer from unstable training and vanishing gradients, which results in the inability to learn desirable embedded features for clustering. In this paper, we aim to tackle this problem by exploring the capability of Wasserstein embedding in learning representative embedded features and introducing a new clustering module for jointly optimizing embedding learning and clustering. To this end, we propose Wasserstein embedding clustering (WEC), which integrates robust generative models with clustering. By directly minimizing the discrepancy between the prior and the marginal distribution, we transform the optimization problem of the Wasserstein distance from the original data space into the embedding space, which differs from other generative approaches that optimize in the original data space. Consequently, this naturally allows us to construct a joint optimization framework with the designed clustering module in the embedding layer. Due to the substitutability of the penalty term in Wasserstein embedding, we further propose two types of deep clustering models by selecting different penalty terms. Comparative experiments conducted on nine publicly available multimedia datasets against several state-of-the-art methods demonstrate the effectiveness of our method.
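The key idea above, matching the embedded distribution to a prior directly in the embedding space, can be sketched with a kernel-based discrepancy such as MMD, one common penalty choice in Wasserstein-style autoencoders. This is an assumed PyTorch illustration, not the paper's exact penalty term.

```python
import torch

def rbf_mmd(x, y, sigma=1.0):
    """Maximum mean discrepancy with an RBF kernel, one possible penalty that
    matches the embedded (marginal) distribution to a prior in embedding space."""
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

# Encoded batch vs. samples drawn from a Gaussian prior (placeholders).
z = torch.randn(128, 10)        # encoder output
z_prior = torch.randn(128, 10)  # samples from the prior
penalty = rbf_mmd(z, z_prior)
```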
Keyword :
auto-encoder; clustering analysis; Clustering methods; Data models; Decoding; Deep learning; Generative adversarial networks; generative models; Task analysis; Training; Unsupervised learning; Wasserstein embedding
Cite:
GB/T 7714: Cai, Jinyu, Zhang, Yunhe, Wang, Shiping, et al. Wasserstein Embedding Learning for Deep Clustering: A Generative Approach [J]. IEEE Transactions on Multimedia, 2024, 26: 7567-7580.
MLA: Cai, Jinyu, et al. "Wasserstein Embedding Learning for Deep Clustering: A Generative Approach." IEEE Transactions on Multimedia 26 (2024): 7567-7580.
APA: Cai, Jinyu, Zhang, Yunhe, Wang, Shiping, Fan, Jicong, & Guo, Wenzhong. Wasserstein Embedding Learning for Deep Clustering: A Generative Approach. IEEE Transactions on Multimedia, 2024, 26, 7567-7580.
Abstract :
In this article, we propose a novel pyramid transformer network (PT-Net) for feature matching problems. Recent studies have used the dense motion field to transform unordered correspondences into ordered motion vectors and have used convolutional neural networks (CNNs) to extract deep features. However, the limited receptive field of CNNs restricts the network's ability to capture global information within the motion field. To tackle this limitation, we devise a pyramid transformer (PT) block to enhance the model's ability to extract both local and global information from the motion field; it fuses multiscale motion field information by constructing a pyramid-structured motion field. Furthermore, to alleviate the high memory demands of spatial attention in the transformer, we introduce dilated sparse attention (DSA), a novel attention block that reduces the computational cost of multihead self-attention (MHSA) through regular interval sampling and deconvolution operations and focuses on the essential regions to establish long-range dependencies between the correct motion vectors. The proposed PT-Net is effective in inferring the probabilities of correspondences being inliers or outliers while simultaneously estimating the essential matrix. Extensive experiments demonstrate that PT-Net outperforms state-of-the-art methods for outlier removal and camera pose estimation on different datasets, including YFCC100M and SUN3D. The code is available at https://github.com/gongzhepeng/PT-Net.
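A rough, assumption-laden PyTorch sketch of the dilated sparse attention idea (regular-interval sampling, attention on the sparse sequence, deconvolution back to full resolution) might look as follows; the class name, stride, and residual connection are illustrative and not taken from the paper's released code.

```python
import torch
import torch.nn as nn

class DilatedSparseAttention(nn.Module):
    """Toy take on attention over a regularly sub-sampled sequence followed by
    a deconvolution that restores the original resolution."""
    def __init__(self, dim, stride=4, num_heads=4):
        super().__init__()
        self.stride = stride
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.up = nn.ConvTranspose1d(dim, dim, kernel_size=stride, stride=stride)

    def forward(self, x):                       # x: (batch, length, dim)
        sparse = x[:, ::self.stride, :]         # regular-interval sampling
        ctx, _ = self.attn(sparse, sparse, sparse)
        ctx = self.up(ctx.transpose(1, 2))      # deconvolve along the length axis
        return ctx.transpose(1, 2)[:, :x.size(1), :] + x   # residual connection

x = torch.randn(2, 256, 64)                     # placeholder motion-field features
y = DilatedSparseAttention(64)(x)
```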
Keyword :
Camera pose estimation; Cameras; deep learning; Feature extraction; feature matching; Memory management; outlier removal; Pose estimation; Task analysis; Transformers; Vectors
Cite:
GB/T 7714: Gong, Zhepeng, Xiao, Guobao, Shi, Ziwei, et al. PT-Net: Pyramid Transformer Network for Feature Matching Learning [J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73.
MLA: Gong, Zhepeng, et al. "PT-Net: Pyramid Transformer Network for Feature Matching Learning." IEEE Transactions on Instrumentation and Measurement 73 (2024).
APA: Gong, Zhepeng, Xiao, Guobao, Shi, Ziwei, Wang, Shiping, & Chen, Riqing. PT-Net: Pyramid Transformer Network for Feature Matching Learning. IEEE Transactions on Instrumentation and Measurement, 2024, 73.
Abstract :
Multi-view data containing complementary and consensus information can facilitate representation learning by exploiting the intact integration of multi-view features. Because most objects in the real world often have underlying connections, organizing multi-view data as heterogeneous graphs is beneficial for extracting latent information among different objects. Due to its powerful capability to gather information from neighborhood nodes, in this article we apply the Graph Convolutional Network (GCN) to cope with heterogeneous graph data originating from multi-view data, which is still under-explored in the field of GCN. In order to improve the quality of the network topology and alleviate the interference of noise yielded by graph fusion, some methods undertake sorting operations before the graph convolution procedure. These GCN-based methods generally sort and select the most confident neighborhood nodes for each vertex, such as picking the top-k nodes according to pre-defined confidence values. Nonetheless, this is problematic due to the non-differentiable sorting operators and inflexible graph embedding learning, which may result in blocked gradient computations and undesired performance. To cope with these issues, we propose a joint framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS), which consists of an adaptive graph fusion layer, a graph learning module, and a differentiable node selection scheme. MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network. The effectiveness of the proposed method is verified by rigorous comparisons with considerable state-of-the-art approaches on multi-view semi-supervised classification tasks, and the experimental results indicate that MGCN-DNS achieves promising performance on several benchmark multi-view datasets.
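One simple way to make neighbor selection differentiable, in the spirit described above, is to replace hard top-k picking with a temperature softmax over edge confidence scores so gradients flow through the selection. The PyTorch sketch below is an assumed illustration, not MGCN-DNS's actual selection scheme.

```python
import torch

def soft_neighbor_selection(adj, scores, tau=0.5):
    """Differentiable alternative to hard top-k neighbor picking: reweight the
    adjacency by a temperature softmax over confidence scores so that the
    'selection' remains differentiable end to end."""
    # scores: (n, n) confidence for each candidate edge; adj masks non-edges.
    masked = scores.masked_fill(adj == 0, float('-inf'))
    weights = torch.softmax(masked / tau, dim=-1)     # soft selection per row
    return torch.nan_to_num(weights, nan=0.0)         # rows with no neighbors -> 0

adj = (torch.rand(6, 6) > 0.5).float()                # placeholder graph
scores = torch.randn(6, 6)                            # placeholder confidences
soft_adj = soft_neighbor_selection(adj, scores)
```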
Keyword :
differentiable node selection; graph convolutional network; Multi-view learning; semi-supervised classification
Cite:
GB/T 7714: Chen, Zhaoliang, Fu, Lele, Xiao, Shunxin, et al. Multi-View Graph Convolutional Networks with Differentiable Node Selection [J]. ACM Transactions on Knowledge Discovery from Data, 2024, 18(1).
MLA: Chen, Zhaoliang, et al. "Multi-View Graph Convolutional Networks with Differentiable Node Selection." ACM Transactions on Knowledge Discovery from Data 18.1 (2024).
APA: Chen, Zhaoliang, Fu, Lele, Xiao, Shunxin, Wang, Shiping, Plant, Claudia, & Guo, Wenzhong. Multi-View Graph Convolutional Networks with Differentiable Node Selection. ACM Transactions on Knowledge Discovery from Data, 2024, 18(1).
Abstract :
Heterogeneous graph neural networks play a crucial role in discovering discriminative node embeddings and relations from multi-relational networks. One of the key challenges in heterogeneous graph learning lies in designing learnable meta-paths, which significantly impact the quality of learned embeddings. In this paper, we propose an Attributed Multi-Order Graph Convolutional Network (AMOGCN), which automatically explores meta-paths that involve multi-hop neighbors by aggregating multi-order adjacency matrices. The proposed model first constructs different orders of adjacency matrices from manually designed node connections. Next, AMOGCN fuses these various orders of adjacency matrices to create an intact multi-order adjacency matrix. This process is supervised by the node semantic information, which is extracted from the node homophily evaluated by attributes. Eventually, we employ a one-layer simplifying graph convolutional network with the learned multi-order adjacency matrix, which is equivalent to the cross-hop node information propagation with multilayer graph neural networks. Substantial experiments reveal that AMOGCN achieves superior semi-supervised classification performance compared with state-of-the-art competitors.
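As a hedged illustration of aggregating multi-order adjacency matrices and propagating features once (simplified-GCN style), the PyTorch sketch below fuses several powers of the normalized adjacency with softmax-normalized weights; the weighting scheme and normalization are assumptions, not AMOGCN's exact construction.

```python
import torch

def multi_order_propagation(adj, x, weights):
    """Fuse several powers of a normalized adjacency with learnable weights,
    then propagate node features in a single step."""
    n = adj.size(0)
    a_hat = adj + torch.eye(n)                              # add self-loops
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
    orders = [torch.matrix_power(a_norm, k + 1) for k in range(len(weights))]
    fused = sum(w * a_k for w, a_k in zip(torch.softmax(weights, 0), orders))
    return fused @ x                                        # one propagation step

adj = (torch.rand(8, 8) > 0.6).float()
adj = ((adj + adj.t()) > 0).float()                         # symmetrize placeholder graph
x = torch.randn(8, 16)                                      # node attributes
w = torch.nn.Parameter(torch.zeros(3))                      # weights over orders 1..3
h = multi_order_propagation(adj, x, w)
```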
Keyword :
Graph convolutional networks; Heterogeneous graphs; Multi-order adjacency matrix; Semi-supervised classification
Cite:
GB/T 7714: Chen, Zhaoliang, Wu, Zhihao, Zhong, Luying, et al. Attributed Multi-Order Graph Convolutional Network for Heterogeneous Graphs [J]. Neural Networks, 2024, 174.
MLA: Chen, Zhaoliang, et al. "Attributed Multi-Order Graph Convolutional Network for Heterogeneous Graphs." Neural Networks 174 (2024).
APA: Chen, Zhaoliang, Wu, Zhihao, Zhong, Luying, Plant, Claudia, Wang, Shiping, & Guo, Wenzhong. Attributed Multi-Order Graph Convolutional Network for Heterogeneous Graphs. Neural Networks, 2024, 174.
Abstract :
Multi-view subspace approaches have been extensively studied for their ability to project data onto a low-dimensional space, which favours the clustering task. However, most existing models mainly concentrate on reconstructing data from the sample space, neglecting crucial information from the feature space and failing to learn an optimal representation. To address this issue, we present a new joint framework, dubbed low-rank tensor learning with projection distance metric. This model recovers the original data by learning two low-rank factors, which thoroughly exploits the essential data information. Specifically, a low-rank constraint is introduced on a tensor that integrates the subspace representations of all views, enabling it to capture high-order relationships among views while recovering data from the sample space. Meanwhile, a low-rank projection matrix calculated by decomposing the original features is utilized to enhance data structures by exploring relationships among feature dimensions. Additionally, a distance metric learned by the projection matrix is introduced to leverage the local structure embedded in samples, thereby encouraging the learned representation to be more discriminative. Extensive experimental results on six datasets indicate the superiority of the proposed model.
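A common proximal step for imposing a low-rank constraint on (an unfolding of) the stacked representation tensor is singular value thresholding; the PyTorch sketch below shows this generic step under assumed shapes and is not the paper's full optimization procedure.

```python
import torch

def singular_value_thresholding(m, tau):
    """Soft-threshold singular values -- the proximal step commonly used to
    impose a low-rank constraint on a matrix (or a tensor unfolding)."""
    u, s, vh = torch.linalg.svd(m, full_matrices=False)
    s_shrunk = torch.clamp(s - tau, min=0.0)
    return u @ torch.diag(s_shrunk) @ vh

# Stack per-view subspace representations into a tensor and shrink one unfolding.
views = [torch.randn(50, 50) for _ in range(3)]    # placeholder view representations
tensor = torch.stack(views)                        # (views, n, n)
unfolded = tensor.reshape(3, -1)                   # one mode unfolding
low_rank = singular_value_thresholding(unfolded, tau=0.1).reshape(3, 50, 50)
```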
Keyword :
Low-rank tensor; Multi-view learning; Projection distance; Representation learning; Subspace clustering
Cite:
GB/T 7714: Huang, Sujia, Fu, Lele, Du, Shide, et al. Low-rank tensor learning with projection distance metric for multi-view clustering [J]. International Journal of Machine Learning and Cybernetics, 2024.
MLA: Huang, Sujia, et al. "Low-rank tensor learning with projection distance metric for multi-view clustering." International Journal of Machine Learning and Cybernetics (2024).
APA: Huang, Sujia, Fu, Lele, Du, Shide, Wu, Zhihao, Vasilakos, Athanasios V., & Wang, Shiping. Low-rank tensor learning with projection distance metric for multi-view clustering. International Journal of Machine Learning and Cybernetics, 2024.
Abstract :
With the rapid growth of activities on the web, large amounts of interaction data on multimedia platforms are easily accessible, including e-commerce, music sharing, and social media. By discovering the various interests of users, recommender systems can improve user satisfaction without accessing overwhelming personal information. Compared to graph-based models, hypergraph-based collaborative filtering has the ability to model higher-order relations besides pair-wise relations among users and items, where the hypergraph structures are mainly obtained from specialized data or external knowledge. However, such well-constructed hypergraph structures are often not readily available in every situation. To this end, we first propose a novel framework named HGRec, which can enhance recommendation via automatic hypergraph generation. By exploiting a clustering mechanism based on user/item similarity, we group users and items without additional knowledge for hypergraph structure learning and design a cross-view recommendation module to alleviate the combinatorial gaps between the representations of the local ordinary graph and the global hypergraph. Furthermore, we devise a sparse optimization strategy to ensure the effectiveness of hypergraph structures, where a novel integration of the ℓ2,1-norm and the optimal transport framework is designed for hypergraph generation. We term the model HGRec with the sparse optimization strategy HGRec++. Extensive experiments on public multi-domain datasets demonstrate the superiority of our HGRec++, which gains average improvements of 8.1% and 9.8% over state-of-the-art baselines in terms of the Recall and NDCG metrics, respectively.
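Two ingredients mentioned above, turning similarity-based clusters into hyperedges and penalizing the incidence matrix with an ℓ2,1 norm, can be sketched as follows in PyTorch. The cluster assignments are placeholders and the helper names are illustrative, not HGRec++'s implementation.

```python
import torch

def l21_norm(h):
    """Row-wise l2,1 norm: sum over rows of the rows' l2 norms, a common
    sparsity penalty on a hypergraph incidence matrix."""
    return h.norm(dim=1).sum()

def incidence_from_clusters(assignments, num_clusters):
    """Build a node-by-hyperedge incidence matrix where each cluster found by
    a similarity-based grouping becomes one hyperedge."""
    n = assignments.numel()
    h = torch.zeros(n, num_clusters)
    h[torch.arange(n), assignments] = 1.0
    return h

assignments = torch.randint(0, 4, (20,))   # placeholder cluster ids for 20 users/items
H = incidence_from_clusters(assignments, 4)
penalty = l21_norm(H)
```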
Keyword :
graph convolutional network; hypergraph generation; Recommender systems; sparse optimization
Cite:
GB/T 7714: Lin, Zhenghong, Yan, Qishan, Liu, Weiming, et al. Automatic Hypergraph Generation for Enhancing Recommendation With Sparse Optimization [J]. IEEE Transactions on Multimedia, 2024, 26: 5680-5693.
MLA: Lin, Zhenghong, et al. "Automatic Hypergraph Generation for Enhancing Recommendation With Sparse Optimization." IEEE Transactions on Multimedia 26 (2024): 5680-5693.
APA: Lin, Zhenghong, Yan, Qishan, Liu, Weiming, Wang, Shiping, Wang, Menghan, Tan, Yanchao, et al. Automatic Hypergraph Generation for Enhancing Recommendation With Sparse Optimization. IEEE Transactions on Multimedia, 2024, 26, 5680-5693.
Abstract :
The attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite these advances, several significant challenges remain in fusing language and its nonverbal context information. One is to generate sparse attention coefficients associated with the acoustic and visual modalities, which helps locate critical emotional semantics. The other is fusing complementary cross-modal representations to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also directly complementary to the sentiment words. Experimental results show that the authors' method achieves state-of-the-art performance on several multimodal affective analysis datasets.
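The conditional fusion idea, letting sentiment-word features query a nonverbal sequence through cross-modal attention, can be sketched as below with PyTorch's nn.MultiheadAttention; the dimensions and names are assumptions, and the CNN-augmented transformer itself is not reproduced here.

```python
import torch
import torch.nn as nn

def conditioned_cross_attention(text, nonverbal, num_heads=4):
    """Toy conditional fusion: sentiment-word (text) features act as queries that
    attend over a nonverbal (acoustic or visual) sequence, so the located
    nonverbal cues complement the words."""
    d = text.size(-1)
    attn = nn.MultiheadAttention(d, num_heads, batch_first=True)
    fused, weights = attn(text, nonverbal, nonverbal)   # cross-modal attention
    return fused, weights

text = torch.randn(2, 12, 64)    # (batch, words, dim) placeholder
audio = torch.randn(2, 50, 64)   # (batch, frames, dim) placeholder
fused, w = conditioned_cross_attention(text, audio)
```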
Keyword :
affective computing; data fusion; information fusion; multimodal approaches
Cite:
GB/T 7714: Wang, Jianwen, Wang, Shiping, Xiao, Shunxin, et al. Conditional selection with CNN augmented transformer for multimodal affective analysis [J]. CAAI Transactions on Intelligence Technology, 2024, 9(4): 917-931.
MLA: Wang, Jianwen, et al. "Conditional selection with CNN augmented transformer for multimodal affective analysis." CAAI Transactions on Intelligence Technology 9.4 (2024): 917-931.
APA: Wang, Jianwen, Wang, Shiping, Xiao, Shunxin, Lin, Renjie, Dong, Mianxiong, & Guo, Wenzhong. Conditional selection with CNN augmented transformer for multimodal affective analysis. CAAI Transactions on Intelligence Technology, 2024, 9(4), 917-931.
Abstract :
The Graph Convolutional Network (GCN) has drawn widespread attention in data mining on graphs due to its outstanding performance and rigorous theoretical guarantees. However, some recent studies have revealed that GCN-based methods may mine latent information insufficiently owing to the underutilization of the feature space. Besides, the unlearnable topology also significantly imperils the performance of GCN-based methods. In this paper, we conduct experiments to investigate these issues, finding that GCN does not fully consider the potential structure in the feature space and that a fixed topology deteriorates the robustness of GCN. Thus, it is desirable to distill node features and establish a learnable graph. Motivated by this goal, we propose a framework dubbed Graph Convolutional Network with elastic topology (GCNet). Based on the analysis of the optimization of the proposed flexible Laplacian embedding, GCNet is naturally constructed from alternating graph convolutional layers and adaptive topology learning layers. GCNet aims to deeply explore the feature space and employ the mined information to construct a learnable topology, which leads to a more robust graph representation. In addition, a set-level orthogonal loss is utilized to meet the orthogonal constraint required by the flexible Laplacian embedding and promote better class separability. Moreover, comprehensive experiments indicate that GCNet achieves remarkable performance and generalization on several real-world datasets.
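An orthogonal penalty of the kind mentioned above is often written as the Frobenius distance between the embedding's Gram matrix and the identity; the PyTorch sketch below shows this generic form under that assumption, not necessarily the paper's exact set-level loss.

```python
import torch

def orthogonal_loss(z):
    """Penalize deviation of the embedding's Gram matrix from the identity,
    a common way to encourage the orthogonality required by a flexible
    Laplacian embedding."""
    gram = z.t() @ z
    eye = torch.eye(z.size(1))
    return ((gram - eye) ** 2).sum()

z = torch.randn(100, 7)   # placeholder embedding for 100 nodes
loss = orthogonal_loss(z)
```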
Keyword :
Graph convolutional networks; Learnable topology; Orthogonal constraint; Semi-supervised classification
Cite:
GB/T 7714: Wu, Zhihao, Chen, Zhaoliang, Du, Shide, et al. Graph Convolutional Network with elastic topology [J]. Pattern Recognition, 2024, 151.
MLA: Wu, Zhihao, et al. "Graph Convolutional Network with elastic topology." Pattern Recognition 151 (2024).
APA: Wu, Zhihao, Chen, Zhaoliang, Du, Shide, Huang, Sujia, & Wang, Shiping. Graph Convolutional Network with elastic topology. Pattern Recognition, 2024, 151.