ISSN 1673-159X

CN 51-1686/N

End-to-End Clustering Method Based on a Heterogeneous Graph Auto-Encoder

An End-To-End Heterogeneous Graph Clustering Method

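  • Abstract: Heterogeneous graph clustering is a fundamental and difficult task in data mining. Completing the clustering while preserving the structural information of the heterogeneous graph is a challenge. To address this, the article proposes an end-to-end heterogeneous graph clustering method that jointly optimizes the process of learning heterogeneous graph node representations and the clustering process: a heterogeneous graph auto-encoder models the heterogeneous graph and learns its node representations, while an auxiliary distribution is constructed to guide the generation of node representations in a clustering-oriented way. As a result, the learned node representations not only preserve the structural information of the heterogeneous graph but also gradually separate in the vector space, which achieves the clustering. Experimental results show that, compared with traditional methods that learn node representations and clusters separately, the joint learning method achieves better clustering performance on metrics such as F1 score, normalized mutual information (NMI), and adjusted Rand index (ARI).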

     

    Abstract: Heterogeneous graph clustering is a fundamental and difficult task in data mining. Completing the clustering process while preserving the structural information of heterogeneous graphs is a major challenge. We therefore propose an end-to-end heterogeneous graph clustering method that jointly optimizes the learning of heterogeneous graph node representations and the clustering process. Specifically, we use a heterogeneous graph auto-encoder to model the heterogeneous graph and learn its node representations. At the same time, we construct a clustering-oriented auxiliary distribution to guide the generation of the node representations. As a result, the learned node representations not only preserve the structural information of the heterogeneous graph but also gradually separate in the vector space, which yields the clusters. Experimental results show that jointly learning node representations and clusters outperforms the traditional approach of learning them separately in terms of F1 score, normalized mutual information (NMI), and adjusted Rand index (ARI).
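The abstract does not give the form of the auxiliary distribution or of the joint objective. As a rough illustration only, the sketch below assumes a common choice from deep embedded clustering: a Student's t kernel for the soft assignments, a squared and renormalized target distribution, and a KL-divergence term added to the reconstruction loss. All function names, the weight `gamma`, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math

def soft_assign(embeddings, centers):
    """Soft assignment q[i][k] of node i to cluster k (Student's t kernel, an assumption)."""
    q = []
    for z in embeddings:
        weights = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c)))
                   for c in centers]
        total = sum(weights)
        q.append([w / total for w in weights])
    return q

def target_distribution(q):
    """Auxiliary distribution p: square q, divide by soft cluster size, renormalize each row."""
    n_clusters = len(q[0])
    freq = [sum(row[k] for row in q) for k in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[k] ** 2 / freq[k] for k in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all nodes and clusters."""
    return sum(pk * math.log(pk / qk)
               for p_row, q_row in zip(p, q)
               for pk, qk in zip(p_row, q_row) if pk > 0)

def joint_loss(reconstruction_loss, p, q, gamma=1.0):
    """Hypothetical joint objective: auto-encoder reconstruction loss plus weighted clustering loss."""
    return reconstruction_loss + gamma * kl_divergence(p, q)

# Toy node embeddings (stand-ins for auto-encoder outputs) and two cluster centers.
embeddings = [[0.0, 0.1], [0.2, 0.0], [2.0, 2.1], [1.9, 2.0]]
centers = [[0.0, 0.0], [2.0, 2.0]]

q = soft_assign(embeddings, centers)
p = target_distribution(q)
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
loss = joint_loss(0.5, p, q)  # 0.5 is a placeholder reconstruction loss
```

In a full training loop under these assumptions, the gradient of the joint loss would update both the encoder and the cluster centers, so the embeddings are pulled toward the sharpened targets while still reconstructing the graph.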

     
