Fuzzy clustering algorithm based on belief subcluster cutting
Abstract: The Belief Peaks Clustering (BPC) algorithm is a recent variant of the Density Peaks Clustering (DPC) algorithm that takes a fuzzy perspective: it uses fuzzy mathematics to describe the distribution characteristics and correlations of data. However, BPC computes belief values mainly from local data point information and does not examine the overall distribution and structure of the dataset, and its original allocation strategy is not robust. To address these problems, a fuzzy clustering algorithm based on Belief Subcluster Cutting (BSCC) is proposed, which combines belief peaks with the spectral method. First, the dataset is divided into many high-purity subclusters using local belief information. Second, each subcluster is treated as a new sample, and the spectral method is used to perform graph-cut clustering on the similarity relationships between subclusters, thereby coupling local and global information. Finally, the points in each subcluster are assigned to the cluster that contains the subcluster, which completes the final clustering. Compared with BPC, BSCC has a clear advantage on datasets with multiple-subcluster structure: its accuracy (ACC) is higher by 16.38 percentage points on the americanflag dataset and by 21.35 percentage points on the Car dataset. Clustering experiments on synthetic and real datasets show that BSCC is better overall than BPC and seven other clustering algorithms on three evaluation metrics: Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), and ACC.
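The three-stage structure described in the abstract (over-segment into subclusters, cut the subcluster graph with a spectral method, propagate labels back to points) can be illustrated with a minimal sketch. This is not the authors' BSCC implementation: the abstract does not give the belief-value computation or the subcluster similarity, so the sketch substitutes a plain k-means over-segmentation for the belief-based subclusters and a Gaussian kernel on subcluster centroids for the similarity. The function names (`farthest_point_init`, `kmeans`, `spectral_labels`, `subcluster_then_cut`) and parameters (`n_sub`, `n_clusters`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministically pick k spread-out rows of X as initial centers."""
    idx = [0]
    d = np.linalg.norm(X - X[0], axis=1)
    for _ in range(1, k):
        nxt = int(np.argmax(d))
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return X[idx].copy()


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations; an empty cluster keeps its previous center."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), centers


def spectral_labels(W, k):
    """Normalized spectral clustering on a symmetric affinity matrix W."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    U = vecs[:, -k:]                     # eigenvectors of the k largest
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    labels, _ = kmeans(U, k)
    return labels


def subcluster_then_cut(X, n_sub, n_clusters):
    # Stage 1 (stand-in): over-segment into many small subclusters.
    # The paper uses local belief information here; k-means is an assumption.
    sub_of_point, centers = kmeans(X, n_sub)

    # Stage 2: treat each subcluster as one sample and cut the subcluster
    # graph. The Gaussian kernel on centroids is an assumed similarity.
    D = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    D_nn = D.copy()
    np.fill_diagonal(D_nn, np.inf)
    sigma = np.median(D_nn.min(axis=1))  # scale from nearest-neighbor gaps
    W = np.exp(-(D ** 2) / (2.0 * sigma ** 2))
    sub_label = spectral_labels(W, n_clusters)

    # Stage 3: every point inherits the label of its subcluster.
    return sub_label[sub_of_point]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([
        rng.normal([0.0, 0.0], 0.5, size=(100, 2)),
        rng.normal([10.0, 0.0], 0.5, size=(100, 2)),
    ])
    print(np.bincount(subcluster_then_cut(X, n_sub=8, n_clusters=2)))
```

The design point the sketch tries to convey is that the spectral step runs on `n_sub` subclusters rather than on all points, so the eigendecomposition stays small while the final labels still reflect the global structure of the subcluster graph.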
Authors: DING Yu, ZHANG Hanlin, LUO Rong, MENG Hua (School of Mathematics, Southwest Jiaotong University, Chengdu, Sichuan 611756, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 4, pp. 1128-1138 (11 pages)
Funding: Fundamental Research Funds for the Central Universities (2682023ZTPY027).
Keywords: clustering analysis; density peaks clustering; belief peaks clustering; spectral clustering; belief subcluster; subcluster merging
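About the authors: DING Yu (born 1999), female, from Chengdu, Sichuan, master's student; research interests: machine learning, cluster analysis. ZHANG Hanlin (born 1998), male, from Wusheng, Sichuan, master's student; research interests: machine learning, feature extraction and dimensionality reduction, cluster analysis. Corresponding author: LUO Rong (born 1980), male, from Bazhong, Sichuan, associate professor, Ph.D.; research interests: algebraic coding, data mining; email: luorong@swjtu.edu.cn. MENG Hua (born 1982), male, from Xingtai, Hebei, associate professor, Ph.D., CCF member; research interests: interpretability of deep learning, topological data analysis, knowledge representation and reasoning.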