Clustering Data with Mass (数据质量聚类算法) Cited by: 3
Abstract: In clustering algorithms, the clustering centers determine the final clustering result, yet traditional partitioning clustering algorithms cannot locate the clustering centers accurately. Based on the data field, this paper proposes the new concept of data-mass clustering centers and presents a data mass clustering algorithm that locates the clustering centers in a single pass, with no iteration and no preset number of clusters. Seven groups of comparative experiments show that the proposed method locates clustering centers accurately, achieves good clustering results and stability, and outperforms traditional partitioning clustering algorithms and the density peaks clustering algorithm.

The clustering center has a great effect on the clustering result. In this paper, a new concept of data mass is proposed. The mass of data represents one of the inherent attributes of the data, and it may differ depending on the data mining perspective. Based on the concept of data mass, a new clustering algorithm, clustering data with mass, is put forward. The algorithm finds the clustering centers from two attributes of the data: the data mass and the data distance. It completes the clustering process with only one pass over the whole dataset. Experimental results show that the proposed algorithm finds the clustering centers accurately and obtains better clustering results than some typical clustering algorithms, such as K-means, K-medoids, and clustering by fast search and find of density peaks.
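The abstract does not define data mass or the center-selection rule, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that a point's mass is the Gaussian data-field potential it receives from all points, and that a point becomes a clustering center when it lies farther than a threshold from every heavier point. The names `data_mass_clustering`, `sigma`, and `delta_min` are hypothetical.

```python
import math


def data_mass_clustering(points, sigma=1.0, delta_min=3.0):
    """Illustrative sketch only; the mass and center rules are assumptions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed mass: Gaussian data-field potential summed over all points.
    mass = [sum(math.exp(-(d / sigma) ** 2) for d in row) for row in dist]

    # Visit points once, from heaviest to lightest (no iteration to convergence).
    order = sorted(range(n), key=lambda i: -mass[i])
    labels = [-1] * n
    centers = []
    for rank, i in enumerate(order):
        heavier = order[:rank]
        if heavier:
            nearest = min(heavier, key=lambda j: dist[i][j])
            if dist[i][nearest] <= delta_min:
                # Close to a heavier point: join that point's cluster.
                labels[i] = labels[nearest]
                continue
        # Heaviest point overall, or far from every heavier point: new center.
        labels[i] = len(centers)
        centers.append(i)
    return centers, labels


# Two well-separated groups; the number of clusters is not given in advance.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers, labels = data_mass_clustering(points)
print(sorted(centers))  # [4, 9]: the middle point of each group
```

In this sketch the number of clusters follows from the assumed `delta_min` threshold rather than from a preset count; how the paper itself selects centers from mass and distance is not stated in the abstract.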
Authors: LI Yan (李延); WANG Dakui (王大魁); GENG Jing (耿晶); WANG Shuliang (王树良) (School of Software, Beijing Institute of Technology, Beijing 100081, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China)
Source: Geomatics and Information Science of Wuhan University (《武汉大学学报(信息科学版)》), indexed in EI, CSCD, and the PKU Core list, 2019, No. 1, pp. 153-158 (6 pages)
Funding: National Natural Science Foundation of China (61472039); Specialized Research Fund for the Doctoral Program of Higher Education (20121101110036)
Keywords: data field; clustering; data mass; clustering center
About the authors: LI Yan, PhD candidate, works mainly on data mining. liy_007@126.com; Corresponding author: WANG Shuliang, PhD, professor. slwang2011@bit.edu.cn
  • Related literature

References: 1

Secondary references: 25

  • 1. A. Rodriguez and A. Laio, "Clustering by fast search and find of density peaks", Science, Vol.344, No.6191, pp.1492-1496, 2014.
  • 2. United Nations Global Pulse, Big Data for Development: Challenges & Opportunities, http://unglobalpulse.org/, 2012.
  • 3. C. Seife, "Big data: The revolution is digitized", Nature, Vol.518, pp.480-481, 2014.
  • 4. L. Einav and J. Levin, "Economics in the age of big data", Science, Vol.346, No.6210, pp.715, 2014.
  • 5. E.E. Schadt, M.D. Linderman, J. Sorenson, L. Lee and G.P. Nolan, "Computational solutions to large-scale data management and analysis", Nature Reviews Genetics, Vol.11, pp.647-657, 2010.
  • 6. S.L. Wang, W.Y. Gan, D.Y. Li and D.R. Li, "Data field for hierarchical clustering", International Journal of Data Warehousing and Mining, Vol.7, No.2, pp.43-63, 2011.
  • 7. A. Rajaraman and J.D. Ullman, Mining of Massive Datasets, Cambridge University Press, London, UK, 2011.
  • 8. R. Xu and D. Wunsch, "Survey of clustering algorithms", IEEE Transactions on Neural Networks, Vol.16, No.3, pp.645-678, 2005.
  • 9. C.C. Aggarwal and C.K. Reddy, Data Clustering: Algorithms and Applications, CRC Press, New York, USA, 2014.
  • 10. D.R. Li, S.L. Wang and D.Y. Li, Spatial Data Mining Theories and Applications (second edition), Science Press, Beijing, China, 2013.

Co-citing literature: 63

Co-cited literature: 31

Citing literature: 3

Secondary citing literature: 5
