A Hierarchical Clustering Method Based on Data Fields (一种基于数据场的层次聚类方法)

Cited by: 83
Abstract: Cluster analysis is an important research topic in statistics, pattern recognition, and data mining, with broad application prospects. The effectiveness and efficiency of existing clustering techniques, however, are somewhat limited by the huge amounts of data collected in databases. Inspired by field theory in physics, this paper presents a hierarchical clustering method based on data fields. The method carries the interaction between material particles, and its field description, over to an abstract data space: a field model describes the virtual interaction among data objects, and the dataset is partitioned hierarchically by iteratively simulating the interaction and movement of the objects in the field, so that the objects aggregate in a self-organizing way. Experimental results show that the method gives good clustering quality, does not depend on careful tuning of user-supplied parameters, can discover non-spherical clusters of arbitrary size and density, is insensitive to noise, and has a time complexity approximately linear in the size of the dataset.
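The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes a Gaussian-like potential, unit initial masses, a fixed step size, and a simple distance threshold for merging; the names `potential`, `step`, `cluster`, `sigma`, `rate`, and `merge_dist` are all hypothetical and chosen for illustration.

```python
import math

def potential(x, objects, sigma=1.0):
    """Potential at position x: sum of mass * exp(-(distance / sigma)^2).

    The Gaussian-like form is an assumption made for this sketch.
    """
    total = 0.0
    for pos, mass in objects:
        d2 = sum((a - b) ** 2 for a, b in zip(x, pos))
        total += mass * math.exp(-d2 / sigma ** 2)
    return total

def step(objects, sigma=1.0, rate=0.1, merge_dist=0.05):
    """One simulated iteration: move each object up the potential
    gradient created by the other objects, then merge objects that
    end up closer than merge_dist (both parameters are assumptions)."""
    moved = []
    for i, (pos, mass) in enumerate(objects):
        grad = [0.0] * len(pos)
        for j, (other, m) in enumerate(objects):
            if i == j:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(pos, other))
            w = m * math.exp(-d2 / sigma ** 2) * 2.0 / sigma ** 2
            for k in range(len(pos)):
                grad[k] += w * (other[k] - pos[k])
        moved.append(([p + rate * g for p, g in zip(pos, grad)], mass))

    merged = []
    for pos, mass in moved:
        for idx, (mpos, mmass) in enumerate(merged):
            if math.dist(pos, mpos) < merge_dist:
                total = mass + mmass
                # Merged object sits at the mass-weighted centroid.
                centroid = [(a * mass + b * mmass) / total
                            for a, b in zip(pos, mpos)]
                merged[idx] = (centroid, total)
                break
        else:
            merged.append((pos, mass))
    return merged

def cluster(points, steps=20, **kwargs):
    """Run a fixed number of iterations starting from unit-mass objects."""
    objects = [(list(p), 1.0) for p in points]
    for _ in range(steps):
        objects = step(objects, **kwargs)
    return objects

points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
          (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
result = cluster(points)
for pos, mass in result:
    print([round(c, 3) for c in pos], mass)
```

In this toy run the two well-separated groups each collapse into a single object whose mass records how many points it absorbed. Recording the order of merges would give the hierarchy; the sketch omits that bookkeeping.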
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, No. 2, pp. 258-262 (5 pages)
Funding: National Natural Science Foundation of China (No. 60375016, No. 60496323)
Keywords: cluster analysis; hierarchical clustering; data field
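About the authors: Gan Wenyan (淦文燕), female, born in 1971 in Jiujiang, Jiangxi; postdoctoral researcher; main research interests are data mining, digital watermarking, and complex networks. E-mail: wenyangan@163.com. Li Deyi (李德毅), male, born in 1944 in Zhenjiang, Jiangsu; doctoral supervisor and academician of the Chinese Academy of Engineering; main research interests are artificial intelligence, data mining, command automation, and intelligent control.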