Abstract
The existing synchronization clustering method Sync treats every component (attribute) of a sample as a phase oscillator during synchronization. This gives it a high time complexity and severely limits its use on large-scale datasets. To address this problem, we propose a fast adaptive synchronization clustering method, FAKCS (fast adaptive KDE-based clustering by synchronization). FAKCS first compresses the large-scale dataset with a fast compression method based on the reduced set density estimator (RSDE) and the center-constrained minimal enclosing ball (CCMEB) technique. It then performs synchronization clustering on the compressed set, choosing the ε parameter adaptively with the Davies-Bouldin index and using a newly defined order parameter to measure the degree of local synchronization. We also study the relationship between the order parameter and kernel density estimation (KDE), which reveals, from a theoretical standpoint, the nature of the local synchronization of samples in terms of probability density. FAKCS can find clusters of arbitrary shape, number, and density in large-scale datasets without the number of clusters being specified in advance. Experiments on image segmentation and on large-scale UCI datasets verify the effectiveness of FAKCS.
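The pipeline outlined in the abstract (synchronize points within an ε-neighbourhood, monitor an order parameter, pick ε with the Davies-Bouldin index) can be illustrated with a minimal pure-Python sketch. This is not the authors' FAKCS implementation: it omits the RSDE/CCMEB compression step entirely, uses a Kuramoto-style sine coupling as an assumed form of the Sync update, and uses a simple exp(-distance) average as a stand-in for the paper's newly defined order parameter. All function names, tolerances, and the rule that scores singleton or single-cluster partitions as infinity are illustrative assumptions. The data are assumed to be scaled to roughly [0, 1] per dimension.

```python
import math

def neighbors(points, i, eps):
    """Indices of points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def sync_step(points, eps):
    """One synchronous update: every coordinate is pulled toward its
    eps-neighbours through a sine coupling (assumed Kuramoto-style form)."""
    new_points = []
    for i, p in enumerate(points):
        nb = neighbors(points, i, eps)
        new_points.append([
            p[d] + sum(math.sin(points[j][d] - p[d]) for j in nb) / len(nb)
            for d in range(len(p))
        ])
    return new_points

def order_parameter(points, eps):
    """Stand-in order parameter: mean of exp(-distance) over eps-neighbourhoods.
    It approaches 1 when every point coincides with all of its neighbours."""
    total = 0.0
    for i, p in enumerate(points):
        nb = neighbors(points, i, eps)
        total += sum(math.exp(-math.dist(p, points[j])) for j in nb) / len(nb)
    return total / len(points)

def sync_cluster(points, eps, max_iter=50, tol=1e-4):
    """Iterate sync_step until the order parameter is close to 1, then give
    points that collapsed to (almost) the same location the same label."""
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        pts = sync_step(pts, eps)
        if 1.0 - order_parameter(pts, eps) < tol:
            break
    labels, centers = [], []
    for p in pts:
        for k, c in enumerate(centers):
            if math.dist(p, c) < eps / 2:  # illustrative merge threshold
                labels.append(k)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels

def davies_bouldin(points, labels):
    """Davies-Bouldin index (lower is better). Partitions with fewer than two
    clusters, or with singleton clusters, are scored as infinity (a
    simplification made for this sketch)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    if len(groups) < 2 or any(len(g) < 2 for g in groups):
        return math.inf
    cents = [[sum(c) / len(g) for c in zip(*g)] for g in groups]
    scat = [sum(math.dist(p, c) for p in g) / len(g) for g, c in zip(groups, cents)]
    k = len(groups)
    return sum(
        max((scat[i] + scat[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i)
        for i in range(k)
    ) / k

def select_eps(points, candidates):
    """Return the candidate eps whose clustering has the lowest DB index."""
    scored = [(davies_bouldin(points, sync_cluster(points, e)), e) for e in candidates]
    return min(scored)[1]
```

Calling `select_eps(points, [0.05, 0.2, 2.0])` on a small, scaled dataset runs one synchronization per candidate and keeps the ε with the lowest index. In a setting like the one the abstract describes, `points` would be the compressed set rather than the full dataset, which is where the speed-up is reported to come from.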
Source
Journal of Computer Research and Development (《计算机研究与发展》)
Indexed in: EI, CSCD, Peking University Core Journals
2014, Issue 4, pp. 707-720 (14 pages)
Funding
National Natural Science Foundation of China (61272210, 61202311)
Natural Science Foundation of Jiangsu Province (BK2012552, BK2012209)
Keywords
kernel density estimation
minimal enclosing ball
synchronization
reduced set density estimator
clustering
About the authors
Ying Wenhao, born in 1979. PhD. Lecturer at the School of Computer Science and Engineering, Changshu Institute of Technology. His research interests include pattern recognition and intelligent computation (cslgywh@163.com).
Xu Min, born in 1980. PhD candidate at the School of Digital Media, Jiangnan University. Her research interests include pattern recognition and intelligent computation (xum@wxit.edu.cn).
Wang Shitong, born in 1964. Professor at the School of Digital Media, Jiangnan University. His research interests include artificial intelligence, pattern recognition, and bioinformatics (wxwangst@yahoo.com.cn).
Deng Zhaohong, born in 1981. PhD. Associate professor at the School of Digital Media, Jiangnan University. His research interests include fuzzy modeling and intelligent computation (dzh666828@yahoo.com.cn).