Abstract
Random selection of the initial cluster centers makes the clustering results of the K-means algorithm highly random and unstable, and leaves the algorithm prone to becoming trapped in a local optimum instead of reaching the global optimum. To address these problems, this paper proposes a K-means algorithm with optimized initial cluster centers, based on the product of the mean and the maximum distance. The algorithm first adds the data object farthest from the mean of the sample set to the set of cluster centers. It then successively adds the data object for which the product of its distances to the sample-set mean and to the current cluster centers is largest. Experimental results on standard data sets show that the proposed algorithm achieves higher accuracy than both the original K-means algorithm and another improved algorithm.
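The selection procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation. It assumes Euclidean distance, and it assumes the score of a candidate is its distance to the sample mean multiplied by its distances to every center chosen so far. The function name `select_initial_centers` and the helper `euclidean` are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_initial_centers(data: List[Point], k: int) -> List[int]:
    """Return the indices of k initial cluster centers.

    Assumed scoring rule (one interpretation of the abstract):
    score(x) = d(x, mean) * product of d(x, c) over all chosen centers c.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")

    dim = len(data[0])
    # Mean of the whole sample set, computed per coordinate.
    mean = [sum(p[j] for p in data) / n for j in range(dim)]
    dist_to_mean = [euclidean(p, mean) for p in data]

    # Step 1: the object farthest from the sample mean is the first center.
    first = max(range(n), key=lambda i: dist_to_mean[i])
    chosen = [first]

    # Running score: distance to the mean times distance to each chosen center.
    score = [dist_to_mean[i] * euclidean(data[i], data[first]) for i in range(n)]

    # Step 2: repeatedly add the not-yet-chosen object with the largest product.
    while len(chosen) < k:
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(candidates, key=lambda i: score[i])
        chosen.append(nxt)
        # Fold the distance to the new center into every running score.
        score = [score[i] * euclidean(data[i], data[nxt]) for i in range(n)]

    return chosen
```

The returned indices identify the data objects to use as starting centroids in place of the random draw in standard K-means. The selection is deterministic, so repeated runs on the same data start from the same centers, which is consistent with the stability motivation stated in the abstract.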
Source
Computer & Digital Engineering (《计算机与数字工程》)
2015, Issue 3, pp. 379-382 (4 pages)
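Funding
Supported by the 2013 Teaching Reform Project of the Guangdong Higher Vocational Education Teaching Steering Committee (No. XXJS-2013-2041)
and the Key Technology Application Project of Guangdong Songshan Polytechnic (广东松山职业技术学院) (No. 2012-JYKY-19)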
About the author
Duan Guiqin (段桂芹), female, master's degree, lecturer. Research interests: data mining and multimedia technology.