Clustering algorithm of big data based on improved artificial bee colony algorithm and MapReduce (cited by: 14)
Abstract: To address the low computational efficiency and low clustering performance of clustering algorithms on big data, this paper proposes a big data clustering algorithm based on an improved artificial bee colony (ABC) algorithm and MapReduce. The algorithm combines the grey wolf optimizer (GWO) with the ABC algorithm to improve the exploration and exploitation abilities of ABC at the same time, which effectively improves clustering performance. It uses a chaotic map and opposition-based learning as the initialization strategy for the ABC population to improve the quality of the solutions found by the search. The clustering algorithm is implemented on the MapReduce programming model of Hadoop and clusters big data by minimizing the sum of squared intra-cluster distances. Experimental results show that the proposed algorithm effectively improves clustering quality on large data sets and also speeds up clustering.
Authors: Sun Qian; Chen Hao; Li Chao (Informationization Management Department, Hubei University, Wuhan 430062, China; School of Computer Science & Information Engineering, Hubei University, Wuhan 430062, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 6, pp. 1707-1710, 1764 (5 pages in total)
Funding: Key Project of Science and Technology Research of the Hubei Provincial Department of Education (D20141005).
Keywords: data analysis; clustering algorithm; artificial bee colony algorithm (ABC); grey wolf optimizer algorithm (GWO); cloud computing; distributed computing
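About the authors: Sun Qian (b. 1980), female, from Wendeng, Shandong, senior experimentalist, master's degree, research interests: information security, system analysis and integration (sunqianaro@126.com); Chen Hao (b. 1977), male, professor, Ph.D., research interests: software engineering, intelligent computing; Li Chao (b. 1965), male, from Xinzhou, Hubei, senior experimentalist, research interests: information security, computer networks.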
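
The initialization strategy and the clustering objective named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which chaotic map is used, so the logistic map is assumed here, and the function names (`logistic_map_sequence`, `sse`, `init_population`) and all parameters are hypothetical. The GWO/ABC search loop and the Hadoop MapReduce job are left out.

```python
def logistic_map_sequence(length, seed=0.7, mu=4.0):
    """Return `length` chaotic values in [0, 1] from the logistic map.

    Assumption: the abstract only says "chaotic map"; the logistic map
    x <- mu * x * (1 - x) is one common choice, not confirmed by the source.
    """
    values = []
    x = seed
    for _ in range(length):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
        )
    return total


def init_population(points, k, pop_size, lower, upper, seed=0.7):
    """Build an initial population of candidate centroid sets.

    Each candidate is a list of k centroids. Chaotic values are scaled into
    [lower, upper] per dimension; for every candidate its opposite
    (lower + upper - value) is also generated, and the pop_size candidates
    with the smallest SSE are kept.
    """
    dim = len(lower)
    genes = k * dim
    chaos = logistic_map_sequence(pop_size * genes, seed)
    candidates = []
    for i in range(pop_size):
        flat = chaos[i * genes:(i + 1) * genes]
        solution = [
            [lower[d] + flat[c * dim + d] * (upper[d] - lower[d])
             for d in range(dim)]
            for c in range(k)
        ]
        # Opposition-based learning: mirror each coordinate within its bounds.
        opposite = [
            [lower[d] + upper[d] - solution[c][d] for d in range(dim)]
            for c in range(k)
        ]
        candidates.append(solution)
        candidates.append(opposite)
    candidates.sort(key=lambda s: sse(s, points))
    return candidates[:pop_size]


if __name__ == "__main__":
    data = [[1.0, 1.0], [1.2, 0.8], [9.0, 9.0], [8.8, 9.1]]
    population = init_population(data, k=2, pop_size=4,
                                 lower=[0.0, 0.0], upper=[10.0, 10.0])
    print(len(population), sse(population[0], data))
```

In this sketch each candidate solution is a full set of k centroids, so the fitness that the search would minimize is the same sum of squared intra-cluster distances that the abstract names as the clustering objective.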