Algorithm for learning centre of single class based on k-means and semi-supervised mechanism (基于k-means和半监督机制的单类中心学习算法)

Cited by: 4

Abstract: This paper presents single-means, an algorithm built on the k-means framework and a semi-supervised mechanism, to solve the problem of learning the centre of a single class. It targets mixed data sets in which the number of classes k is unknown and the samples of each class cluster around a centre. Because k-means is in essence an approximation of the Expectation-Maximization (EM) algorithm for a particular Gaussian mixture model, it is shown that, for a multi-class mixed data set randomly generated by that model, the algorithm starts from an initial centre obtained from randomly labeled samples of the target class and is guaranteed to converge to the true centre of that class. The algorithm is applied to learning the centre of a single text class. Experiments show that, given a small set of labeled texts from the target class, it effectively improves the initial class centre and is robust on real-world data that are sparse or have large variance.
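The abstract does not spell out the update rule, so the following is only a minimal sketch of one plausible reading, not the authors' single-means algorithm. It assumes a k-means-style hard assignment restricted to one class: the centre starts as the mean of the labeled samples, every point within a fixed distance of the current centre is treated as a member, and the centre is recomputed until it stops moving. The function name `refine_centre` and the parameters `radius`, `max_iter` and `tol` are hypothetical and do not come from the paper.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def refine_centre(data, seeds, radius, max_iter=100, tol=1e-9):
    """Refine the centre of one target class inside an unlabeled mixture.

    data   : list of vectors (labeled and unlabeled samples together)
    seeds  : small list of vectors known to belong to the target class
    radius : assumed membership threshold (illustrative, not from the paper)
    """
    centre = mean(seeds)  # initial centre from the labeled samples
    for _ in range(max_iter):
        # Hard assignment: keep only the points close to the current centre.
        members = [p for p in data if math.dist(p, centre) <= radius]
        if not members:
            break  # nothing within reach; keep the last centre
        new_centre = mean(members)
        shift = math.dist(new_centre, centre)
        centre = new_centre
        if shift < tol:
            break  # the centre has stopped moving
    return centre
```

Calling `refine_centre(data, seeds, radius)` with a handful of labeled samples returns the refined centre as a list of coordinates. A fixed radius is the simplest membership rule to illustrate the idea; on sparse, high-variance text data a similarity-based or probabilistic rule may be more appropriate.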

Authors: LI Zhisheng, SUN Yueheng, HE Pilian, HOU Yuexian

Affiliation: School of Computer Science and Technology, Tianjin University

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 10, pp. 2513-2516 (4 pages)

Funding: National Natural Science Foundation of China (60603027); Tianjin Applied Basic Research Program (05YFJMJC11700)

Keywords: k-means; single-class learning; semi-supervised learning; single-means

Classification code: TP301 [Automation and Computer Technology / Computer System Architecture]

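About the authors: LI Zhisheng (b. 1977), male, from Shanggao, Jiangxi, PhD candidate, main research interest: information retrieval (lzs_jeff@tom.com); SUN Yueheng (b. 1974), male, from Yantai, Shandong, lecturer, main research interest: text classification; HE Pilian (b. 1943), male, from Tianjin, professor and doctoral supervisor, main research interests: artificial intelligence and computer-aided education; HOU Yuexian (b. 1972), male, from Tianjin, associate professor, main research interests: machine learning and dimensionality reduction.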
