Algorithm for learning centre of single class based on k-means and semi-supervised mechanism (cited by: 4)
Abstract: This paper proposes single-means, an algorithm built on the k-means framework and a semi-supervised mechanism, to solve the problem of learning the centre of a single class when a mixed data set has an unknown number of classes k and its points accumulate around class centres. The k-means algorithm is essentially an approximation to the Expectation-Maximization (EM) algorithm on a particular Gaussian mixture model. For a multi-class mixed data set randomly generated by that model, single-means, starting from an initial centre randomly labelled within the target class, is shown to converge to the true centre of that class. Single-means was then applied to learning the centre of a single text class. The experiments show that, given a small labelled text set from the target class, the new algorithm effectively improves the initial class centre and is robust on real problems with sparse data and large variance.
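The abstract's statement that k-means approximates EM can be made concrete in the standard textbook setting. The sketch below assumes a mixture of k spherical Gaussians with equal mixing weights and a shared variance \sigma^2. The abstract does not say which mixture model the paper uses, so this choice, and the symbols x_i, \mu_j and \gamma_{ij}, are illustrative assumptions rather than the authors' derivation.

```latex
\begin{align}
% E-step: responsibility of component j for point x_i
\gamma_{ij} &= \frac{\exp\!\left(-\lVert x_i - \mu_j \rVert^2 / 2\sigma^2\right)}
                    {\sum_{l=1}^{k} \exp\!\left(-\lVert x_i - \mu_l \rVert^2 / 2\sigma^2\right)} \\
% M-step: responsibility-weighted mean
\mu_j &\leftarrow \frac{\sum_{i=1}^{n} \gamma_{ij}\, x_i}{\sum_{i=1}^{n} \gamma_{ij}} \\
% Hard-assignment limit (assuming no ties in distance)
\lim_{\sigma^2 \to 0} \gamma_{ij} &=
  \begin{cases}
    1, & j = \arg\min_{l} \lVert x_i - \mu_l \rVert^2 \\
    0, & \text{otherwise}
  \end{cases}
\end{align}
```

In that limit the E-step reduces to assigning each point to its nearest centre and the M-step reduces to recomputing each centre as the mean of its assigned points, which are the two steps of k-means.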
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2008, No. 10, pp. 2513-2516 (4 pages)
Funding: National Natural Science Foundation of China (60603027); Tianjin Applied Basic Research Program (05YFJMJC11700)
Keywords: k-means; single class learning; semi-supervised learning; single-means
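About the authors: Li Zhisheng (李志圣, born 1977), male, from Shanggao, Jiangxi, Ph.D. candidate, main research interest: information retrieval (lzs_jeff@tom.com); Sun Yueheng (孙越恒, born 1974), male, from Yantai, Shandong, lecturer, main research interest: text classification; He Pilian (何丕廉, born 1943), male, from Tianjin, professor and doctoral supervisor, main research interests: artificial intelligence and computer-aided education; Hou Yuexian (侯越先, born 1972), male, from Tianjin, associate professor, main research interests: machine learning and dimensionality reduction.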
Citing documents: 4

Second-level citing documents: 15
