
Kernel-Based Nonlinear Discriminant Method in Text Classification (cited by: 4)
Abstract: To address feature dimensionality reduction in text classification, the maximum scatter difference discriminant criterion is improved by introducing a kernel transformation as a preprocessing step, so that the criterion applies to a wider range of text classification settings. A kernel-based nonlinear discriminant method is proposed for text feature extraction. The kernel transformation resolves the poor linear separability that the scatter difference criterion encounters on text data. The feature dimension is reduced substantially while information loss is kept to a minimum. Text classification experiments show that, compared with the maximum scatter difference method without a kernel, the nonlinear method raises the F1 value by 4.7%, a clear advantage.
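The abstract gives no formulas, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes the kernel transformation is an empirical kernel map (each document is replaced by its vector of RBF kernel values against the training documents) and that the scatter difference criterion takes the common form J(w) = wᵀ(S_b − c·S_w)w, maximized by the leading eigenvectors of S_b − c·S_w. The function names, the RBF kernel, and the parameters `gamma` and `c` are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B (assumed kernel)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def scatter_matrices(Z, y):
    """Between-class (Sb) and within-class (Sw) scatter of the rows of Z."""
    n, d = Z.shape
    mean_all = Z.mean(axis=0)
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for label in np.unique(y):
        Zc = Z[y == label]
        mean_c = Zc.mean(axis=0)
        diff = (mean_c - mean_all)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
        centered = Zc - mean_c
        Sw += centered.T @ centered
    return Sb / n, Sw / n

def fit_kernel_msd(X, y, n_components=1, c=1.0, gamma=1.0):
    """Kernel map as preprocessing, then top eigenvectors of Sb - c*Sw."""
    K = rbf_kernel(X, X, gamma)            # empirical kernel map of the training set
    Sb, Sw = scatter_matrices(K, y)
    eigvals, eigvecs = np.linalg.eigh(Sb - c * Sw)   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :n_components]        # keep the largest ones

def transform(X_new, X_train, W, gamma=1.0):
    """Project new samples: kernel values against the training set, then W."""
    return rbf_kernel(X_new, X_train, gamma) @ W

# Toy XOR data: not linearly separable in the input space.
X_demo = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y_demo = np.array([0, 0, 1, 1])
W_demo = fit_kernel_msd(X_demo, y_demo, n_components=1, c=1.0, gamma=1.0)
z_demo = transform(X_demo, X_demo, W_demo, gamma=1.0)[:, 0]
print(z_demo)  # one extracted feature per sample
```

On the XOR toy data, the single extracted feature places the two classes on opposite sides of zero, which a linear projection of the raw inputs cannot do. Because no matrix inverse is involved, the criterion stays usable when S_w is singular, as is typical for high-dimensional text features.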
Source: Journal of Applied Sciences (《应用科学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2008, No. 6, pp. 627-631 (5 pages)
Funding: National Natural Science Foundation of China (No. 70571087)
Keywords: text categorization, feature extraction, scatter difference, kernel transformation
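About the author: LIU Haifeng (刘海峰), doctoral candidate, associate professor; research interests: text mining and statistical analysis. E-mail: liuhaifeng19620717@sina.com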

