
A Feature Selection Algorithm Based on Correlation (一种基于关联性的特征选择算法)  Cited by: 2
Abstract: The feature selection algorithms most commonly used in text categorization consider only the correlation between features and classes, and pay too little attention to the correlation among the features themselves. This paper proposes a new feature selection algorithm based on correlation analysis. The method uses information-theoretic measures as its basic tool and takes into account both computational cost and the objectivity of feature assessment. The algorithm identifies and discards redundant features while retaining class-relevant ones, and achieves good reduction results.
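Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2009, No. 8, pp. 259-261 (3 pages)
Keywords: feature selection; text categorization; feature correlation
About the author: Wang Weiling (王卫玲), master's degree. Research interests: Web mining, information retrieval, information filtering.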
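
The abstract describes keeping class-relevant features while discarding redundant ones by means of information-theoretic measures, but it does not say which measure or which search procedure the paper uses. The following is a minimal sketch under stated assumptions, not the paper's algorithm: it assumes discrete features, uses symmetric uncertainty, SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), as the correlation measure, and applies a greedy rule in the spirit of FCBF-style filters. The function names, the `threshold` parameter, and the toy term data are all hypothetical.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy H(X), in bits, of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * mutual_info / (hx + hy)


def select_features(features, labels, threshold=0.0):
    """Greedy relevance/redundancy filter (illustrative assumption).

    features: dict mapping feature name -> list of discrete values
    labels:   list of class labels, same length as each feature column
    """
    # Step 1: feature-class correlation (relevance).
    relevance = {name: symmetric_uncertainty(col, labels)
                 for name, col in features.items()}
    # Keep only features whose relevance exceeds the threshold,
    # ordered from most to least relevant.
    candidates = sorted((n for n in relevance if relevance[n] > threshold),
                        key=lambda n: relevance[n], reverse=True)
    # Step 2: feature-feature correlation (redundancy). A candidate is
    # dropped if an already kept feature correlates with it at least as
    # strongly as the candidate correlates with the class.
    selected = []
    for name in candidates:
        redundant = any(
            symmetric_uncertainty(features[kept], features[name])
            >= relevance[name]
            for kept in selected)
        if not redundant:
            selected.append(name)
    return selected


# Toy usage: binary term-occurrence columns over eight documents.
labels = ["sport"] * 4 + ["tech"] * 4
features = {
    "goal":  [1, 1, 1, 0, 0, 0, 0, 0],
    "score": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of "goal"
    "the":   [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the class
    "chip":  [0, 0, 0, 0, 0, 1, 1, 1],
}
selected = select_features(features, labels)
# "goal" and "chip" are kept; "score" is dropped as redundant and
# "the" is dropped as irrelevant.
```

The pairwise redundancy check costs at most one symmetric-uncertainty computation per (kept feature, candidate) pair, which is one plausible way to balance computational cost against a class-aware assessment of each feature; the paper itself may strike that balance differently.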

References (8)

  • 1. Chen Bin, Hong Jiarong, Wang Yadong. The optimal feature subset selection problem [J]. Chinese Journal of Computers (计算机学报), 1997, 20(2): 133-138. Cited by: 96
  • 2. Yu L, Liu H. Feature selection for high-dimensional data: a fast correlation-based filter solution [C]// Proceedings of the Twentieth International Conference on Machine Learning, 2003: 856-863.
  • 3. Lei Yu, Huan Liu. Efficient feature selection via analysis of relevance and redundancy [J]. Journal of Machine Learning Research, 2004(5): 1205-1224.
  • 4. Guyon I, Elisseeff A. An introduction to variable and feature selection [J]. Journal of Machine Learning Research, 2003(3): 1157-1182.
  • 5. Yi Wang, XiaoJing Wang. A new approach to feature selection in text classification [C]// Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 2005: 18-21.
  • 6. Fengxi Song, Shuhai Liu. A comparative study on text representation schemes in text categorization [J]. Pattern Anal Applic, 2007.
  • 7. Yu L, Liu H. FCBF-Feature selection for high-dimensional data [C]// Proceedings of the Twentieth International Conference on Machine Learning, Washington DC, USA, 2003: 856-863.
  • 8. Makrehchi M, Kamel M S. Text classification using small number of features [C]// Proc. of the 4th Int'l Conf. on Machine Learning and Data Mining, 2005: 580-589.
