
Multi-label-specific Feature Selection Method Based on Neighborhood Rough Set

Cited by: 15
Abstract: In multi-label learning, dimensionality reduction is an important and challenging task, and feature selection is an efficient dimensionality reduction technique. This paper proposes a multi-label-specific feature selection method based on neighborhood rough set theory. The method guarantees in theory that the label-specific features obtained are strongly correlated with their corresponding labels, which improves the reduction result. First, a reduction algorithm from rough set theory is applied to remove redundant attributes, yielding the label-specific features while keeping the classification ability unchanged. Then, building on the concepts of neighborhood accuracy and neighborhood roughness, the computations of dependency and attribute significance based on the neighborhood rough set are redefined, and the related properties of the model are discussed. Finally, a multi-label-specific feature selection model based on the neighborhood rough set is constructed, and the corresponding feature selection algorithm for multi-label classification is implemented. Simulation experiments on several public datasets indicate that the algorithm is effective.
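Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2018, No. 1, pp. 173-178 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (61772176, 61402153, 61370169, 61602158), the China Postdoctoral Science Foundation (2016M602247), the Key Scientific and Technological Project of Henan Province (162102210261), the Xinxiang Science and Technology Research Program (CXGG17002), and the Doctoral Research Start-up Fund of Henan Normal University (qd15132).
Keywords: Multi-label learning; Neighborhood rough set; Label-specific feature; Feature selection
About the authors: Sun Lin (born 1979), male, Ph.D., associate professor, CCF member; main research interests include granular computing, data mining, and bioinformatics; E-mail: sunlin@htu.edu.cn (corresponding author). Pan Junfang (born 1994), female, master's student; main research interests include multi-label learning and data mining. Zhang Xiaoyu (born 1993), female, master's student; main research interest is granular computing. Wang Wei (born 1975), male, Ph.D., lecturer; main research interest is bioinformatics. Xu Jiucheng (born 1964), male, Ph.D., professor, CCF senior member; main research interests include granular computing, data mining, and bioinformatics.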
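
The abstract outlines a per-label feature selection driven by a dependency measure defined on neighborhood rough sets. The block below is a minimal sketch of that general idea, not the authors' algorithm. It uses the classical neighborhood dependency (the fraction of samples whose neighborhood contains only samples with the same label value) as a stand-in for the paper's redefined dependency, whose formula is not given in this record. The function names, the Euclidean distance, the radius `delta=0.2`, the greedy forward search, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def neighborhood(X, i, feats, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the feature indices in feats (assumed metric)."""
    return [
        j for j in range(len(X))
        if sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in feats)) <= delta
    ]


def dependency(X, y, feats, delta):
    """Classical neighborhood dependency of one binary label y on feats:
    the share of samples whose neighborhood is pure with respect to y.
    This is a stand-in, not the redefined measure from the paper."""
    consistent = sum(
        all(y[j] == y[i] for j in neighborhood(X, i, feats, delta))
        for i in range(len(X))
    )
    return consistent / len(X)


def select_label_specific(X, y, delta=0.2, eps=1e-9):
    """Greedy forward selection for a single label: repeatedly add the
    feature with the largest dependency, stop when nothing improves it."""
    selected = []
    remaining = list(range(len(X[0])))
    best = dependency(X, y, selected, delta)
    while remaining:
        scores = [(dependency(X, y, selected + [f], delta), f) for f in remaining]
        # Highest dependency wins; ties go to the lowest feature index.
        top, f_star = max(scores, key=lambda s: (s[0], -s[1]))
        if top - best <= eps:
            break
        selected.append(f_star)
        remaining.remove(f_star)
        best = top
    return selected


def select_for_all_labels(X, Y, delta=0.2):
    """Run the single-label selection once per label column of Y."""
    return {
        k: select_label_specific(X, [row[k] for row in Y], delta)
        for k in range(len(Y[0]))
    }


# Toy data (hypothetical): features scaled to [0, 1]; label 0 follows
# feature 0, label 1 follows feature 1, feature 2 is constant.
X = [[0.0, 0.0, 0.5], [0.1, 1.0, 0.5], [0.9, 0.0, 0.5], [1.0, 1.0, 0.5]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(select_for_all_labels(X, Y))  # {0: [0], 1: [1]}
```

In this sketch each label ends up with its own feature subset, which is the sense of "label-specific" used in the abstract. A purely greedy search like this one can stop early when no single feature raises the dependency on its own, so it should be read as an illustration of the selection loop only.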