CLASS IMBALANCE DATA CLASSIFICATION ALGORITHM BASED ON CPD-SMOTE

Cited by: 7
Abstract: Class imbalance is common across application domains such as financial fraud detection, network intrusion detection, spam filtering, and medical testing. Applying traditional classification algorithms directly to such data yields low classification accuracy. To address the effect of class imbalance on classifiers, and building on the effectiveness of the traditional oversampling algorithm SMOTE (Synthetic Minority Oversampling Technique) in handling class imbalance, this paper proposes CPD-SMOTE, an improved SMOTE algorithm for classifying class-imbalanced datasets. CPD-SMOTE considers the features and location of each minority-class sample in the training set, together with the distribution of the samples around it, to determine that sample's strongly correlated neighbor set. This set is then used as the SMOTE nearest-neighbor set when generating new minority-class samples. Experimental results show that CPD-SMOTE improves on SMOTE, Borderline-SMOTE, ADASYN, and LN-SMOTE when handling imbalanced datasets.
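For orientation, the following is a minimal sketch of the baseline SMOTE interpolation step that the abstract builds on, not the authors' CPD-SMOTE. The abstract does not specify how the strongly correlated neighbor set is computed, so the sketch only marks where such a set would be substituted. The names `k_nearest`, `smote`, and `neighbor_fn` are hypothetical and chosen for illustration.

```python
import math
import random


def k_nearest(index, points, k):
    """Return indices of the k points closest to points[index] (Euclidean)."""
    dists = [
        (math.dist(points[index], p), j)
        for j, p in enumerate(points)
        if j != index
    ]
    dists.sort()
    return [j for _, j in dists[:k]]


def smote(minority, n_new, k=5, neighbor_fn=k_nearest, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation.

    neighbor_fn is the pluggable step (an assumption of this sketch):
    plain SMOTE uses the k nearest minority neighbors, whereas a method
    such as CPD-SMOTE would supply its own neighbor set here.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))   # pick a base minority sample
        j = rng.choice(neighbor_fn(i, minority, k))  # pick one neighbor
        gap = rng.random()                 # interpolation factor in [0, 1)
        base, other = minority[i], minority[j]
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class in two dimensions (made-up values for illustration).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2)
```

Each synthetic point lies on the line segment between a minority sample and one of its neighbors, so the quality of the neighbor set directly determines where new samples land. That is the step the abstract says CPD-SMOTE changes.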
Authors: Peng Ruxiang (彭如香); Yang Tao (杨涛); Kong Huafeng (孔华锋); Jiang Guoqing (姜国庆); Fan Yourong (凡友荣) (Third Research Institute of Ministry of Public Security, Shanghai 210204, China; Key Lab of Information Network Security, Shanghai 201204, China)
Source: Computer Applications and Software (《计算机应用与软件》), PKU Core Journal, 2018, No. 12, pp. 259-262, 268 (5 pages)
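Funding: National Key R&D Program of China project (2016YFC0800909); Ministry of Public Security special project for basic work on strengthening policing through science and technology (2018GBJC19); Shanghai Science and Technology Commission research project (17DZ1101004)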
Keywords: SMOTE; class imbalance; classification algorithm
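About the authors: Peng Ruxiang, assistant researcher, main research areas: information security and data mining. Yang Tao, associate researcher. Kong Huafeng, researcher. Jiang Guoqing, assistant researcher. Fan Yourong, assistant researcher.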
Citation statistics: 71 co-cited references; 7 citing documents; 33 second-level citing documents.
