Abstract
The quality of the test data set plays a vital role in the performance of an intrusion detection system. Reducing the dimensionality of the intrusion detection data set without sacrificing its quality is an important way to make the system run efficiently and accurately. K-nearest neighbor, decision tree, random forest, and Softmax classifiers are used to explore the feature dimensionality of the CSE-CIC-IDS2018 intrusion detection data set. The classifiers are trained on progressively fewer features, ranked by feature importance score, and the dependence of each machine learning classifier on the feature dimensionality of the data set is analyzed. The results show that when the number of features is reduced from 83 to as few as 7 to 9, the classifiers still maintain high classification performance, while detection time drops significantly and computational efficiency improves.
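The feature-decreasing procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses synthetic two-class data instead of CSE-CIC-IDS2018, a Fisher-style score in place of the paper's feature importance score, and a nearest-centroid classifier in place of the four classifiers studied. All function names (`make_data`, `importance_scores`, `feature_decreasing_run`) are hypothetical.

```python
import random
import time


def make_data(n=400, n_features=12, n_informative=3, seed=0):
    """Synthetic stand-in data: only the first n_informative features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def importance_scores(X, y):
    """Fisher-style score per feature (an assumed substitute for the paper's score):
    squared gap between class means divided by the summed class variances."""
    scores = []
    for j in range(len(X[0])):
        groups = {0: [], 1: []}
        for row, label in zip(X, y):
            groups[label].append(row[j])
        means = {c: sum(v) / len(v) for c, v in groups.items()}
        var = {c: sum((a - means[c]) ** 2 for a in v) / len(v)
               for c, v in groups.items()}
        scores.append((means[1] - means[0]) ** 2 / (var[0] + var[1] + 1e-12))
    return scores


def fit_centroids(X, y, cols):
    """Nearest-centroid 'training': mean of each class over the selected columns."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for i, j in enumerate(cols):
            acc[i] += row[j]
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row, cols):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
    return min(centroids, key=dist)


def feature_decreasing_run(X_train, y_train, X_test, y_test):
    """Rank features once, then retrain on the top-k features for decreasing k,
    recording accuracy and prediction time at each step."""
    scores = importance_scores(X_train, y_train)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    results = []
    for k in range(len(ranked), 0, -1):
        cols = ranked[:k]
        centroids = fit_centroids(X_train, y_train, cols)
        start = time.perf_counter()
        preds = [predict(centroids, row, cols) for row in X_test]
        elapsed = time.perf_counter() - start
        acc = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
        results.append((k, acc, elapsed))
    return ranked, results


if __name__ == "__main__":
    X, y = make_data()
    ranked, results = feature_decreasing_run(X[:300], y[:300], X[300:], y[300:])
    for k, acc, elapsed in results:
        print(f"features={k:2d}  accuracy={acc:.3f}  predict_time={elapsed:.5f}s")
```

Plotting accuracy and prediction time against the number of retained features shows where performance starts to fall off, which is the kind of dependence the abstract reports for the 83-feature data set.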
Authors
刘江豪
张安琳
黄子奇
黄道颖
陈孝文
LIU Jiang-hao; ZHANG An-lin; HUANG Zi-qi; HUANG Dao-ying; CHEN Xiao-wen (College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China; North Information Control Research Academy Group Co., Ltd., Nanjing 211153, China; Engineering Training Center, Zhengzhou University of Light Industry, Zhengzhou 450000, China)
Source
《火力与指挥控制》
CSCD
Peking University Core Journals
2021, No. 7, pp. 155-162 (8 pages)
Fire Control & Command Control
Funding
Supported by the National Key Technology R&D Program of China (2006BAK01A38).
Keywords
intrusion detection
data set
machine learning
classifier
dimension reduction
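About the authors
LIU Jiang-hao (b. 1996), male, from Neihuang, Henan; master's student. Research interests: computer networks, intelligent algorithms. Corresponding author: HUANG Zi-qi (b. 1996), male, from Zhengzhou, Henan; master's degree. Research interests: computer networks. Corresponding author: HUANG Dao-ying (b. 1967), male, from Xinyang, Henan; Ph.D., professor. Research interests: computer networks, intelligent algorithms, distributed computing.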