Abstract
Traditional phishing website detection techniques mainly select sensitive feature items at random or by experience, which cannot guarantee detection accuracy. To address this, we propose an improved information gain algorithm, IIGAIN (Improved Information Gain Algorithm), for selecting sensitive feature items of phishing websites. The algorithm takes the within-class dispersion of each feature item into account: it processes the difference in within-class dispersion of a feature item and uses the processed result as a penalty term to improve the information gain algorithm. Experimental results show that phishing website detection with feature items selected by IIGAIN is markedly more accurate than detection with randomly selected feature items.
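The idea of penalizing information gain by within-class dispersion can be illustrated with a minimal sketch. The abstract does not give the IIGAIN formula, so everything below beyond standard information gain is an assumption: the function names, the use of variance as the dispersion measure, the absolute difference between the two classes' dispersions as the penalty, and the `weight` parameter are all hypothetical and are not taken from the paper.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def information_gain(feature, labels):
    """Standard information gain of a discrete feature w.r.t. the labels."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        gain -= len(subset) / n * entropy(subset)
    return gain


def within_class_dispersion(feature, labels, cls):
    """Variance of the feature values inside one class (assumed measure)."""
    vals = [x for x, y in zip(feature, labels) if y == cls]
    if not vals:
        return 0.0
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def penalized_gain(feature, labels, weight=1.0):
    """Hypothetical penalized score, NOT the paper's IIGAIN formula:
    information gain minus weight * |difference of the within-class
    dispersions of the two classes (labels 1 and 0)|."""
    d_pos = within_class_dispersion(feature, labels, 1)
    d_neg = within_class_dispersion(feature, labels, 0)
    return information_gain(feature, labels) - weight * abs(d_pos - d_neg)


# Toy data: label 1 = phishing, 0 = legitimate; features are 0/1 indicators.
labels = [1, 1, 0, 0]
stable_feature = [1, 1, 0, 0]    # consistent inside each class
unstable_feature = [1, 0, 0, 0]  # scattered inside the phishing class

ranked = sorted(
    {"stable": stable_feature, "unstable": unstable_feature}.items(),
    key=lambda kv: penalized_gain(kv[1], labels),
    reverse=True,
)
```

In this toy setting the feature that is evenly distributed within each class keeps its full information gain, while the feature that is scattered inside one class is scored lower, which is the general effect a dispersion-based penalty term is meant to have.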
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2016, Issue 4, pp. 297-301 (5 pages)
Funding
Key Science and Technology Project of the Beijing Municipal Education Commission (KZ201411232036)
Keywords
Phishing website detection
Sensitive feature item
Information gain
Within-class dispersion
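About the authors
Wang Yan (王燕), master's student; main research area: network security.
Wang Xingfen (王兴芬), professor.
Ren Junling (任俊玲), associate professor.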