Funding: Supported by the National Key Research and Development Program of China (2018YFB1003702) and the National Natural Science Foundation of China (62072255).
Abstract: Software defect prediction (SDP) performs statistical analysis of historical defect data to find the distribution pattern of historical defects, so as to effectively predict defects in new software. However, software defect datasets contain redundant and irrelevant features that degrade the performance of defect predictors. To identify and remove these features, we propose ReliefF-based clustering (RFC), a cluster-based feature selection algorithm. The correlation between features is calculated based on symmetric uncertainty. According to the degree of correlation, RFC partitions the features into k clusters using the k-medoids algorithm, and finally selects representative features from each cluster to form the final feature subset. In the experiments, we compare the proposed RFC with classical feature selection algorithms on nine National Aeronautics and Space Administration (NASA) software defect prediction datasets in terms of area under the curve (AUC) and F-value. The experimental results show that RFC can effectively improve the performance of SDP.
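The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the features are already discretized, uses 1 - SU as the distance between features, initializes k-medoids with the first k features, and takes each cluster's medoid as its representative. The abstract does not state how representatives are chosen, and the ReliefF part of RFC is not shown. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # assumption: two constant features count as fully redundant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Returns a dict mapping each medoid index to the indices in its cluster.
    """
    medoids = list(range(k))  # assumption: first k items as initial medoids
    clusters = {}
    for _ in range(max_iter):
        # Assign every item to its nearest medoid (a medoid stays in its own cluster).
        clusters = {m: [] for m in medoids}
        for i in range(len(dist)):
            nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Re-pick each medoid as the member with the smallest total distance.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return clusters


def select_features(columns, k):
    """Cluster features by 1 - SU and keep one representative (the medoid) per cluster."""
    n = len(columns)
    dist = [
        [1.0 - symmetric_uncertainty(columns[i], columns[j]) for j in range(n)]
        for i in range(n)
    ]
    return sorted(k_medoids(dist, k))


# Toy data: features 0 and 1 carry the same information, as do features 2 and 3.
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
print(select_features(columns, k=2))  # [0, 2]
```

On the toy data, the two redundant pairs fall into separate clusters and one feature from each pair is kept.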
Funding: Supported by the National Natural Science Foundation of China (62072255).
Abstract: The rapid growth of mobile applications, together with the popularity and openness of the Android system, has attracted many hackers and even criminals, who are creating large amounts of Android malware. However, current Android malware detection methods require a lot of time in the feature engineering phase. Furthermore, these models suffer from a low detection rate, high complexity, and poor practicability. We analyze Android malware samples and the distribution of malware and benign software across application programming interface (API) calls, permissions, and other attributes. We classify the software's threat levels based on the correlation of features. We then propose deep neural networks and convolutional neural networks with ensemble learning (DCEL), a new classifier fusion model for Android malware detection. First, DCEL preprocesses the malware data to remove redundant data and converts the one-dimensional data into a two-dimensional gray image. Then, an ensemble learning approach combines the deep neural network with the convolutional neural network, and the final classification result is obtained by voting on the predictions of the individual classifiers. Experiments on the Drebin and Malgenome datasets show that, compared with current state-of-the-art models, the proposed DCEL achieves a higher detection rate, a higher recall rate, and a lower computational cost.
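Two of the steps named in this abstract, converting one-dimensional data into a two-dimensional gray image and voting over classifier predictions, can be sketched as follows. This is only an illustration under stated assumptions, not the DCEL implementation: the features are assumed to be 0/1 flags, the image is assumed to be square with zero padding, and the classifier outputs are stand-in labels rather than real network predictions. The function names are hypothetical.

```python
from collections import Counter
from math import isqrt


def vector_to_gray_image(features):
    """Reshape a 1-D binary feature vector into a square 2-D gray image.

    Assumption: each feature is a 0/1 flag (for example, "permission requested"),
    mapped to pixel intensity 0 or 255; the tail is padded with zeros.
    """
    if not features:
        return []
    side = isqrt(len(features) - 1) + 1  # smallest side with side * side >= len
    padded = list(features) + [0] * (side * side - len(features))
    return [
        [255 * v for v in padded[row * side:(row + 1) * side]]
        for row in range(side)
    ]


def majority_vote(predictions):
    """Return the label predicted by most classifiers (ties go to the first seen)."""
    return Counter(predictions).most_common(1)[0][0]


# Five hypothetical feature flags become a 3 x 3 image.
image = vector_to_gray_image([1, 0, 1, 1, 0])
for row in image:
    print(row)

# Stand-in outputs of three hypothetical base classifiers for one sample.
print(majority_vote(["malware", "benign", "malware"]))
```

Padding to a square is one simple layout choice; the abstract does not specify the image dimensions or the number of base classifiers that take part in the vote.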