Abstract
Existing methods for classifying and screening database information suffer from low accuracy and a high packet loss rate. This paper therefore proposes a classification and screening method for document database structure information based on particle swarm optimization (PSO) and deep neural networks (DNN). First, the SNR feature gene selection method is used to reduce the dimensionality of the feature attributes of the document database structure information, and the OCDD algorithm is used to discretize continuous data. The ratio of the mutual information to the information entropy of a class label and an attribute variable of the database structure information is taken as the discretization objective function, which is solved by a dynamic iterative programming method to obtain the optimal discretization partition. A Softmax information classifier is then placed on the top layer of the autoencoder, and the autoencoder weights are optimized by particle swarm optimization. After encoding, the Softmax classifier adjusts itself by gradient descent; its cost function also serves as an evaluation value while the encoder weights are adjusted, so that it guides the weight optimization together with the encoder's error function. Finally, the optimized weights and the Softmax classifier are used to classify and screen the document database structure information. Experimental results show that the proposed method has an average packet loss rate of 0.28%, good classification and screening accuracy, and reliable performance.
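The discretization objective described in the abstract can be illustrated with a small sketch. The abstract does not say which entropy appears in the denominator, so the joint entropy H(C, A) of the class label C and the discretized attribute A is used here purely as an assumption. The function names (`entropy`, `mi_entropy_ratio`, `discretize`) and the toy data are hypothetical, and the sketch only scores candidate cut points; it is not the OCDD algorithm or the dynamic iterative programming solver from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mi_entropy_ratio(labels, bins):
    """Ratio I(C; A) / H(C, A) for class labels C and a discretized attribute A.

    Using the joint entropy as the denominator is an assumption of this sketch.
    """
    h_joint = entropy(list(zip(labels, bins)))
    mutual_info = entropy(labels) + entropy(bins) - h_joint
    return mutual_info / h_joint if h_joint > 0 else 0.0


def discretize(values, cut_points):
    """Map each continuous value to the index of the interval it falls in."""
    return [sum(v > c for c in cut_points) for v in values]


# Toy data (hypothetical): one continuous attribute and two class labels.
values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = ["a", "a", "a", "b", "b", "b"]

# A cut at 0.5 separates the classes; a cut at 0.15 does not.
good = mi_entropy_ratio(labels, discretize(values, [0.5]))
poor = mi_entropy_ratio(labels, discretize(values, [0.15]))
print(good, poor)
```

Under this assumed objective, a partition that aligns with the class labels scores higher than one that does not, which is the property a search over cut points would exploit.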
Author
WANG Jian (王健), College of Computer Science and Information Technology, Daqing Normal University, Daqing, Heilongjiang 163712, China
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2019, No. 5, pp. 417-420, 444 (5 pages)
Keywords
Document database
Information
Classification
Screening
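About the author
WANG Jian (b. 1971), male, Han ethnicity, from Longjiang County, Heilongjiang Province; master's degree, associate professor. Research interests: database technology and its applications.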