Abstract
To address the structural redundancy and error migration found in classification decision tree algorithms, a soft clustering node split hierarchical model is proposed. Decision models are built at the leaf nodes and nodes are split by soft clustering, which partitions the sample space efficiently and yields a compact hierarchical model. A hierarchical discriminant method then predicts each sample by a weighted sum taken from the leaf nodes up to the root node of the hierarchy, which reduces the influence of model structure on classification performance and improves the model's ability to adjust for discriminant errors. Compared with three classification algorithms, CART, ID3 and C4.5, the proposed method builds a structurally simpler model and shows the best classification performance on both data sets, with F1-measures of 0.53 and 0.38 respectively. The experimental results indicate that the soft clustering node split hierarchical model can avoid redundant structure and alleviate the problem of error migration.
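The abstract describes prediction as a weighted sum taken from the leaf nodes up to the root, with nodes split by soft clustering. The sketch below is one possible reading of that idea, not the authors' implementation: the softmax-over-distance membership function, the `own_weight` mixing parameter, and the names `Node`, `soft_memberships` and `predict` are all assumptions introduced here for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = Sequence[float]


def soft_memberships(x: Vector, centers: List[Vector], temperature: float = 1.0) -> List[float]:
    """Assumed stand-in for soft clustering: softmax over negative squared distances."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    nearest = min(dists)  # subtract the minimum for numerical stability
    raw = [math.exp(-(d - nearest) / temperature) for d in dists]
    total = sum(raw)
    return [r / total for r in raw]


@dataclass
class Node:
    score: Callable[[Vector], float]                      # local decision model: P(positive | x)
    centers: List[Vector] = field(default_factory=list)   # one cluster center per child
    children: List["Node"] = field(default_factory=list)
    own_weight: float = 0.5                               # hypothetical mixing weight


def predict(node: Node, x: Vector) -> float:
    """Combine a node's own score with the membership-weighted scores of its children."""
    local = node.score(x)
    if not node.children:
        return local
    weights = soft_memberships(x, node.centers)
    below = sum(w * predict(child, x) for w, child in zip(weights, node.children))
    return node.own_weight * local + (1.0 - node.own_weight) * below


# Toy hierarchy: two leaves with constant scores under a neutral root.
root = Node(
    score=lambda x: 0.5,
    centers=[[0.0], [1.0]],
    children=[Node(score=lambda x: 0.0), Node(score=lambda x: 1.0)],
)
print(predict(root, [0.0]), predict(root, [0.5]), predict(root, [1.0]))
```

Because every sample contributes to all children in proportion to its membership, a poor split near a cluster boundary shifts the final score gradually rather than sending the sample down a single wrong branch, which is the intuition behind reducing error migration.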
Authors
罗森林
孙志鹏
潘丽敏
LUO Sen-lin; SUN Zhi-peng; PAN Li-min (School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China)
Source
《北京理工大学学报》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2020, No. 3, pp. 305-309 (5 pages)
Transactions of Beijing Institute of Technology
Funding
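National Science and Technology Support Program of China, 12th Five-Year Plan (2012BA110B01)
Special Scientific Research Fund for the Health Industry, Ministry of Health of China (201302008)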
Keywords
classification decision tree
hierarchical discriminant method
soft clustering
hierarchical model
About the author
LUO Sen-lin (罗森林, born 1968), Ph.D., Professor. E-mail: luosenlin@bit.edu.cn