Abstract
Traditional clustering methods often give unsatisfactory results on high-dimensional sparse data objects. To address this, the SS/OSF clustering method is proposed. The method is based on set similarity (SS) and the object set feature vector (OSF), and is implemented by exploiting the additivity of object set feature vectors. Once the high-dimensional sparse data objects have been clustered with this method, a new object can be classified into an object set according to the supremum and infimum of each object set in the clustering result. Experiments show that, compared with the traditional K-means clustering method, SS/OSF clearly improves both running time and the accuracy of the clustering results as the number of data objects increases.
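The abstract names the ingredients of the method but does not give their definitions, so the following is only a minimal sketch of how an additive group summary could support both clustering and classification. Everything in it is an assumption made for illustration: the `GroupFeature` class, the choice of summary (object count, attributes shared by every object as the infimum, attributes present in any object as the supremum), the Jaccard-style `similarity`, and the single-pass `cluster` loop are hypothetical and are not the paper's SS or OSF formulas. Objects are represented as sets of the attribute indices whose value is 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupFeature:
    """Hypothetical summary of a group of sparse binary objects."""
    count: int           # number of objects in the group
    infimum: frozenset   # attributes equal to 1 in every object
    supremum: frozenset  # attributes equal to 1 in at least one object

    @staticmethod
    def of(obj):
        # Summary of a group that contains a single object.
        attrs = frozenset(obj)
        return GroupFeature(1, attrs, attrs)

    def merge(self, other):
        # Additivity: the summary of the combined group is computed from
        # the two summaries alone, without revisiting the objects.
        return GroupFeature(
            self.count + other.count,
            self.infimum & other.infimum,
            self.supremum | other.supremum,
        )

    def similarity(self):
        # Assumed Jaccard-style group similarity (not the paper's SS).
        if not self.supremum:
            return 1.0
        return len(self.infimum) / len(self.supremum)


def cluster(objects, threshold):
    """Single pass: put each object in the first group that stays similar."""
    groups = []
    for obj in objects:
        single = GroupFeature.of(obj)
        for i, group in enumerate(groups):
            merged = group.merge(single)
            if merged.similarity() >= threshold:
                groups[i] = merged
                break
        else:
            # No existing group accepted the object, so start a new one.
            groups.append(single)
    return groups


def classify(obj, groups):
    """Indices of groups whose infimum and supremum bracket the object."""
    attrs = frozenset(obj)
    return [i for i, g in enumerate(groups) if g.infimum <= attrs <= g.supremum]


if __name__ == "__main__":
    data = [{1, 2, 3}, {1, 2, 4}, {7, 8}, {7, 8, 9}]
    groups = cluster(data, threshold=0.5)
    print(len(groups), classify({1, 2, 3}, groups))
```

Because `merge` touches only the summaries, the cost of testing an object against a group in this sketch does not depend on how many objects the group already holds, which is the kind of saving that additivity makes possible.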
Source
《北京理工大学学报》
EI
CAS
CSCD
PKU Core Journal
2006, No. 3, pp. 216-220 (5 pages)
Transactions of Beijing Institute of Technology
Funding
Fok Ying Tung Education Foundation (91101)
Special Fund for Basic Work of the Ministry of Science and Technology (2002DEA20018)
Keywords
high-dimensional sparse binary data
set similarity
object set feature
clustering
classification
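About the authors
WU Ping (吴萍, b. 1972), female, in-service doctoral candidate, E-mail: wuping@bit.edu.cn;
SONG Han-tao (宋瀚涛, b. 1940), male, professor, doctoral supervisor.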