Abstract
As data are collected with greater accuracy and at higher frequency, functional data have become increasingly common, and curve clustering is a fundamental exploratory task in functional data analysis. Existing curve clustering algorithms are designed around within-cluster differences only, which leaves curves from different clusters poorly separated. To address this problem, the paper builds on curve fitting and a definition of curve distance to construct a new objective function, and proposes a curve clustering algorithm that considers both within-cluster and between-cluster differences. Simulation results show that the proposed algorithm improves clustering accuracy. A case study of hourly NO₂ concentrations (μg/m³) shows that the algorithm separates clusters more clearly.
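The idea of scoring a partition by both within-cluster and between-cluster differences can be illustrated with a minimal sketch. It is not the paper's objective function, which this record does not state. It assumes curves already evaluated on a common, equally spaced grid (the B-spline fitting step is skipped), uses a trapezoidal approximation of the L2 distance, and uses a hypothetical ratio criterion; the function names `l2_distance`, `mean_curve`, and `within_between_ratio` are illustrative only.

```python
from math import sqrt


def l2_distance(f, g, dx=1.0):
    """Trapezoidal approximation of the L2 distance between two curves
    sampled on the same equally spaced grid (spacing dx)."""
    sq = [(a - b) ** 2 for a, b in zip(f, g)]
    integral = dx * (sum(sq) - 0.5 * (sq[0] + sq[-1]))
    return sqrt(integral)


def mean_curve(curves):
    """Pointwise mean of a list of sampled curves."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def within_between_ratio(curves, labels, dx=1.0):
    """Hypothetical criterion (an assumption, not the paper's formula):
    within-cluster dispersion divided by between-cluster dispersion.
    Smaller values mean tighter and better separated clusters."""
    overall = mean_curve(curves)
    within = 0.0
    between = 0.0
    for k in sorted(set(labels)):
        members = [c for c, lab in zip(curves, labels) if lab == k]
        center = mean_curve(members)
        # Squared distances of member curves to their cluster mean curve.
        within += sum(l2_distance(c, center, dx) ** 2 for c in members)
        # Size-weighted squared distance of the cluster mean to the overall mean.
        between += len(members) * l2_distance(center, overall, dx) ** 2
    if between == 0.0:
        return float("inf")  # no separation at all, e.g. a single cluster
    return within / between


if __name__ == "__main__":
    # Four constant toy curves: two near 0 and two near 1.
    curves = [[0.0, 0.0, 0.0], [0.2, 0.2, 0.2], [1.0, 1.0, 1.0], [1.2, 1.2, 1.2]]
    good = [0, 0, 1, 1]  # groups similar curves together
    bad = [0, 1, 0, 1]   # mixes the two groups
    print(within_between_ratio(curves, good) < within_between_ratio(curves, bad))
```

A criterion based only on the within-cluster term ignores how far apart the cluster mean curves are; dividing by the between-cluster term is one simple way to reward separation as well.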
Authors
XU Tengteng (许腾腾)
WANG Rui (王瑞)
HUANG Hengjun (黄恒君)
School of Statistics, Lanzhou University of Finance and Economics, Lanzhou 730020, China
Source
CAAI Transactions on Intelligent Systems (《智能系统学报》)
CSCD
Peking University Core Journals (北大核心)
2019, No. 2, pp. 362-368 (7 pages)
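Funding
National Social Science Fund of China, Youth Project (14CTJ009, 15CTJ004)
National Statistical Science Research Key Project (2017LZ43)
Longyuan Youth Innovative Talent Support Program (14GSD95)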
Keywords
functional data
differences among clusters
curve clustering
B-spline
distance metric
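About the authors
Corresponding author: HUANG Hengjun, male, born in 1981, professor, Ph.D. His main research interests are the integration of multi-source heterogeneous data and functional data analysis. He has led one National Social Science Fund of China project, received four provincial- or ministerial-level research awards, and published more than 30 academic papers. E-mail: noahwong@163.com. XU Tengteng, male, born in 1992, master's student; his main research interests are the integration of multi-source heterogeneous data and functional data analysis. WANG Rui, female, born in 1993, master's student; her main research interest is economic statistics.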