Symbolic Aggregate Approximation Method of Time Series Based on Beginning and End Distance (cited by: 9)
Abstract: Feature representation is a key technique in time series data mining, and symbolic aggregate approximation (SAX) is one of the more commonly used representation methods. SAX cannot distinguish between time series whose segments are all mapped to the same symbols. To address this defect, this paper proposes a symbolic aggregate approximation method based on beginning and end distance (SAX_SM). Because time series exhibit strong morphological trends, the method uses the beginning point and the end point of each sequence segment to represent that segment's morphological feature, and approximates the time series by each segment's morphological feature together with its representation symbol, mapping the data from a high-dimensional space to a low-dimensional one. A beginning and end distance is then constructed from the beginning and end points to compute the morphological distance between two sequence segments. Finally, the beginning and end distance is combined with the symbol distance to define a new distance measure that assesses the similarity between time series more objectively. Theoretical analysis shows that the new distance measure satisfies the lower bound theorem. In experiments on 20 UCR time series data sets, SAX_SM achieves the highest classification accuracy on 13 data sets (ties included), whereas SAX does so on only 6 (ties included); SAX_SM therefore gives better classification results than SAX.
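The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the begin-end distance (squared differences of the segment start and end values) and the way it is added to the classic SAX symbol distance are assumptions made for illustration only, and no lower-bounding property is claimed for this combination. The function names `sax_sm` and `sax_sm_distance` are hypothetical.

```python
import bisect
import math
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize a series (zero mean, unit variance)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def breakpoints(alphabet_size):
    """Standard SAX breakpoints: equiprobable cuts under N(0, 1)."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax_sm(series, w, alphabet_size):
    """Represent each of w equal segments as (symbol, start, end).

    Assumes len(series) is divisible by w. The symbol is the SAX symbol
    of the segment mean; start and end are the first and last
    normalized values of the segment.
    """
    z = znorm(series)
    step = len(z) // w
    cuts = breakpoints(alphabet_size)
    rep = []
    for k in range(w):
        seg = z[k * step:(k + 1) * step]
        symbol = bisect.bisect_left(cuts, mean(seg))  # 0 .. alphabet_size-1
        rep.append((symbol, seg[0], seg[-1]))
    return rep


def symbol_dist(i, j, cuts):
    """Classic SAX symbol distance (0 for equal or adjacent symbols)."""
    if abs(i - j) <= 1:
        return 0.0
    return cuts[max(i, j) - 1] - cuts[min(i, j)]


def sax_sm_distance(rep_a, rep_b, n, alphabet_size):
    """ASSUMED combination of symbol distance and begin-end distance."""
    cuts = breakpoints(alphabet_size)
    w = len(rep_a)
    total = 0.0
    for (sa, ba, ea), (sb, bb, eb) in zip(rep_a, rep_b):
        sym = symbol_dist(sa, sb, cuts)
        shape = (ba - bb) ** 2 + (ea - eb) ** 2  # assumed begin-end term
        total += (n / w) * sym ** 2 + shape
    return math.sqrt(total)


# Two series with identical SAX words but opposite shape within segments.
rising = [1, 2, 3, 4, 5, 6, 7, 8]
zigzag = [4, 3, 2, 1, 8, 7, 6, 5]
rep_r = sax_sm(rising, 2, 4)
rep_z = sax_sm(zigzag, 2, 4)
print([s for s, _, _ in rep_r], [s for s, _, _ in rep_z])
print(sax_sm_distance(rep_r, rep_z, len(rising), 4))
```

In this example both series map to the same symbols, so a purely symbol-based distance cannot separate them, while the start and end values of each segment still differ and yield a positive distance.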
Authors: JI Hai-juan (季海娟); ZHOU Cong-hua (周从华); LIU Zhi-feng (刘志锋) (School of Computer Science and Telecommunication Engineering, Jiangsu University)
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2018, No. 6, pp. 216-221 (6 pages)
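Funding: Supported by the Jiangsu Province Key R&D Program (Social Development) projects (BE2016630, BE2015617), the Jiangsu Province Six Talent Peaks Project (2014-WLW-012), and a Key Project of the Wuxi Municipal Health and Family Planning Commission (Z201603).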
Keywords: time series data; sequence segment; beginning and end distance; symbol distance
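About the authors: JI Hai-juan (born 1993), female, master's student; main research interest: big data technology; E-mail: 18260622771@163.com. ZHOU Cong-hua (born 1978), male, Ph.D., professor; main research interests: big data technology and artificial intelligence; E-mail: zchwyl2003@163.com (corresponding author). LIU Zhi-feng (born 1981), male, Ph.D., associate professor; main research interest: big data technology.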