Heterogeneous Media Online Web Event Detection for Imbalanced Datasets

Cited by: 3
Abstract: As the Internet has developed, web data has come to be characterized by a large share of heterogeneous media, text that takes the form of tags, and imbalanced data, which is making traditional online web event detection methods based on long text increasingly ineffective. This paper uses an improved Single Pass method for online heterogeneous-media web event detection. First, the imbalance in web data is analyzed and the similarity calculation formula is redesigned accordingly. Second, a sliding time window is designed to improve the efficiency of the Single Pass algorithm. Finally, experiments are carried out on the Flickr SED2014 dataset. The experimental results show that the proposed algorithm is effective and practical.
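The abstract names two ingredients, single-pass clustering and a sliding time window, without giving their details. The following is a minimal Python sketch of how the two can fit together, not the authors' implementation. The `Item` and `Cluster` types, the `threshold` and `window` parameters, and the plain Jaccard similarity over tags are all assumptions made for illustration; in particular, the paper's redesigned similarity formula for imbalanced data is not given in this record and is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One media item: a timestamp (seconds) and its set of text tags (hypothetical type)."""
    timestamp: float
    tags: frozenset


@dataclass
class Cluster:
    """One candidate event: merged tags plus the time of its latest item (hypothetical type)."""
    tags: set = field(default_factory=set)
    last_seen: float = 0.0
    members: list = field(default_factory=list)


def jaccard(a, b):
    """Placeholder similarity on tag sets; NOT the paper's redesigned formula."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass(items, threshold=0.3, window=3600.0):
    """Assign each item, in time order, to the most similar recent cluster."""
    clusters = []  # every cluster found so far
    active = []    # clusters still inside the sliding time window
    for item in sorted(items, key=lambda it: it.timestamp):
        # Slide the window: stop comparing against clusters that went quiet.
        active = [c for c in active if item.timestamp - c.last_seen <= window]
        best, best_sim = None, 0.0
        for c in active:
            sim = jaccard(item.tags, c.tags)
            if sim > best_sim:
                best, best_sim = c, sim
        # No sufficiently similar active cluster: the item starts a new event.
        if best is None or best_sim < threshold:
            best = Cluster()
            clusters.append(best)
            active.append(best)
        best.tags |= item.tags
        best.last_seen = item.timestamp
        best.members.append(item)
    return clusters
```

In this sketch the window limits each item's comparisons to clusters that received an item within the last `window` seconds, so the per-item cost depends on the number of recently active events and not on the whole history. That is one plausible reading of how a time window improves Single Pass efficiency.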
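Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2016, No. 16, pp. 227-232 (6 pages)
Funding: Supported by the Key Program of the National Natural Science Foundation of China (613300194), the Henan Province Science and Technology Plan Project (142300410044), the Key Scientific and Technological Research Projects of the Henan Provincial Department of Education (14A520057, 15B520022), the Henan Province Basic and Frontier Technology Research Project (142300410396), and a Nanyang Normal University university-level project (QN2015025).
Keywords: online; web event detection; single pass clustering; heterogeneous media
About the author: Zhao Xuewu (born 1983), male, lecturer, doctoral candidate. Research interests: machine learning, web applications.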