
Video Concept Detection Based on Multiple Features and Classifiers Fusion (Cited by: 1)

Abstract: The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the semantic gap: low-level features extracted by computers often fail to coincide with the high-level concepts interpreted by humans. In this paper, we present a generic scheme for the detection of video semantic concepts based on multiple visual features and machine learning. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. We then combine the different features and sub-systems through both classifier-level and kernel-level fusion, which contributes to a more robust system. The proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which demonstrates that our concept detection engine develops a generic model and performs well on both object-type and scene-type concepts.
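The abstract describes two complementary fusion strategies: kernel-level fusion (combining per-feature kernels before training a single classifier) and classifier-level fusion (averaging the scores of classifiers trained on individual features). The sketch below illustrates both on toy data with scikit-learn; the chi-square and RBF kernels, the equal fusion weights, and the random feature matrices are illustrative assumptions, not the authors' actual TRECVID pipeline. MAP would then be the mean of such per-concept Average Precision scores over all evaluated concepts.

```python
# Minimal sketch of kernel-level and classifier-level (late) fusion for one
# concept, assuming two pre-extracted visual feature matrices per video shot.
# Kernel choices, fusion weights and data are illustrative, not the paper's setup.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import average_precision_score
from sklearn.metrics.pairwise import chi2_kernel


def kernel_level_fusion(train_feats, test_feats, y_train, weights=None):
    """Average per-feature kernels into one kernel, then train a single SVM."""
    weights = weights or [1.0 / len(train_feats)] * len(train_feats)
    K_train = sum(w * chi2_kernel(X, X) for w, X in zip(weights, train_feats))
    K_test = sum(w * chi2_kernel(Xt, X)
                 for w, X, Xt in zip(weights, train_feats, test_feats))
    clf = SVC(kernel="precomputed").fit(K_train, y_train)
    return clf.decision_function(K_test)       # one fused score per test shot


def classifier_level_fusion(train_feats, test_feats, y_train):
    """Train one SVM per feature type, then average the per-classifier scores."""
    scores = [SVC(kernel="rbf", gamma="scale").fit(X_tr, y_train).decision_function(X_te)
              for X_tr, X_te in zip(train_feats, test_feats)]
    return np.mean(scores, axis=0)              # late fusion by score averaging


# Toy usage: two hypothetical feature types (e.g. a global colour histogram and
# a local bag-of-visual-words histogram); chi2_kernel needs non-negative input.
rng = np.random.default_rng(0)
train_feats = [np.abs(rng.normal(size=(40, 32))), np.abs(rng.normal(size=(40, 64)))]
test_feats = [np.abs(rng.normal(size=(20, 32))), np.abs(rng.normal(size=(20, 64)))]
y_train = np.tile([0, 1], 20)   # 40 training labels
y_test = np.tile([0, 1], 10)    # 20 test labels

for name, scores in [("kernel-level fusion", kernel_level_fusion(train_feats, test_feats, y_train)),
                     ("classifier-level fusion", classifier_level_fusion(train_feats, test_feats, y_train))]:
    print(f"{name}: AP = {average_precision_score(y_test, scores):.3f}")
```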
Source: China Communications (SCIE, CSCD), 2012, No. 8, pp. 105-121 (17 pages).
Funding: This work was supported by the Collaborative Research Project SEV under Grant No. 01100474 between Beijing University of Posts and Telecommunications and France Telecom R&D Beijing, the National Natural Science Foundation of China under Grant No. 90920001, and the Graduate Innovation Fund of SICE, BUPT, 2011.
Keywords: concept detection, visual feature extraction, kernel-based learning, classifier fusion, detection system, video files, multimedia content, semantic gap, video analysis, semantic concepts
Author biographies:
Dong Yuan, associate professor at Beijing University of Posts and Telecommunications, China, and a senior research consultant in multimedia indexing research at France Telecom Research & Development Beijing. He received his Ph.D. degree from Shanghai Jiao Tong University and worked as a postdoctoral research staff member at the Engineering Department, Cambridge University, UK. He is now working on the European speech recognition project CORtEX. His current research interests include semantic video indexing, video copy detection, and multimedia content search. Email: yuandong@bupt.edu.cn
Zhang Jiwei, postgraduate student at the Pattern Recognition Lab, Beijing University of Posts and Telecommunications, China. His current research interests include visual concept detection and sports categorization. Email: buptjiwei@gmail.com
Zhao Nan, postgraduate student at the Pattern Recognition Lab, Beijing University of Posts and Telecommunications, China. Her current research interests include visual concept detection and sports categorization. Email: zhao.nan07@gmail.com
Chang Xiaofu, researcher in Multimedia Analysis and Retrieval, France Telecom Research & Development Beijing, China. His research interests include image/video search, object recognition, and data mining. Email: xiaofu.chang@orange.com
Liu Wei, researcher in Multimedia Analysis and Retrieval, France Telecom Research & Development Beijing, China. His current research interests include video and image copy detection, and face detection. Email: wei.liu@orange.com

References (49)

  • 1 HOU Xiaodi, ZHANG Liqing. Saliency Detection: A Spectral Residual Approach[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition: June 17-22, 2007, Minneapolis, MN. IEEE Press, 2007: 1-8.
  • 2 SMEULDERS A, WORRING M, SANTINI S, et al. Content-Based Image Retrieval at the End of the Early Years[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(12): 1349-1380.
  • 3 LIENHART R, KUHMUNCH C, EFFELSBERG W. On the Detection and Recognition of Television Commercials[C]// Proceedings of IEEE Conference on Multimedia Computing and Systems: June 3-6, 1997, Ottawa, Ont. IEEE Press, 1997: 509-516.
  • 4 ZHANG Hongjiang, TAN S, SMOLIAR S, et al. Automatic Parsing and Indexing of News Video[J]. Multimedia Systems, 1995, 2(6): 256-266.
  • 5 RUI Yong, GUPTA A, ACERO A. Automatically Extracting Highlights for TV Baseball Programs[C]// Proceedings of the 8th ACM International Conference on Multimedia: October 30 - November 3, 2000, Los Angeles, CA, USA. ACM Press, 2000: 105-115.
  • 6 VIOLA P, JONES M. Rapid Object Detection Using a Boosted Cascade of Simple Features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001, 1: I-511-I-518.
  • 7 ITTI L, KOCH C, NIEBUR E. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254-1259.
  • 8 HANJALIC A, XU Liqun. Affective Video Content Representation and Modelling[J]. IEEE Transactions on Multimedia, 2005, 7(1): 143-154.
  • 9 NAPHADE M, HUANG T. A Probabilistic Framework for Semantic Video Indexing, Filtering, and Retrieval[J]. IEEE Transactions on Multimedia, 2001, 3(1): 141-151.
  • 10 SNOEK C, WORRING M, GEUSEBROEK J, et al. The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(10): 1678-1689.

Co-cited references: 5

Citing articles: 1

Second-level citing articles: 2
