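Abstract
Although scene recognition has made considerable progress in computer vision, it remains a highly challenging problem. Some earlier scene recognition methods require the training images to be semantically annotated by hand beforehand, and most are based on the "bag of features" model, which requires clustering a large number of extracted features. This clustering is costly in both computation and memory, and the choice of initial cluster centers and of the number of clusters strongly affects recognition performance. This paper therefore proposes an unsupervised scene recognition method that is not based on the "bag of features" model. Images at several resolutions are first constructed by subsampling, and structural and textural features are extracted at each resolution level. The structural features are represented by the gradient orientation histogram descriptor proposed in this paper, and the textural features by the responses of the image to a Gabor filter bank and a Schmid filter set. The structural and textural features are treated as two mutually independent feature channels, which are finally combined and classified with an SVM to recognize scenes automatically. Experiments on the 8-category, 13-category, and 15-category scene image datasets of Oliva, Li Fei-Fei, and Lazebnik et al. show that the gradient orientation histogram descriptor achieves better scene recognition performance than the classical SIFT descriptor, and that the method combining structural and textural features achieves very good recognition results on the three commonly used scene image datasets.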
Automatic recognition of the contents of a scene is an important issue in the field of computer vision. Although considerable progress has been made, the complexity of scenes remains an important challenge to computer vision research. Most previous approaches for scene recognition are based on the so-called "bag of visual words" model, which uses clustering methods to quantize numerous local region descriptors into a codebook. The size of the codebook and the selection of initial clustering centers greatly affect the performance. Furthermore, the large size of the codebook leads to high computational costs and large memory consumption. To overcome these weaknesses, we present an unsupervised natural scene recognition approach that is not based on the "bag of visual words" model. This approach constructs multiple images of different resolutions and extracts structural and textural features from these images. The structural features are represented by weighted histograms of the gradient orientation descriptor, which is presented in this paper, and the textural features are represented by filter responses of Gabor filters and a Schmid set. We regard the structural and textural features as two independent feature channels, and combine them to realize automatic categorization of scenes using a support vector machine. We then evaluated our approach using three commonly used datasets with various scene categories. Our experiments demonstrate that the weighted histograms of the gradient orientation descriptor outperform the classical scale invariant feature transform descriptor in natural-scene recognition, and our approach achieves good performance with respect to current state-of-the-art methods.
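The structural channel described above (subsampled images at several resolutions, each summarized by a histogram of gradient orientations) can be illustrated with a minimal Python sketch. This is not the authors' descriptor: the abstract does not give its weighting scheme, bin count, or number of resolution levels, so the gradient-magnitude weighting, the 2x2 block averaging used for subsampling, the default values of `levels` and `bins`, and all function names below are assumptions made only for illustration.

```python
import numpy as np


def subsample_pyramid(image, levels=3):
    """Return `levels` images, each half the resolution of the previous one.

    Assumption: subsampling is done by averaging 2x2 blocks.
    """
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        # Crop to even dimensions so the image splits cleanly into 2x2 blocks.
        h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
        blocks = prev[:h, :w].reshape(h // 2, 2, w // 2, 2)
        pyramid.append(blocks.mean(axis=(1, 3)))
    return pyramid


def orientation_histogram(image, bins=8):
    """Histogram of unsigned gradient orientations, weighted by magnitude.

    Assumption: magnitude weighting and L1 normalization; the paper's
    own weighting is not stated in the abstract.
    """
    gy, gx = np.gradient(image)          # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    # Fold orientations into [0, pi) so opposite directions share a bin.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    hist, _ = np.histogram(orientation, bins=bins,
                           range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist


def structural_feature(image, levels=3, bins=8):
    """Concatenate the orientation histograms of every pyramid level."""
    return np.concatenate([orientation_histogram(level, bins)
                           for level in subsample_pyramid(image, levels)])
```

For a grayscale image stored as a 2-D array, `structural_feature(image)` returns a vector of length `levels * bins`. In the pipeline the abstract describes, such a vector would be combined with the textural channel (Gabor and Schmid filter responses) before classification with a support vector machine; those steps are not shown here.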
Source
Scientia Sinica (Informationis) [中国科学:信息科学]
CSCD
2012, No. 6, pp. 687-702 (16 pages)
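Funding
National Natural Science Foundation of China (Grant Nos. 60835005, 60736018)
National Basic Research Program of China (Grant No. 2007CB311001)
Science and Technology Innovation Team Program for Universities of Hunan Province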
Keywords
scene recognition, structural feature, textural feature, feature combination, weighted histograms of gradient orientation descriptor
About the authors
Corresponding author. E-mail: li-zhou@yahoo.cn. ZHOU Li was born in 1982. He received the B.S. and M.S. degrees from Dalian Navy Academy in 2004 and 2006, respectively. He is currently working toward the doctoral degree at the National University of Defense Technology. His research interests include computer/biological vision, visual navigation, and machine learning.
E-mail: dwhu@nudt.edu.cn. HU DeWen was born in 1963. He received the B.S. and M.S. degrees from Xi'an Jiaotong University in 1983 and 1986, respectively. Since 1986, he has been with the National University of Defense Technology. From October 1995 to October 1996, he was a Visiting Scholar with the University of Sheffield, UK. He received his Ph.D. degree from the National University of Defense Technology in 1999. He was promoted to Professor in 1996. His research interests include image processing, system identification and control, neural networks, and cognitive science. He is an action editor of Neural Networks.
E-mail: narcz@163.com. ZHOU ZongTan was born in 1969. He received the B.S., M.S. and Ph.D. degrees from the National University of Defense Technology in 1990, 1994 and 1998, respectively. From February 2010 to February 2011, he was a Visiting Scholar with the Eberhard Karls Universität Tübingen. He was promoted to Professor in 2007. His research interests include image/signal processing, computer/biological vision, neural networks, cognitive neuroscience and brain-computer interface.