Environment Sound Classification System Based on Hybrid Feature and Convolutional Neural Network

Cited by: 21
Abstract: Environment sound recognition systems currently classify environment sounds mainly with deep neural networks and a wide variety of auditory features. It is therefore necessary to analyze which auditory features are better suited to environment sound classification and recognition (ESCR) systems based on deep neural networks. In this paper, three sound features are selected, extracted with two widely used filter banks: the Mel and Gammatone filter banks. A hybrid feature, MGCC, which fuses MFCC and GFCC, is then presented. Finally, a deep convolutional neural network is proposed to verify which features are more suitable for environment sound classification and recognition. The experimental results show that, in neural-network-based environment sound classification systems, signal processing features outperform spectrogram features, and that the MGCC feature performs better than all the other features tested. The proposed MGCC-CNN model is also compared with state-of-the-art environment sound classification models on the UrbanSound 8K dataset, and the results show that the proposed model achieves the best classification accuracy.
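The abstract does not say how MFCC and GFCC are combined into MGCC. As a rough illustration only, the sketch below assumes the simplest possible fusion: both features are computed as the DCT of log filter-bank energies, standardized per coefficient, and concatenated along the coefficient axis. The function names (`dct_ii`, `cepstral_coeffs`, `fuse_features`), the band counts, the 13-coefficient choice, and the random stand-in energies are all hypothetical and are not taken from the paper.

```python
import numpy as np


def dct_ii(x: np.ndarray, n_coeffs: int) -> np.ndarray:
    """Unnormalized DCT-II along the last axis, keeping the first n_coeffs terms."""
    n_bands = x.shape[-1]
    n = np.arange(n_bands)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bands))  # (n_coeffs, n_bands)
    return x @ basis.T


def cepstral_coeffs(band_energies: np.ndarray, n_coeffs: int = 13) -> np.ndarray:
    """Cepstral coefficients from filter-bank energies of shape (frames, bands).

    The same log-then-DCT step is applied here to Mel and Gammatone energies;
    only the filter bank that produced the energies differs.
    """
    log_energies = np.log(band_energies + 1e-10)  # small offset avoids log(0)
    return dct_ii(log_energies, n_coeffs)


def fuse_features(mfcc: np.ndarray, gfcc: np.ndarray) -> np.ndarray:
    """Assumed fusion rule: standardize each feature, then concatenate per frame."""
    if mfcc.shape[0] != gfcc.shape[0]:
        raise ValueError("both features must cover the same number of frames")

    def standardize(f: np.ndarray) -> np.ndarray:
        return (f - f.mean(axis=0)) / (f.std(axis=0) + 1e-8)

    return np.concatenate([standardize(mfcc), standardize(gfcc)], axis=1)


# Random positive numbers stand in for real filter-bank energies (hypothetical data).
rng = np.random.default_rng(0)
mel_energies = rng.random((100, 40)) + 0.1        # 100 frames, 40 Mel bands
gammatone_energies = rng.random((100, 64)) + 0.1  # 100 frames, 64 Gammatone bands

mfcc = cepstral_coeffs(mel_energies)
gfcc = cepstral_coeffs(gammatone_energies)
mgcc = fuse_features(mfcc, gfcc)
print(mgcc.shape)  # (100, 26)
```

The fused matrix has one row per frame and can be treated as a single-channel 2-D input to a convolutional network. Standardizing before concatenation keeps either feature from dominating purely because of its scale.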
Authors: ZHANG Ke (张科); SU Yu (苏雨); WANG Jingyu (王靖宇); WANG Sanyu (王霰宇); ZHANG Yanhua (张彦华). Affiliations: National Key Laboratory of Aerospace Flight Dynamics, Xi'an 710072, China; School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China; Signals, Images, and Intelligent Systems Laboratory (LISSI/EA 3956), University Paris-Est Creteil, Senart-FB Institute of Technology, 36-37 rue Charpak, 77127 Lieusaint, France
Source: Journal of Northwestern Polytechnical University (西北工业大学学报), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2020, No. 1, pp. 162-169 (8 pages)
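Funding: Supported by the Major Program of the National Natural Science Foundation of China (51890884) and the National Natural Science Foundation of China (61976179, 61502391)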
Keywords: environment sound; hybrid feature; sound classification; convolutional neural network; filter
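About the author: ZHANG Ke (born 1965) is a professor at Northwestern Polytechnical University whose research focuses on navigation, guidance, and control.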