Abstract
Environmental sound recognition systems mainly rely on deep neural networks and a wide variety of auditory features to classify environmental sounds. It is therefore necessary to analyze which auditory features are better suited to environmental sound classification and recognition (ESCR) systems based on deep neural networks. This paper selects three sound features extracted with two widely used filter banks, the Mel and Gammatone filter banks. A hybrid feature, MGCC, which fuses MFCC and GFCC, is then proposed. A deep convolutional neural network, also proposed in the paper, is used to verify which features are better suited to environmental sound classification and recognition. The experimental results show that, in neural-network-based environmental sound classification systems, signal processing features outperform spectrogram features, and that the MGCC feature performs better than the other features tested. Finally, the proposed MGCC-CNN model is compared with state-of-the-art environmental sound classification models on the UrbanSound 8K dataset. The results show that the proposed model achieves the best classification accuracy among the models compared.
Authors
ZHANG Ke (张科)
SU Yu (苏雨)
WANG Jingyu (王靖宇)
WANG Sanyu (王霰宇)
ZHANG Yanhua (张彦华)
Affiliations: National Key Laboratory of Aerospace Flight Dynamics, Xi'an 710072, China; School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China; Signals, Images, and Intelligent Systems Laboratory (LISSI/EA 3956), University Paris-Est Creteil, Senart-FB Institute of Technology, 36-37 rue Charpak, 77127 Lieusaint, France
Source
Journal of Northwestern Polytechnical University (《西北工业大学学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2020, Issue 1, pp. 162-169 (8 pages)
Funding
Major Program of the National Natural Science Foundation of China (51890884)
National Natural Science Foundation of China (61976179, 61502391)
Keywords
environment sound
hybrid feature (feature fusion)
sound classification
convolutional neural network
filter
About the author
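ZHANG Ke (born 1965) is a professor at Northwestern Polytechnical University. His research focuses on navigation, guidance, and control.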