Analysis of noise robustness of auditory features in speech recognition (cited by: 8)
Abstract: The performance of automatic speech recognition systems usually degrades significantly in noisy environments, which is a major obstacle to the wider deployment of speech recognition technology. Building on earlier work by other researchers on gammatone-based auditory features (GFCC), this paper presents a further comparative analysis of GFCC and Mel-frequency cepstral coefficients (MFCC) under different noise conditions. Five artificial and natural noise types were used in the comparison: white noise, pink noise, brown noise, background-speaker noise, and car noise. By mixing in noise of different types and intensities, the characteristics and noise resistance of the GFCC were studied systematically. In particular, sine-wave noise in different frequency bands was mixed with clean speech to analyze the noise robustness of GFCC and MFCC in each band. Compared with the conventional MFCC, the GFCC is more robust against low-frequency noise but relatively more sensitive to noise at middle and high frequencies. Because most of the information in human speech resides in the low-frequency band of 300-700 Hz, this property gives the GFCC good noise resistance in speech recognition tasks. Experimental results show that the GFCC achieves better recognition performance than the MFCC in a variety of common noise environments, with a larger advantage when the SNR is low.
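The abstract describes mixing clean speech with sine-wave noise at different intensities. A minimal sketch of that kind of mixing is given below. It is not the authors' experimental code: the function names `mix_at_snr`, `sine_noise`, and `measured_snr_db`, the 16 kHz sample rate, the 500 Hz tone, the 5 dB target, and the random stand-in for a clean utterance are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power is p_clean / 10^(snr_db / 10); solve for the gain.
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def sine_noise(freq_hz, n_samples, sample_rate=16000):
    """Pure tone used as narrow-band noise (sample rate is an assumption)."""
    t = np.arange(n_samples) / sample_rate
    return np.sin(2.0 * np.pi * freq_hz * t)


def measured_snr_db(clean, noisy):
    """SNR of `noisy` relative to the known clean signal, in dB."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Random stand-in for one second of clean speech (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)

# Mix in a 500 Hz tone at a 5 dB SNR; both values are illustrative only.
noisy = mix_at_snr(clean, sine_noise(500.0, len(clean)), snr_db=5.0)
```

Sweeping `freq_hz` across the spectrum and `snr_db` across a range of levels would give the kind of per-band test set the abstract refers to. Feature extraction and recognition are outside this sketch.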
Source: Journal of Tsinghua University (Science and Technology); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2013, No. 8, pp. 1082-1086 (5 pages)
Keywords: speech recognition; gammatone filters; gammatone-based auditory feature (GFCC); robustness
About the author: Li Yinguo (李银国, born 1955), male, Han ethnicity, from Hubei, professor. E-mail: liyg@cqupt.edu.cn
