Abstract
This paper studies speech emotion recognition based on mixed features. To overcome the high-frequency signal leakage of the filter bank used for Mel-frequency cepstral coefficients (MFCC), it proposes an emotion recognition method that combines Gammatone filter cepstral coefficient (GFCC) features with prosodic and voice quality features. The spectral peak of a Gammatone filter is flatter than that of the triangular filter used for MFCC, which addresses the energy leakage problem of the triangular filter, so GFCC shows better noise robustness in complex environments. The method fuses GFCC with formants, pitch frequency, short-time energy, and the differential pitch of voiced frames. Using 200 utterances from the EMO-DB emotional speech database and 1120 utterances from a self-built corpus, it recognizes speech emotion with a K-nearest neighbor (KNN) classifier. A comparison of the traditional and improved mixed features shows that the new mixed feature parameters achieve a higher recognition rate in noisy environments.
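The fusion-then-KNN pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature values are made-up placeholders rather than real GFCC or prosodic measurements, the z-score normalization step and the choice of k = 3 are assumptions, and the helper names (`fuse`, `zscore_fit`, `zscore_apply`, `knn_predict`) are hypothetical.

```python
import math
from collections import Counter


def fuse(gfcc, prosodic):
    """Feature-level fusion: concatenate per-utterance feature vectors."""
    return list(gfcc) + list(prosodic)


def zscore_fit(rows):
    """Compute per-dimension mean and standard deviation (assumed step)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def zscore_apply(row, means, stds):
    """Scale one fused vector so no single feature dominates the distance."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to the query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Placeholder data: ([GFCC-like values], [pitch in Hz, short-time energy]).
train = [
    (([1.0, 0.9], [210.0, 0.80]), "angry"),
    (([1.1, 1.0], [220.0, 0.85]), "angry"),
    (([0.9, 1.1], [205.0, 0.78]), "angry"),
    (([0.2, 0.1], [120.0, 0.30]), "sad"),
    (([0.1, 0.2], [115.0, 0.25]), "sad"),
    (([0.3, 0.2], [125.0, 0.35]), "sad"),
]

fused = [fuse(g, p) for (g, p), _ in train]
labels = [label for _, label in train]
means, stds = zscore_fit(fused)
train_x = [zscore_apply(row, means, stds) for row in fused]

query = zscore_apply(fuse([1.0, 1.0], [215.0, 0.82]), means, stds)
predicted = knn_predict(train_x, labels, query, k=3)
print(predicted)
```

In this toy data the query lies in the high-pitch, high-energy cluster, so its three nearest neighbors all carry the "angry" label. Normalizing before the distance computation matters here because pitch in hertz would otherwise swamp the much smaller cepstral and energy values.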
Authors
YU Lin (余琳)
JIANG Nan (姜囡)
Criminal Investigation Police University of China, Shenyang 110854, China
Source
Electro-Optic Technology Application (《光电技术应用》)
2020, No. 3, pp. 50-54, 58 (6 pages)
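Funding
National Key R&D Program of the Ministry of Science and Technology (2017YFC0821005)
Fundamental Research Funds for the Central Universities (3242019010, 3242019011, 3242019012)
Natural Science Foundation of Liaoning Province (2019-ZD-0168)
Teaching and Research Project of Criminal Investigation Police University of China (2018QNZX19)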
Keywords
Gammatone filter cepstral coefficient (GFCC)
emotion recognition
fusion features
K-nearest neighbor classification
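About the authors
YU Lin (b. 1998), female, from Changsha, Hunan; master's student; main research interest: audio-visual material examination techniques. E-mail: 651075397@qq.com. JIANG Nan (b. 1979), female, from Wucheng, Shandong; postdoctoral researcher, associate professor, and master's supervisor; main research interests: public security audio-visual technology and pattern recognition. E-mail: zgxj_jiangnan@126.com.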