Abstract
A speech emotion recognition method based on multi-frame segment features is proposed. Several successive frames are combined into a segmental unit, and the resulting feature vector is used as the input to a Gaussian mixture model (GMM), which effectively introduces dynamic inter-frame correlation information into the recognition process. Furthermore, to improve the performance of the output probability density function of the GMM when it is fed multi-frame segments, a principal component analysis neural network (PCANN) is placed in front of the GMM to compress the frame-segment parameters. Speech emotion recognition experiments confirm the effectiveness of introducing dynamic inter-frame correlation information: the recognition rate of the proposed method improves to some extent over the conventional state-output-independent GMM method.
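The front end described in the abstract (stacking successive frames into a segment vector, then compressing it before the GMM) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: ordinary PCA computed by SVD stands in for the PCANN, the GMM stage is omitted, and the function names, the segment length of 3, and the 12-dimensional random features are all hypothetical.

```python
import numpy as np


def stack_frames(frames: np.ndarray, seg_len: int) -> np.ndarray:
    """Concatenate each run of `seg_len` successive frames into one segment vector.

    frames: (num_frames, dim) array of per-frame features.
    Returns a (num_frames - seg_len + 1, seg_len * dim) array.
    """
    num_frames, dim = frames.shape
    if not 1 <= seg_len <= num_frames:
        raise ValueError("seg_len must be between 1 and num_frames")
    num_segments = num_frames - seg_len + 1
    return np.stack(
        [frames[t:t + seg_len].reshape(-1) for t in range(num_segments)]
    )


def pca_compress(segments: np.ndarray, out_dim: int) -> np.ndarray:
    """Project segment vectors onto their top `out_dim` principal components.

    Assumption: plain PCA via SVD is used here in place of the PCANN.
    """
    centered = segments - segments.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:out_dim].T


# Hypothetical data: 50 frames of 12-dimensional features.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 12))

segments = stack_frames(frames, seg_len=3)       # 3 frames per segment
compressed = pca_compress(segments, out_dim=12)  # back to 12 dimensions
print(segments.shape, compressed.shape)
```

In a full system, the compressed segment vectors would then be used to train one GMM per emotion class, with a test utterance assigned to the class whose model gives the highest likelihood.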
Authors
张霞
杨勇
赵力
ZHANG Xia; YANG Yong; ZHAO Li (School of Mechanical, Electrical and Information Engineering, Putian University, Putian, Fujian 351100, China; School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China)
Source
《电子器件》
CAS
PKU Core Journal (北大核心)
2022, No. 2, pp. 479-482 (4 pages)
Chinese Journal of Electron Devices
Funding
Education and Research Project for Young and Middle-aged Teachers of Fujian Province (JAT200535)
Keywords
speech emotion recognition
Gaussian mixture model
principal component analysis neural network
multi-frame segment feature
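About the authors
ZHANG Xia (b. 1983), female, M.Eng., lecturer at Putian University; research interests include signal and information processing and artificial intelligence; concise.zhang@gmail.com. YANG Yong (b. 1981), male, from She County, Hebei; Ph.D. in engineering; currently a postdoctoral researcher at the School of Information Science and Engineering, Southeast University, and an associate professor; research interest: signal and information processing; YongYang@cumt.edu.cn. ZHAO Li (b. 1958), male, professor and doctoral supervisor at the School of Information Science and Engineering, Southeast University; research interests include signal and information processing.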