Abstract
Through collecting and analyzing speech signals and electrocardiogram (ECG) signals, the corresponding emotion features and fusion algorithms are studied. First, annoyance is induced by noise stimulation and happiness is induced by comedy movie clips, and the corresponding speech and ECG signals are recorded. Then, prosodic and voice quality features are adopted as speech emotion features, and heart rate variability (HRV) features are used as ECG emotion features. Finally, decision-level fusion and feature-level fusion are accomplished by the weighted fusion method and the feature-space transformation method, respectively, and the performances of the two fusion methods in speech and ECG emotion recognition are compared. The experimental results show that on the same testing set, the average recognition rates of the single-modal classifiers based on ECG signals and on speech signals reach 71% and 80%, respectively, while the multimodal classifier with feature-level fusion of the speech and ECG signals achieves above 90%. The average recognition rate of the feature-level fusion algorithm is higher than that of the decision-level fusion algorithm. Therefore, emotion features from different signal channels, such as speech and ECG signals, can be combined to build a reliable emotion recognition system.
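The abstract contrasts two fusion strategies: weighted fusion of unimodal classifier outputs at the decision level, and combining feature vectors before classification at the feature level. The following is a minimal sketch of both ideas; the function names, weights, and toy scores are illustrative assumptions, not the paper's actual implementation (the paper applies a feature-space transformation at the feature level, which simple concatenation here only approximates).

```python
def decision_level_fusion(speech_probs, ecg_probs, w_speech=0.6, w_ecg=0.4):
    """Weighted fusion of per-class scores from two unimodal classifiers.

    The weights are hypothetical; in practice they would be tuned on
    validation data.
    """
    return [w_speech * s + w_ecg * e for s, e in zip(speech_probs, ecg_probs)]

def feature_level_fusion(speech_features, ecg_features):
    """Combine feature vectors before classification.

    Plain concatenation is shown here as the simplest case; the paper
    uses a feature-space transformation instead.
    """
    return list(speech_features) + list(ecg_features)

# Toy example with class order [annoyance, happiness]:
speech_probs = [0.3, 0.7]  # hypothetical speech classifier output
ecg_probs = [0.6, 0.4]     # hypothetical ECG classifier output
fused = decision_level_fusion(speech_probs, ecg_probs)
label = max(range(len(fused)), key=lambda i: fused[i])  # index of the winning class
```

With these toy numbers the fused scores are [0.42, 0.58], so the decision-level system predicts the second class even though the two unimodal classifiers disagree.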
Source
Journal of Southeast University (Natural Science Edition) (《东南大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2010, No. 5, pp. 895-900 (6 pages)
Funding
National Natural Science Foundation of China (60472058, 60975017)
Natural Science Foundation of Jiangsu Province (BK2008291)
Keywords
emotion recognition; multimodal; decision level fusion; feature level fusion
About the Authors
Huang Chengwei (b. 1984), male, PhD candidate.
Zhao Li (corresponding author), male, PhD, professor, doctoral supervisor, zhaoli@seu.edu.cn.