
Phoneme boundary detection based on phoneme posterior probabilities and hierarchical agglomerative clustering algorithm
Abstract: A phoneme boundary detection method based on phoneme posterior probabilities and hierarchical agglomerative clustering (HAC) is proposed. The method first extracts frame-level phoneme posterior probabilities from the speech signal using an improved Temporal Pattern (TRAP) structure. It then clusters these posterior probabilities with the HAC algorithm. Finally, a threshold is derived from the full set of minimum loss-function values, and this threshold determines both the number of clusters and the phoneme boundaries. Experiments show that the method achieves good detection performance and that phoneme posterior probabilities are better suited to phoneme boundary detection than Mel-frequency cepstral coefficients (MFCC).
Source: Journal of Terahertz Science and Electronic Information Technology, 2014, No. 2, pp. 260-265 (6 pages)
Funding: National Natural Science Foundation of China (61175017)
Keywords: phoneme boundary detection; phoneme posterior probability; hierarchical agglomerative clustering
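
The clustering and thresholding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss function, so a Ward-style squared-error increase is assumed here; merging only adjacent segments is also an assumption, made so that clusters stay contiguous in time; and `threshold` is passed in by hand rather than derived from the minimum loss values as the paper does. The names `merge_cost` and `detect_boundaries` are hypothetical.

```python
from typing import List, Sequence


def merge_cost(sum_a: Sequence[float], n_a: int,
               sum_b: Sequence[float], n_b: int) -> float:
    """Ward-style increase in squared error if two segments are merged.

    Assumed loss; the abstract does not specify the one used in the paper.
    """
    sq_dist = sum((a / n_a - b / n_b) ** 2 for a, b in zip(sum_a, sum_b))
    return n_a * n_b / (n_a + n_b) * sq_dist


def detect_boundaries(posteriors: Sequence[Sequence[float]],
                      threshold: float) -> List[int]:
    """Return the frame indices at which a new segment starts (frame 0 excluded).

    posteriors: one vector of phoneme posterior probabilities per frame.
    threshold:  merging stops once the cheapest merge costs more than this.
    """
    # Each segment is [start_frame, frame_count, per-class sum of posteriors].
    segments = [[i, 1, list(frame)] for i, frame in enumerate(posteriors)]

    while len(segments) > 1:
        # Cost of merging every pair of temporally adjacent segments.
        costs = [
            merge_cost(segments[i][2], segments[i][1],
                       segments[i + 1][2], segments[i + 1][1])
            for i in range(len(segments) - 1)
        ]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > threshold:
            break  # the remaining segments are treated as distinct phonemes

        left, right = segments[best], segments[best + 1]
        merged = [left[0], left[1] + right[1],
                  [a + b for a, b in zip(left[2], right[2])]]
        segments[best:best + 2] = [merged]

    return [seg[0] for seg in segments[1:]]


# Toy input: 3 frames dominated by class 0, 3 by class 1, 2 by class 2.
toy_frames = [
    [0.90, 0.05, 0.05], [0.85, 0.10, 0.05], [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10], [0.05, 0.90, 0.05], [0.10, 0.85, 0.05],
    [0.05, 0.05, 0.90], [0.10, 0.05, 0.85],
]
print(detect_boundaries(toy_frames, threshold=0.1))
```

On this toy input the sketch reports boundaries at frames 3 and 6, where the dominant class changes. The number of clusters is not fixed in advance; it falls out of where the threshold stops the merging, which mirrors the role the abstract assigns to the threshold.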
