Keyword Detection Based on Dynamic Match Lattice Spotting
Abstract: The large volume of speech data now being produced calls for rapid and accurate search techniques. This paper proposes a keyword spotting method based on dynamic match lattice spotting (DMLS). TRAP features and a multilayer perceptron are used to generate a more accurate phone lattice. In the indexing stage, a modified Viterbi traversal of the lattice compiles a database of fixed-length phone sequences. In the search stage, the minimum edit distance serves as the confidence score for detecting keywords. Experiments show that the proposed method outperforms baseline systems using MFCC and PLP features, improving recall by about 5%.
Source: Journal of Applied Sciences (《应用科学学报》), 2014, No. 2, pp. 149-155 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
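Funding: Supported by the National Natural Science Foundation of China (No. 61175017) and the Army-wide Military Science Research Project Fund (No. 2010JY0256-143).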
Keywords: keyword spotting, dynamic match lattice spotting, TRAP feature, minimum edit distance
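About the author: Corresponding author: ZHANG Lianhai, Ph.D., associate professor. Research interests: speech signal processing, pattern recognition. E-mail: lianhaiz@sina.com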
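The search stage described in the abstract, where a keyword's phone sequence is scored against a database of fixed-length phone sequences by minimum edit distance, might be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not give the paper's cost function, sequence length, or threshold, so unit insertion, deletion, and substitution costs are assumed here, and the names `min_edit_distance`, `search`, `database`, and `max_distance` are hypothetical. The lattice generation and modified Viterbi traversal are not shown.

```python
from typing import List, Sequence, Tuple


def min_edit_distance(target: Sequence[str], observed: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences.

    Assumption: every insertion, deletion, and substitution costs 1.
    The paper's actual costs are not stated in this record.
    """
    prev = list(range(len(observed) + 1))
    for i, t in enumerate(target, start=1):
        curr = [i]
        for j, o in enumerate(observed, start=1):
            cost = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # delete a target phone
                            curr[j - 1] + 1,      # insert an observed phone
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def search(keyword: Sequence[str],
           database: List[Tuple[str, List[str]]],
           max_distance: int) -> List[Tuple[str, int]]:
    """Return (entry_id, distance) for entries within max_distance, best first."""
    hits = []
    for entry_id, phones in database:
        distance = min_edit_distance(keyword, phones)
        if distance <= max_distance:
            hits.append((entry_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical index of fixed-length (here: 4) phone sequences.
database = [
    ("utt1", ["s", "p", "iy", "ch"]),
    ("utt2", ["s", "b", "iy", "ch"]),
    ("utt3", ["k", "ae", "t", "s"]),
]
keyword = ["s", "p", "iy", "ch"]

hits = search(keyword, database, max_distance=1)
print(hits)
```

A lower distance means a closer match, so the threshold `max_distance` trades recall against false alarms: raising it admits more recognition errors in the indexed sequences at the price of more spurious hits.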

