Low-resource speech recognition system based on the BN-SGMM-HMM model (cited by: 11)
Abstract: Under low-resource conditions, a speech recognition system built on the traditional Gaussian mixture model-hidden Markov model (GMM-HMM) acoustic model suffers from low recognition accuracy and an excessively large parameter count. To address these shortcomings, this paper proposes an acoustic model based on BN-SGMM-HMM. For acoustic features, a DNN with a bottleneck (BN) layer extracts the features, which improves their discriminability and robustness, and the Dropout strategy is applied during training to prevent overfitting. For acoustic modeling, the subspace Gaussian mixture model (SGMM) is adopted, which reduces the parameter count by 56.5% compared with the GMM-HMM model. Both changes also raise the recognition rate of the low-resource speech recognition system: experiments on the TIMIT database indicate that the accuracy of the proposed BN-SGMM-HMM model is 8.0% higher than that of the GMM-HMM model and 3.6% higher than that of the BN-GMM-HMM model. These properties make the model well suited to future deployment on hardware platforms with low-power requirements.
Authors: LEI Jie (雷杰); ZHAO Hongliang (赵宏亮); AI Ningzhi (艾宁智); ZOU Wanbing (邹万冰); ZHAN Yi (詹毅) (School of Physics, Liaoning University, Shenyang 110036, China; College of Electronic Science and Engineering, Jilin University, Changchun 130012, China; Institutes of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China)
Source: Journal of Hefei University of Technology: Natural Science (《合肥工业大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list, 2021, No. 12, pp. 1627-1632 (6 pages)
Funding: National Key Research and Development Program of China (2019YFB2204601).
Keywords: speech recognition; bottleneck (BN) feature; subspace Gaussian mixture model (SGMM); Dropout strategy; low resource
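About the authors: LEI Jie (b. 1995), male, from Zibo, Shandong, master's student at Liaoning University. Corresponding author: ZHAN Yi (b. 1973), male, from Kaihua, Zhejiang, Ph.D., professor-level senior engineer at the Chinese Academy of Sciences, E-mail: yizhan@ime.ac.cn.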
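
The parameter sharing behind the SGMM mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard single-substate SGMM formulation, in which each HMM state j keeps only a low-dimensional vector v_j and its Gaussian means and mixture weights are derived from globally shared projections M_i and w_i. The function names, array shapes, and the parameter-counting convention (diagonal-covariance GMM versus full shared covariances in the SGMM) are assumptions made for illustration. They are not taken from the paper, and the sketch does not reproduce its reported 56.5% reduction.

```python
import numpy as np

def sgmm_state_params(M, w, v):
    """Derive one state's GMM means and weights from shared SGMM parameters.

    M: (I, D, S) shared mean-projection matrices, one per shared Gaussian i
    w: (I, S)    shared weight-projection vectors
    v: (S,)      low-dimensional vector of a single HMM state
    """
    means = M @ v                       # (I, D): mu_i = M_i v
    logits = w @ v                      # (I,):   w_i^T v
    logits = logits - logits.max()      # shift for numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()  # softmax over i
    return means, weights

def param_counts(num_states, I, D, S, gmm_mix_per_state):
    """Count parameters of a diagonal GMM-HMM and a single-substate SGMM.

    All sizes are hypothetical inputs chosen by the caller.
    """
    # Diagonal GMM: each component stores a mean (D), a variance (D), a weight (1).
    gmm = num_states * gmm_mix_per_state * (2 * D + 1)
    # SGMM shared part: M_i (D*S), w_i (S), full covariance (D*(D+1)/2) per Gaussian.
    shared = I * (D * S + S + D * (D + 1) // 2)
    # SGMM state-specific part: one S-dimensional vector per state.
    sgmm = shared + num_states * S
    return gmm, sgmm
```

Because the state-specific part grows only by S values per state, the SGMM total is dominated by the shared block. Whether it ends up smaller than the GMM count depends on the sizes chosen.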