2 articles found
Research on continuous speech recognition methods for languages with scarce corpus resources (cited by: 9)
1
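Authors: 伊·达瓦, 匂坂芳典, 中村哲. 《自动化学报》 (Acta Automatica Sinica), EI, CSCD, 北大核心 (PKU Core), 2010, No. 4, pp. 550-557 (8 pages)
Minority languages have their own characteristics, so existing continuous speech recognition methods cannot simply be applied to them. Taking Mongolian as an example, this paper discusses the construction of acoustic and language models, and implements a Mongolian speech recognition system on the continuous speech recognizer of Japan's Advanced Telecommunications Research Institute International (ATR). The paper focuses on language model construction: based on the agglutinative nature of Mongolian, it proposes building a multi-class N-gram model by clustering similar words. Experimental results show that the proposed language model improves recognition accuracy by 5.5% over the conventional word N-gram approach.
Keywords: Mongolian; agglutinative language; similar-word classification; continuous speech recognition; multi-class language model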
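The abstract above does not spell out the multi-class N-gram model, so the following is only a minimal sketch of the simpler, standard class-based bigram factorization, P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), which illustrates why grouping similar words helps when the corpus is small. The function name `train_class_bigram`, the toy sentences, and the hand-written word-to-class map are all assumptions made for illustration and do not come from the paper.

```python
from collections import Counter

def train_class_bigram(sentences, word2class):
    """Return a probability function for a maximum-likelihood class-based bigram.

    Hypothetical helper: P(w | prev) = P(class(w) | class(prev)) * P(w | class(w)).
    No smoothing is applied; unseen events get probability 0.
    """
    class_bigrams = Counter()  # (previous class, current class) counts
    class_context = Counter()  # how often each class occurs as the previous class
    class_totals = Counter()   # how often each class occurs in the current position
    word_counts = Counter()    # how often each word occurs in the current position

    for sentence in sentences:
        prev = "<s>"  # sentence-start pseudo-class
        for word in sentence:
            cls = word2class[word]
            class_bigrams[(prev, cls)] += 1
            class_context[prev] += 1
            class_totals[cls] += 1
            word_counts[word] += 1
            prev = cls

    def prob(word, prev_word=None):
        cls = word2class[word]
        prev = "<s>" if prev_word is None else word2class[prev_word]
        if class_context[prev] == 0 or class_totals[cls] == 0:
            return 0.0
        p_class = class_bigrams[(prev, cls)] / class_context[prev]
        p_word = word_counts[word] / class_totals[cls]
        return p_class * p_word

    return prob

# Toy data (assumed): two sentences and a hand-made clustering.
word2class = {"red": "ADJ", "blue": "ADJ", "car": "N", "sky": "N"}
sentences = [["red", "car"], ["blue", "sky"]]
prob = train_class_bigram(sentences, word2class)

# "red sky" never occurs in the toy corpus, so a word bigram would assign it 0;
# the class-based model still gives it probability mass via ADJ -> N.
print(prob("sky", "red"))
```

In this toy setting the unseen pair "red sky" still receives a non-zero probability because the statistics are shared across the ADJ and N classes, which is the sparse-data benefit the abstract appeals to.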
Vari-gram language model based on word clustering
2
Author: 袁里驰. 《Journal of Central South University》, SCIE, EI, CAS, 2012, No. 4, pp. 1057-1062 (6 pages)
Category-based statistical language modeling is an important method for addressing the sparse-data problem, but it has two bottlenecks: 1) Word clustering: it is hard to find a clustering method that combines good performance with low computational cost. 2) Class-based methods lose predictive ability when adapting to text from different domains. To solve these problems, a definition of word similarity based on mutual information is presented, and a definition of word-set similarity is derived from it. Experiments show that the similarity-based word clustering algorithm outperforms the conventional greedy clustering method in both speed and performance, reducing perplexity from 283 to 218. In addition, an absolute weighted difference method is presented and used to construct a vari-gram language model with good predictive ability. Compared with the category-based model, the vari-gram model reduces perplexity from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
Keywords: word similarity; word clustering; statistical language model; vari-gram language model
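To put the perplexity figures quoted in the abstract on a common scale, the short sketch below computes the relative reduction for each reported pair. Only the numbers come from the abstract; the helper name `relative_reduction` and the dictionary labels are assumptions added here for illustration.

```python
def relative_reduction(before, after):
    """Fraction by which a perplexity value dropped, relative to the baseline."""
    return (before - after) / before

# Perplexity pairs (baseline, improved) as quoted in the abstract.
reported = {
    "similarity-based vs. greedy clustering": (283.0, 218.0),
    "vari-gram vs. category-based, Chinese": (234.65, 219.14),
    "vari-gram vs. category-based, English": (195.56, 184.25),
}

for label, (before, after) in reported.items():
    print(f"{label}: {relative_reduction(before, after):.1%}")
```

Under these figures the clustering change accounts for roughly a 23% relative drop in perplexity, while the vari-gram model adds about 6 to 7% over the category-based model on each corpus.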