Abstract
Based on some characteristics of the Mongolian language, this paper proposes a reordering model suited to phrase-based Chinese-Mongolian statistical machine translation, together with the corresponding training and decoding algorithms and the results of preliminary experiments. The Chinese-Mongolian bilingual corpus is very small, so data sparsity is severe, while word order changes markedly in Chinese-Mongolian translation. As a result, the reordering methods used for language pairs such as Chinese-English are difficult to apply to Chinese-Mongolian statistical machine translation. By assuming that word-order changes in Chinese-Mongolian translation follow a normal distribution, a probabilistic reordering model is established. Experiments show that this probabilistic reordering model outperforms the reordering method used in the Moses system.
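The abstract does not give the model's formulas, so the following is only a minimal sketch of what a normal-distribution reordering score could look like, not the authors' actual method. It assumes that the "word-order change" is measured as a signed reordering distance between aligned phrases, fits a mean and variance to observed distances, and scores a candidate jump by its Gaussian log-density. The function names and the toy distances are hypothetical.

```python
import math


def fit_gaussian(distances):
    """Maximum-likelihood mean and variance of observed reordering distances.

    `distances` is assumed to be a list of signed jumps (in words) read off
    word-aligned training sentence pairs; this definition is an assumption.
    """
    n = len(distances)
    mean = sum(distances) / n
    var = sum((d - mean) ** 2 for d in distances) / n
    # Floor the variance so a degenerate sample does not cause division by zero.
    return mean, max(var, 1e-6)


def log_reorder_prob(distance, mean, var):
    """Log-density of a reordering distance under a normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (distance - mean) ** 2 / (2 * var)


# Hypothetical toy data, not taken from the paper.
toy_distances = [3, 4, 5, 4, 2, 6, 4]
mean, var = fit_gaussian(toy_distances)

# A jump close to the fitted mean scores higher than a monotone step (distance 0).
near_mean_score = log_reorder_prob(4, mean, var)
monotone_score = log_reorder_prob(0, mean, var)
```

In a phrase-based decoder, a score of this kind would typically enter the log-linear model as one feature alongside the translation and language model scores. Because only two parameters are estimated, such a model needs far less data than a lexicalized reordering table, which is consistent with the small-corpus motivation stated in the abstract.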
Source
Chinese High Technology Letters (《高技术通讯》)
Indexed in: EI, CAS, CSCD, PKU Core (北大核心)
2009, Issue 5, pp. 475-479 (5 pages)
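Funding
Supported by the National Basic Research Program of China (973 Program, 2007CB316503) and the Natural Science Foundation of Inner Mongolia (200607010805)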
Keywords
machine translation, statistical method, Mongolian, reordering, probability
About the author
Male, born in 1972, Ph.D., associate professor; research interest: natural language processing; corresponding author, E-mail: eshhx@imu.edu.cn