Abstract
Word embeddings are a key foundation of natural language processing. The existing RECWE (Radical Enhanced Chinese Word Embedding) model does not account for the differing contributions of context words, nor for the differing contributions of the Chinese characters, radicals, and components within each word. To address this, the paper proposes an improved RECWE model combined with attention, introducing a different type of attention mechanism into each of the model's two prediction models. Experimental results show that, compared with the original model, the improved RECWE model raises the scores on the two evaluation files of the similarity task by 2.89% and 1.04% respectively, and raises the average score over the three topics of the analogy task by 2.07%, effectively improving the quality of the word vectors.
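The abstract does not say which form of attention the paper uses, so the following is only a minimal sketch of the general idea: context words are pooled with learned, unequal weights instead of a plain average. Dot-product scoring against a query vector, the function names, and the toy embedding values are all assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def uniform_context(context_vectors):
    """Baseline pooling: every context word contributes equally."""
    n = len(context_vectors)
    dim = len(context_vectors[0])
    return [sum(v[i] for v in context_vectors) / n for i in range(dim)]

def attention_context(context_vectors, query):
    """Attention pooling (assumed dot-product form): weights = softmax(query . vector)."""
    weights = softmax([dot(query, v) for v in context_vectors])
    dim = len(query)
    pooled = [
        sum(w * v[i] for w, v in zip(weights, context_vectors))
        for i in range(dim)
    ]
    return pooled, weights

# Toy 3-dimensional embeddings for three context words (made-up values).
context = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
# Hypothetical query vector; in a trained model this would be learned.
query = [2.0, 0.0, 0.0]

pooled, weights = attention_context(context, query)
baseline = uniform_context(context)
```

In this toy setup the first and third context words, which align with the query, receive most of the weight, while the baseline treats all three words alike. The same weighting idea could in principle be applied to characters, radicals, and components within a word, although the abstract gives no detail on how the paper does so.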
Authors
Gao Tongchao (高统超); Zhang Yunhua (张云华)
School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018
Source
Cyberspace Security (《网络空间安全》)
2020, Issue 2, pp. 96-103 (8 pages)
About the authors
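Gao Tongchao (b. 1994), male, Han ethnicity, from Lu'an, Anhui. Zhejiang Sci-Tech University, master's. Main research interests: intelligent information processing.
Zhang Yunhua (b. 1965), male, Han ethnicity, from Changzhou, Jiangsu. Zhejiang University, Ph.D.; professor. Main research interests: software architecture, software engineering, and intelligent information processing.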