Abstract
Text matching is one of the key techniques in natural language understanding; its task is to determine the degree of similarity between two texts. In recent years, with the development of pre-trained models, text matching techniques based on pre-trained language models have been widely used. However, these text matching models still face two challenges: poor generalization in a particular domain and weak robustness in semantic matching. To address this, this study proposes incremental pre-training and adversarial training methods targeting low-frequency words to improve text matching models. Incremental pre-training on in-domain low-frequency words helps the model transfer to the target domain and enhances its generalization ability. In addition, several adversarial training methods targeting low-frequency words are explored to improve the model's adaptability to word-level perturbations and thereby its robustness. Experimental results on the LCQMC dataset and a text matching dataset in the real estate domain show that incremental pre-training, adversarial training, and the combination of the two approaches all clearly improve text matching results.
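The abstract gives no implementation details, but the core idea of adversarial training restricted to low-frequency words can be sketched. The following is an illustrative sketch only, assuming an FGM-style perturbation (epsilon-scaled, gradient-normalized) applied solely to the embedding rows of low-frequency tokens; the function names, the frequency threshold, and the use of NumPy in place of a deep learning framework are all assumptions, not the paper's actual method.

```python
import numpy as np
from collections import Counter


def low_freq_vocab(corpus_tokens, threshold=2):
    """Return the set of tokens whose corpus frequency is at most `threshold`."""
    counts = Counter(corpus_tokens)
    return {tok for tok, c in counts.items() if c <= threshold}


def fgm_perturb(embeddings, grads, low_freq_ids, epsilon=0.5):
    """FGM-style adversarial perturbation applied only to low-frequency rows.

    embeddings: (vocab_size, dim) embedding matrix.
    grads:      gradient of the loss w.r.t. the embeddings, same shape.
    low_freq_ids: row indices of low-frequency tokens to perturb.

    Each selected row is moved by epsilon along its normalized gradient,
    r = epsilon * g / ||g||; all other rows are left unchanged.
    """
    perturbed = embeddings.copy()
    for i in low_freq_ids:
        norm = np.linalg.norm(grads[i])
        if norm > 0:  # skip zero-gradient rows to avoid division by zero
            perturbed[i] = embeddings[i] + epsilon * grads[i] / norm
    return perturbed
```

In a real training loop, the forward/backward pass would be rerun on the perturbed embeddings and the two losses combined, as in standard FGM adversarial training; here only the word-level selection and perturbation step is shown.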
Authors
司志博文
李少博
单丽莉
孙承杰
刘秉权
SI Zhi-Bo-Wen; LI Shao-Bo; SHAN Li-Li; SUN Cheng-Jie; LIU Bing-Quan (State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing 100733, China; Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China)
Source
《计算机系统应用》
2022, No. 11, pp. 349-357 (9 pages)
Computer Systems & Applications
Funding
National Natural Science Foundation of China (62176074)
Keywords
text matching
pre-trained model
incremental pre-training
adversarial training
low-frequency word
deep learning
natural language processing (NLP)
Author information
Corresponding author: LIU Bing-Quan, E-mail: liubq@hit.edu.cn