Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (ZDRE-KT-2021-3).
Abstract: Analyzing large volumes of observed space weather data and reducing the bias introduced by individual forecasters' experience are both significant challenges in producing manually written space weather forecast products. Natural language generation methods based on the sequence-to-sequence model make it possible to generate space weather forecast texts automatically. To carry out the generation tasks at a fine-grained level, a description-based taxonomy of space weather phenomena is presented. The MDH (Multi-Domain Hybrid) model is then proposed for generating space weather summaries in two stages. The model is composed of three sequence-to-sequence deep neural network sub-models: one pre-trained Bidirectional and Auto-Regressive Transformers (BART) model and two Transformer models. To evaluate MDH, quality evaluation metrics are presented that combine two prevalent automatic metrics with a new human evaluation metric. The comprehensive scores of the three summary generation tasks on the test datasets are 70.87, 93.50, and 92.69, respectively. The results suggest that MDH can generate space weather summaries with high accuracy and coherence and with suitable length, and that it can assist forecasters in producing high-quality space weather forecast products even when training data are scarce.
Funding: Project (61702063) supported by the National Natural Science Foundation of China.
Abstract: With the rapid and continuing development of machine learning, especially deep learning, research on visual question answering (VQA) has made significant progress and has both theoretical significance and practical value. It is therefore worthwhile to summarize current research and provide a reference for researchers in the field. This article presents a detailed analysis and summary of the relevant research and typical methods in VQA. First, background knowledge about VQA is introduced. Second, the issues and challenges of VQA are discussed, together with some promising methodologies for addressing them. Third, the key sub-problems affecting VQA are summarized and analyzed. Then, the commonly used datasets and evaluation metrics are summarized. Next, the popular algorithms and models in VQA research are compared and tabulated. Finally, future development trends are discussed and conclusions are drawn.
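Abstract: Drilling top drive systems are structurally complex and subject to many types of faults, and existing fault tree analysis methods and expert systems struggle to cope with complex, changing field conditions. To address this, a knowledge-graph-based fault diagnosis method for drilling top drive systems is proposed, exploiting the strengths of knowledge graphs in fusing structured and unstructured information, analyzing associations between fault modes, and transferring prior knowledge. Using the Transformer-based bidirectional encoder model (Bidirectional Encoder Representations from Transformers, BERT), two hybrid neural network models, BERT-BiLSTM-CRF and BERT-BiLSTM-Attention, are built to perform named entity recognition and relation extraction, respectively, on top drive fault text data. Similarity calculation is then used to fuse fault knowledge and to support intelligent question answering, which together form the top drive fault diagnosis method. The results show that: (1) on the fault entity recognition task, the BERT-BiLSTM-CRF model reaches a precision of 95.49% and can effectively identify the information entities in fault texts; (2) on fault relation extraction, the BERT-BiLSTM-Attention model reaches a precision of 93.61% and correctly establishes the relation edges of the knowledge graph; (3) the question answering system that was developed puts the knowledge graph to practical use, with answer accuracy above 90% on several different types of questions, which meets the needs of field use. It is concluded that the knowledge-graph-based fault diagnosis method makes effective use of prior knowledge about top drive systems, enables rapid fault localization and intelligent diagnosis, and has good application prospects.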
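The similarity-based knowledge fusion step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which similarity measure, representation, or threshold the authors use; the cosine similarity, the toy three-dimensional vectors, the entity names, and the 0.9 threshold below are all assumptions made purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse_entities(embeddings, threshold=0.9):
    """Greedily group entity names whose vector is close to a group's first member.

    `threshold` is a hypothetical cut-off, not a value from the source.
    """
    groups = []
    for name, vec in embeddings.items():
        for group in groups:
            representative = embeddings[group[0]]
            if cosine_similarity(vec, representative) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was similar enough: start a new one.
            groups.append([name])
    return groups


# Hypothetical entity mentions with made-up vectors (not data from the paper).
toy_embeddings = {
    "main motor overheating": [0.90, 0.10, 0.00],
    "motor overheat": [0.88, 0.12, 0.01],
    "hydraulic leak": [0.00, 0.20, 0.95],
}

groups = fuse_entities(toy_embeddings)
print(groups)
```

In this toy run the two overheating mentions fall into one group and the hydraulic leak stays separate, which is the behaviour a fusion step needs so that duplicate fault entities collapse into a single graph node.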
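Abstract: Entity and relation extraction in the Chinese judicial domain plays an important role in improving case-handling efficiency, but existing relation extraction models lack domain knowledge and have difficulty handling overlapping entities, which makes it hard to distinguish and extract entities and relations accurately. By introducing domain knowledge, a legal information enhancement module is proposed. It strengthens the ability of the proposed legal potential relationship and global correspondence (LPRGC) model to understand the terminology, rules, and contextual information in legal texts, which improves the accuracy of entity and relation recognition and in turn the performance of the extraction algorithm. To solve the overlapping entity problem, a relation extraction method based on potential relations and entity alignment is designed. By precisely tagging entity positions, filtering potential relations, and aligning entities with a global matrix, the method captures the relations between overlapping entities more accurately and maps them to the correct entity pairs, improving the accuracy of the extraction results. Entity and relation extraction experiments on the Chinese AI and Law Challenge (CAIL, 中国法律智能技术评测) dataset show that the LPRGC model achieves a precision, recall, and F1 score of 85.21%, 81.19%, and 83.15%, respectively, all higher than those of the comparison models. On overlapping entities in particular, the LPRGC model reaches an F1 score of 81.45% on the single-entity-overlap type and 80.67% on the multiple-entity-overlap type. The LPRGC model clearly improves on existing methods in the accuracy of entity and relation extraction and is notably effective at handling entity overlap in complex legal texts.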
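As a quick consistency check on the figures reported above for LPRGC, the F1 score can be recomputed from the other two values. The sketch below assumes that the first reported figure (85.21%) is precision, and uses the standard definition of F1 as the harmonic mean of precision and recall; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
f1 = f1_score(85.21, 81.19)
print(round(f1, 2))  # 83.15
```

The recomputed value matches the reported F1 of 83.15%, which supports reading the first figure as precision rather than overall accuracy.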
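Abstract: Natural language processing is a key step toward human-computer interaction, and Chinese natural language processing (CNLP) is an important part of it. With the development of large-model technology, CNLP has entered a new stage: Chinese large models offer stronger generalization and faster task adaptation. Compared with English large models, however, Chinese large models still fall short in logical reasoning and text comprehension. This survey introduces the advantages of graph neural networks in specific CNLP tasks and examines the potential of quantum machine learning for CNLP. It summarizes the basic principles and technical architecture of large models, compiles in detail the typical datasets and evaluation metrics used in large-model evaluation tasks, and evaluates and compares the performance of current mainstream large models on CNLP tasks. Finally, it analyzes the challenges currently facing CNLP and outlines future research directions, with the aim of helping to address those challenges and providing a reference for the development of new methods.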