Analyzing large volumes of observed space weather data and reducing the bias of individual forecasters' experience are two significant challenges in producing space weather forecast products manually. With natural language generation methods based on the sequence-to-sequence model, space weather forecast texts can be generated automatically. To carry out the generation tasks at a fine-grained level, a taxonomy of space weather phenomena based on their descriptions is presented. The MDH (Multi-Domain Hybrid) model is then proposed for generating space weather summaries in two stages. The model is composed of three sequence-to-sequence deep neural network sub-models: one pre-trained Bidirectional and Auto-Regressive Transformers (BART) model and two Transformer models. To evaluate how well MDH performs, quality evaluation metrics are presented that combine two prevalent automatic metrics with a new human evaluation metric. The comprehensive scores of the three summary generation tasks on the test datasets are 70.87, 93.50, and 92.69, respectively. The results suggest that MDH can generate space weather summaries with high accuracy and coherence and of suitable length, and can thus assist forecasters in producing high-quality space weather forecast products despite the scarcity of data.
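The civil engineering industry faces large volumes of unstructured text in its digital transformation, and large language models (LLMs), with their strong natural language processing capabilities, offer new opportunities for making the industry more intelligent. Using a systematic literature review, and building on a survey of LLM technical architectures and the state of research in vertical domains, this study proposes four major application scenarios of LLMs in civil engineering together with their technical routes, the challenges they face, and research trends. The study finds that exploratory research and applications of LLMs already exist in civil engineering. They currently concentrate on four application scenarios (content generation, intelligent question answering, text summarization, and analysis and reasoning), cover the whole life cycle of civil engineering projects, and are characterized by interdisciplinary and cross-modal integration. However, applying LLMs still faces challenges such as low domain expertise, poor timeliness of information, and low data quality and interactivity. On this basis, a series of future research opportunities is proposed. For model optimization: use parameter-efficient fine-tuning to inject domain knowledge and increase the breadth and depth of LLM applications in civil engineering; combine LLMs with knowledge graphs to improve the accuracy, interpretability, and timeliness of their answers; fuse multimodal data types to extend the application scenarios; and develop suitable model evaluation methods to quantify the value and performance of LLM applications in civil engineering. For application scenarios: drawing on the characteristics of both LLMs and civil engineering, LLMs can be extended to complex tasks such as document generation, question answering systems, information extraction, and compliance checking, improving the efficiency of interaction between practitioners and data. The study aims to provide a reference for academia and industry in further applying LLMs to civil engineering.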
To address the loss of semantics in XML keyword queries based on the LCA (Lowest Common Ancestor), an XML keyword query technique based on Natural Language Generation (NLG) is proposed. It applies NLG content planning to XML documents to produce a set of message sentences for the user's query. Filtering this message set both enables semantics-based XML keyword queries and greatly improves query efficiency.
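For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of an LCA-style XML keyword query. It is not the authors' method: the function name `lca_keyword_query`, the sample document, and the case-insensitive substring matching are all assumptions made for illustration, and the variant shown returns only the smallest subtrees that contain every keyword (often called smallest LCA).

```python
import xml.etree.ElementTree as ET


def lca_keyword_query(xml_text, keywords):
    """Return the deepest elements whose subtree contains every keyword.

    Hypothetical helper for illustration only; matching is a simple
    case-insensitive substring test on each element's tag and text.
    """
    root = ET.fromstring(xml_text)
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node):
        # Keywords matched by this node's own tag and text.
        own = " ".join([node.tag, node.text or ""]).lower()
        found = {k for k in wanted if k in own}
        child_complete = False
        for child in node:
            child_found, complete = visit(child)
            found |= child_found
            child_complete = child_complete or complete
        complete = found == wanted
        # Report a node only if no descendant already covers all keywords.
        if complete and not child_complete:
            results.append(node)
        return found, complete

    visit(root)
    return results


# Hypothetical sample document.
DOC = (
    "<library>"
    "<book><title>XML Basics</title><author>Wang</author></book>"
    "<book><title>Databases</title><author>Zhao</author></book>"
    "</library>"
)

same_book = lca_keyword_query(DOC, ["xml", "wang"])
cross_book = lca_keyword_query(DOC, ["xml", "zhao"])
print([e.tag for e in same_book], [e.tag for e in cross_book])
```

In the second query the two keywords occur in different books, so the only common ancestor is the document root. The match is structurally valid but says little about how the keywords relate, which is one way to read the loss of semantics that the abstract attributes to LCA-based queries.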
SHTQS is an intelligent telephone-based spoken dialogue system that provides information about the best route between two sites in Shanghai. Instead of separating speech decoding from language parsing, SHTQS integrates an automatic speech recognizer (ASR), language understanding, dialogue management, and a speech generator so that they cooperate closely. In this way, erroneous analyses and uncertainties arising in the preceding stages can be recovered and resolved accurately with high-level knowledge. Moreover, instead of shallow word-level analysis or simple keyword or key-phrase matching, the system performs a deeper analysis by integrating a robust parser and a semantic interpreter. The robust parser is particularly important for spontaneous speech input because most inquiry sentences and phrases are ill-formed. In addition, understanding users' inquiries is essential in designing a mixed-initiative dialogue system, and simple keyword or key-phrase matching can hardly achieve this; a semantic interpreter is therefore incorporated into the system. The performance of the system is also evaluated: the dialogue efficiency is 4.4 sentences per query on average, and the case precision rate of the language understanding module is up to 81%. The results are satisfactory.
Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (ZDRE-KT-2021-3).