The cross-domain capabilities of aerial-aquatic vehicles (AAVs) hold significant potential for future air-sea integrated combat operations. However, the failure rate of AAVs is higher than that of unmanned systems operating in a single medium. To ensure that AAVs complete their tasks reliably and stably, this paper proposes a tiltable quadcopter AAV that mitigates rotor failure, which can lead to high-speed spinning or damage during cross-media transitions. Experimental validation demonstrates that this tiltable quadcopter AAV can transform into a triple-rotor or dual-rotor configuration after losing one or two rotors, respectively, allowing it to perform cross-domain movements with enhanced stability and to continue its task. This significantly improves its fault tolerance and task reliability.
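[Purpose/Significance] In recent years, with the rapid growth of social media platforms, Multimodal Named Entity Recognition (MNER) has become a widely studied research topic. Recent work shows that vision-language models based on vision Transformers outperform traditional methods based on object detectors, but a systematic study of MNER models built on vision-language Transformers is still lacking. [Method/Process] To address this gap, this paper proposes a new end-to-end framework for studying in depth how to design and train a fully Transformer-based vision-language MNER model. The framework covers all the key elements of model design: multimodal feature extraction, the multimodal fusion module, and the decoding architecture. [Results/Conclusions] Experimental results show that the proposed model outperforms all baseline models, including methods based on large language models, and achieves the best overall metrics on two datasets. Specifically, the model obtains overall F1 scores of 80.06% and 94.27% on the Twitter-2015 and Twitter-2017 datasets, respectively, improvements of 1.34% and 3.80% over the current state-of-the-art vision-language model. In addition, the model generalizes better than the baseline models in cross-dataset evaluation.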
Funding: Supported by Southern Marine Science and Engineering Guangdong Laboratory, Grant No. SML2023SP229.