Abstract
In recent years, significant progress has been made in multimodal fusion research. Multimodal data provide richer information than unimodal data; however, the category co-occurrence frequency bias that arises during multimodal fusion makes clothing compatibility prediction challenging. We therefore propose a multimodal correlation fashion clothing compatibility prediction model based on a hybrid graph neural network. The model deeply exploits the correlation between the textual and visual modalities, and uses the hybrid graph neural network to resolve the inaccurate compatibility predictions caused by category co-occurrence frequency bias during multimodal fusion, thereby improving the accuracy of clothing compatibility prediction. The model was evaluated on the open-source Polyvore Outfits and Polyvore Outfits-D datasets on the compatibility prediction and fill-in-the-blank tasks. The results show that the model achieved AUC values of 0.928 and 0.878 on the clothing compatibility task and accuracies of 62.41% and 56.83% on the fill-in-the-blank task on the two datasets, respectively, outperforming the baseline models used for comparison.
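The abstract describes the approach only at a high level. As a minimal illustration of the general idea (per-item visual and textual features fused into one embedding, then scored over an outfit graph by message passing), a rough PyTorch sketch follows. All module names, dimensions, the gating-based fusion, and the fully connected outfit graph are assumptions made for illustration; they are not the authors' actual hybrid graph neural network.

# Illustrative sketch only: module names, dimensions, and the fusion scheme
# are assumptions, not the paper's published architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultimodalItemEncoder(nn.Module):
    """Fuses pre-extracted visual and textual features of a fashion item."""

    def __init__(self, visual_dim=512, text_dim=300, hidden_dim=128):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # Gate weighing the two modalities per item (an assumed stand-in for
        # the paper's multimodal correlation module).
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, visual_feat, text_feat):
        v = F.relu(self.visual_proj(visual_feat))
        t = F.relu(self.text_proj(text_feat))
        g = torch.sigmoid(self.gate(torch.cat([v, t], dim=-1)))
        return g * v + (1.0 - g) * t  # (num_items, hidden_dim)


class OutfitGraphScorer(nn.Module):
    """Scores an outfit by message passing over a fully connected item graph."""

    def __init__(self, hidden_dim=128, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim) for _ in range(num_layers)]
        )
        self.readout = nn.Linear(hidden_dim, 1)

    def forward(self, item_embeddings):
        h = item_embeddings  # (num_items, hidden_dim)
        n = h.size(0)
        # Fully connected adjacency without self-loops, row-normalized.
        adj = (torch.ones(n, n) - torch.eye(n)) / max(n - 1, 1)
        for layer in self.layers:
            h = F.relu(layer(adj @ h) + h)  # neighbor aggregation + residual
        outfit_repr = h.mean(dim=0)  # average pooling over the outfit's items
        return torch.sigmoid(self.readout(outfit_repr))  # score in (0, 1)


if __name__ == "__main__":
    encoder = MultimodalItemEncoder()
    scorer = OutfitGraphScorer()
    # Toy outfit of 4 items with random pre-extracted features.
    visual = torch.randn(4, 512)
    text = torch.randn(4, 300)
    score = scorer(encoder(visual, text))
    print(f"compatibility score: {score.item():.3f}")

Under this kind of setup, the fill-in-the-blank task reported in the abstract would amount to scoring each candidate item inserted into the partial outfit and picking the highest-scoring one; the compatibility task would threshold or rank the scores to compute AUC.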
Authors
CHEN Yan; LYU Zimin; LI Yun; LU Xingyu; JING Peiguang
(School of Computer and Electronic Information, Guangxi University, Nanning, Guangxi 530004, China; School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning, Guangxi 530003, China; Xiangsihu College of Guangxi University of Nationalities, School of Science and Engineering, Nanning, Guangxi 530003, China; School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China)
Source
Journal of Beijing Institute of Fashion Technology: Natural Science Edition
CAS
2024, No. 2, pp. 70-78 (9 pages)
Funding
National Natural Science Foundation of China (71862003)
National Natural Science Foundation of China (61861014 and 6236010200)
Doctoral Start-up Fund (BS2021025)
Guangxi Natural Science Foundation (2020GXNSFAA159090)
Guangxi Science Research and Technology Development Program (AA20302002-3)
Keywords
multimodality
dynamic graph neural network
co-occurrence frequency bias
clothing compatibility
About the authors
CHEN Yan, E-mail: cy@gxu.edu.cn; Corresponding author: LI Yun (b. 1978), female, Ph.D., professor, E-mail: liyun@guat.edu.cn.