Painting style simulation method based on fine-tuning paradigm for large models (基于大模型微调范式的绘画风格模拟方法)

Cited by: 2
Abstract: Existing methods that fine-tune large models to generate images in a specified style have limited capability: the layout style and the detail style of the generated images are often inconsistent with the target style. To improve the style-simulation consistency of large models, a method that combines fine-tuning of partial attention parameters with Low-Rank Adaptation (LoRA) was proposed and applied to generating paintings in the Red-culture Landscape style. Firstly, the partial attention parameters of a text-to-image large model were fine-tuned on a small number of painting samples. Secondly, the text-to-image large model was frozen, and trainable layers were injected into it and trained with the LoRA fine-tuning approach. Finally, the trainable layers from the second step were inserted into the partially fine-tuned model from the first step for inference. Experimental results show that, compared with currently popular style customization methods, the proposed method preserves text controllability while keeping the overall layout of the paintings consistent with the style of the training images and achieving high consistency with the target style in painting details. In experiments on simulating the Red-culture Landscape painting style, the generated paintings were closer to the style of the training set and better matched the style-consistency evaluations of art practitioners. An interactive Red-culture Landscape painting generation system based on the proposed method is currently open to the public at the History Museum of CPC Hangzhou Branch.
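The three steps in the abstract can be illustrated with a toy sketch. It treats a single attention projection as a small matrix and uses the standard LoRA update W + (alpha / r)·BA. Everything here is an assumption made for illustration: the matrix values are made up, the function names are hypothetical, and the abstract does not say which attention parameters are tuned, whether step 2 freezes the original pretrained weights (assumed below), or how the trained layers are attached at inference.

```python
# Toy illustration only: one attention projection is a small matrix of floats.
# All names and values are hypothetical and do not come from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lora_delta(B, A, alpha, rank):
    """Low-rank update (alpha / rank) * B @ A, as in standard LoRA."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]

def apply_lora(weight, B, A, alpha, rank):
    """Return weight plus the LoRA update, leaving `weight` unmodified."""
    return add(weight, lora_delta(B, A, alpha, rank))

# Pretrained attention projection (2x2), standing in for the original model.
w_base = [[1.0, 0.0], [0.0, 1.0]]

# Step 1: the same projection after fine-tuning on a few paintings (made-up values).
w_finetuned = [[1.1, 0.0], [0.2, 0.9]]

# Step 2: rank-1 LoRA factors trained while w_base stays frozen (made-up values).
B = [[0.5], [1.0]]   # shape 2x1
A = [[0.2, 0.4]]     # shape 1x2
alpha, rank = 1.0, 1

# Step 3: attach the step-2 LoRA factors to the step-1 weights for inference.
w_inference = apply_lora(w_finetuned, B, A, alpha, rank)
```

In this sketch the LoRA factors are trained against one set of weights and applied to another, which is the distinctive part of the pipeline the abstract describes; the frozen base matrix is never modified.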
Authors: MA Shijie (马诗洁); XU Huayi (徐华艺); LI Congcong (李聪聪); GENG Weidong (耿卫东); SHEN Huaqing (沈华清); LI Mengjian (李萌坚) (Zhejiang Lab, Hangzhou, Zhejiang 311121, China)
Affiliation: Zhejiang Lab (之江实验室)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2024, Issue S01, pp. 268-272 (5 pages)
Funding: Zhejiang Lab project on key technologies for cross-media intelligent short video generation (108001‑AC2101).
Keywords: text-to-image large model; fine-tuning; painting style; few-shot; image generation; DreamBooth; Low-Rank Adaptation (LoRA)
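Author biographies: MA Shijie (born 1990), female, from Weifang, Shandong, engineer, M.S.; main research interest: cross-modal visual content generation. XU Huayi (born 1994), male, from Xuchang, Henan, M.S.; main research interest: image and video generation. LI Congcong (born 1997), male, from Ganzhou, Jiangxi, M.S.; main research interest: controllable visual content generation. GENG Weidong (born 1967), male, from Sheyang, Jiangsu, professor, Ph.D., CCF member; main research interests: CAD & CG, artificial intelligence, multimedia computing, digital entertainment. SHEN Huaqing (born 1968), male, from Taizhou, Zhejiang, associate professor, M.S.; main research interests: digital art and animation. Corresponding author: LI Mengjian (born 1984), male, from Ganzhou, Jiangxi, assistant researcher, Ph.D., CCF member; main research interests: computer graphics, computer vision, cross-modal intelligent computing; email: limengjian@zhejianglab.com.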