Abstract
With the rise of artificial intelligence, image feature extraction and automatic text generation have both advanced considerably, and image caption generation, which combines the two, is drawing growing attention from academia and industry. The technology is used in many fields, such as news communication, smart transportation, smart home and smart healthcare, and therefore has important academic and practical value. Image-to-text generation is a comprehensive problem involving natural language processing and computer vision. This paper introduces the research background of image caption generation and the state of research at home and abroad, and outlines the image datasets that researchers currently use to evaluate the quality of generated captions. Existing models are classified and summarized in detail as template-based, retrieval-based and deep-learning-based image caption generation methods. The paper also summarizes the problems and challenges facing the field.
Authors
ZHANG Jiao (张姣)
YANG Zhenyu (杨振宇)
ZHANG Jiao; YANG Zhenyu (Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China)
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2019, No. 5, pp. 45-49 (5 pages)
Funding
Natural Science Foundation of Shandong Province (ZR2017LF021)
Key Research and Development Program of Shandong Province (2017XCGC0605)
Keywords
image caption
text generation
feature extraction
computer vision
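About the authors
ZHANG Jiao (b. 1993), female, master's student; main research interests: deep learning, big-data intelligent manufacturing and analysis. YANG Zhenyu (b. 1980), male, Ph.D., associate professor; main research interests: deep learning, reinforcement learning, artificial intelligence and big data.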