Abstract
YOLOv5 is currently one of the better-performing object detection models. Text detection in natural scenes remains difficult, however: text instances vary widely in length, text contours are hard to localize precisely, and tilted text, lighting, and shadow further degrade detection. To address these problems, the R-YOLOv5 (Rotated-YOLOv5) text detection model is proposed. First, a text segmentation model based on an affine algorithm is incorporated: according to the length of the string and the shape of the text area, the text region of an image is cut in equal proportions into multiple single-character blocks, which addresses the poor anchor-box fit that YOLOv5 exhibits because text has no closed contour. Then, using rotated convolutional layers, rotated max-pooling layers, and improved anchor boxes, a rotated intersection over union (RIoU) loss function that strengthens angle learning is proposed, enabling the extraction of the rotation and tilt features of text. The original model and the improved model are evaluated on ICDAR2019-LSVT. The experimental results show that R-YOLOv5 clearly improves detection performance; however, because the model is deeper, its training speed and detection speed are slightly lower than those of the original model. Compared with other models, and owing to the inherent advantages of YOLOv5, R-YOLOv5 achieves much better detection performance and detection speed.
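The equal-proportion cutting step described in the abstract can be illustrated with a minimal sketch. It assumes the text region is given as a quadrilateral (top-left, top-right, bottom-right, bottom-left) and that the number of characters is known from the annotation. The names `split_text_quad` and `lerp` are hypothetical, and plain linear interpolation along the top and bottom edges stands in for the paper's affine-based segmentation, whose details are not given in the abstract.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Corner order (an assumption): top-left, top-right, bottom-right, bottom-left.
Quad = Tuple[Point, Point, Point, Point]


def lerp(a: Point, b: Point, t: float) -> Point:
    """Return the point at fraction t of the way from a to b."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def split_text_quad(quad: Quad, num_chars: int) -> List[Quad]:
    """Cut a (possibly tilted) text quadrilateral into num_chars equal blocks.

    Each block spans the same fraction of the top and bottom edges, so the
    blocks follow the tilt of the text line. This is an illustrative stand-in,
    not the segmentation model from the paper.
    """
    if num_chars < 1:
        raise ValueError("num_chars must be at least 1")
    tl, tr, br, bl = quad
    blocks: List[Quad] = []
    for i in range(num_chars):
        t0 = i / num_chars        # left boundary of block i
        t1 = (i + 1) / num_chars  # right boundary of block i
        blocks.append((
            lerp(tl, tr, t0),  # block top-left
            lerp(tl, tr, t1),  # block top-right
            lerp(bl, br, t1),  # block bottom-right
            lerp(bl, br, t0),  # block bottom-left
        ))
    return blocks


# A tilted text line containing four characters (made-up coordinates).
example_quad: Quad = ((0.0, 0.0), (8.0, 4.0), (8.0, 6.0), (0.0, 2.0))
for block in split_text_quad(example_quad, 4):
    print(block)
```

Each resulting block has a closed outline of roughly character size, which is the property the abstract relies on to improve anchor-box fitting; a real pipeline would also have to handle curved text and unequal character widths.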
Authors
RAN Yu (冉煜)
ZHANG Li (张莉)
RAN Yu; ZHANG Li (School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China)
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2022, Issue S02, pp. 637-642 (6 pages)
About the authors
RAN Yu, 2920763948@qq.com, born in 1999, postgraduate. His main research interests include deep learning and object detection. Corresponding author: ZHANG Li, zhangli_amy@uibe.edu.cn, born in 1972, Ph.D., professor. Her main research interests include machine learning, deep learning, and business intelligence.