Funding: Project (08Y29-7) supported by the Transportation Science and Research Program of Jiangsu Province, China; Project (201103051) supported by the Major Infrastructure Program of the Health Monitoring System Hardware Platform Based on Sensor Network Node, China; Project (61100111) supported by the National Natural Science Foundation of China; Project (BE2011169) supported by the Scientific and Technical Supporting Program of Jiangsu Province, China.
Abstract: Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders practical application of MVC. An efficient fast mode decision method based on mode complexity is proposed to reduce this complexity. In the proposed method, mode complexity is first computed from the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is highly likely to be the optimal mode, mode complexity is always checked in advance against a predefined threshold, which provides an efficient early termination opportunity. If this early termination condition is not met, MBs are classified into three mode types according to the value of mode complexity, i.e., simple, medium and complex, and the encoding process is sped up by reducing the number of variable block-size modes that must be checked. Furthermore, for simple and medium mode regions, the rate-distortion (RD) cost of the 16×16 mode in the temporal prediction direction is compared with that in the disparity prediction direction to determine in advance whether the optimal prediction direction is temporal, so that unnecessary disparity estimation can be skipped. Experimental results show that, compared with the full mode decision (FMD) in the MVC reference software, the proposed method reduces the computational load by 78.79% and the total bit rate by 0.07% on average, while incurring only a negligible PSNR loss (about 0.04 dB on average).
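The decision flow described in this abstract can be illustrated with a minimal Python sketch. The abstract gives neither the threshold values, nor the candidate modes checked for each class, nor the exact rule for comparing the two 16×16 RD costs, so `T_DIRECT`, `T_SIMPLE`, `T_MEDIUM`, the `CANDIDATES` table and the "temporal cost not larger than disparity cost" rule are all hypothetical placeholders, not the authors' settings.

```python
from typing import Dict, List, Tuple

# Hypothetical thresholds on mode complexity (not given in the abstract).
T_DIRECT, T_SIMPLE, T_MEDIUM = 1.0, 2.0, 4.0

# Hypothetical candidate mode sets per class (not given in the abstract).
CANDIDATES: Dict[str, List[str]] = {
    "simple": ["DIRECT", "16x16"],
    "medium": ["DIRECT", "16x16", "16x8", "8x16"],
    "complex": ["DIRECT", "16x16", "16x8", "8x16", "8x8", "INTRA"],
}


def classify(mode_complexity: float) -> str:
    """Map a mode-complexity value to direct/simple/medium/complex."""
    if mode_complexity < T_DIRECT:
        return "direct"
    if mode_complexity < T_SIMPLE:
        return "simple"
    if mode_complexity < T_MEDIUM:
        return "medium"
    return "complex"


def plan_search(mode_complexity: float,
                rd_cost_16x16: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Return (modes to check, prediction directions to search) for one MB.

    rd_cost_16x16 holds the RD cost of mode 16x16 under the keys
    "temporal" and "disparity".
    """
    cls = classify(mode_complexity)
    if cls == "direct":
        # Early termination: accept direct mode, no ME/DE search at all.
        return ["DIRECT"], []
    directions = ["temporal", "disparity"]
    # Assumed rule: skip disparity estimation when the temporal 16x16
    # cost is not worse than the disparity 16x16 cost.
    if cls in ("simple", "medium") and \
            rd_cost_16x16["temporal"] <= rd_cost_16x16["disparity"]:
        directions = ["temporal"]
    return CANDIDATES[cls], directions
```

For example, `plan_search(1.5, {"temporal": 10.0, "disparity": 12.0})` restricts a "simple" MB to two modes and to temporal prediction only, whereas a "complex" MB keeps the full candidate list and both directions.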
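Abstract: High efficiency video coding (HEVC) reduces the bit rate by about 50% compared with the previous-generation standard H.264. However, the 35 prediction modes that HEVC introduces to improve intra prediction accuracy greatly increase the amount of computation, which poses a challenge for both software and hardware implementations. To address this problem, a fast algorithm built on HEVC is proposed that determines the intra prediction mode from the texture direction of the picture, combined with the correlation between prediction modes. Experimental results show that, compared with the HEVC reference software HM16.20, the algorithm saves more than 46% of the encoding time at a BD-Rate loss of only 5.79%. It significantly reduces the complexity of intra prediction mode decision and makes the algorithm easier to deploy on resource-constrained edge devices such as embedded systems.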
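A rough Python sketch of texture-direction-based pruning of the 35 HEVC intra modes follows. The abstract does not describe how texture direction is measured or how mode correlation is used, so the gradient-energy measure, the `ratio` and `spread` parameters and the "neighbourhood of the dominant angular mode" rule are assumptions made for illustration. The mode numbers themselves (0 planar, 1 DC, 10 horizontal, 26 vertical, angular modes 2 to 34) follow the HEVC standard.

```python
from typing import List

PLANAR, DC, HORIZONTAL, VERTICAL = 0, 1, 10, 26  # HEVC intra mode indices


def gradient_energy(block: List[List[int]]):
    """Sum of absolute pixel differences along x (gx) and along y (gy)."""
    gx = sum(abs(row[x + 1] - row[x])
             for row in block for x in range(len(row) - 1))
    gy = sum(abs(block[y + 1][x] - block[y][x])
             for y in range(len(block) - 1) for x in range(len(block[0])))
    return gx, gy


def candidate_modes(block: List[List[int]],
                    ratio: float = 2.0, spread: int = 4) -> List[int]:
    """Return the intra modes worth testing for this block.

    Strong variation along x means the texture runs vertically, so modes
    near VERTICAL are kept; strong variation along y keeps modes near
    HORIZONTAL. Without a dominant direction all 35 modes are kept.
    """
    gx, gy = gradient_energy(block)
    if gx > ratio * gy:
        center = VERTICAL
    elif gy > ratio * gx:
        center = HORIZONTAL
    else:
        return list(range(35))
    return [PLANAR, DC] + list(range(center - spread, center + spread + 1))
```

With the default parameters, a block with a clear dominant direction is tested against 11 modes instead of 35; how much of the reported time saving such a rule would account for is not something the abstract states.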
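Abstract: In recent years, with the wide application of computer vision in fields such as intelligent surveillance and autonomous driving, more and more video is not only watched by humans but also analyzed directly and automatically by machine vision algorithms. How to store and transmit such video efficiently for machine vision has become a new challenge. However, existing video coding standards, such as the latest Versatile Video Coding (VVC/H.266), are optimized mainly for the characteristics of human vision and do not fully consider the impact of compression on the performance of machine vision tasks. To address this problem, this paper takes multi-object tracking as a typical machine vision video processing task and proposes a machine-vision-oriented VVC intra coding algorithm. First, a neural network interpretability method, Gradient-weighted Class Activation Mapping (GradCAM++), is used to perform saliency analysis on the video content, locating the regions that the machine vision task attends to and representing them as a saliency map. Then, to highlight the key edge and contour information in the picture, edge detection is introduced and its result is fused with the saliency analysis result to obtain the final machine vision saliency map. Finally, the VVC mode selection process is improved on the basis of the fused machine vision saliency map, optimizing the block partitioning and intra prediction mode decisions in VVC. A machine vision distortion replaces the original signal distortion in the rate-distortion optimization formula, so that the encoder preserves as much task-relevant information as possible during compression. Experimental results show that, compared with the VVC baseline, the proposed method saves 12.7% of the bit rate while maintaining the same machine vision detection accuracy.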
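The idea of replacing signal distortion with a machine vision distortion in the rate-distortion cost J = D + λR can be sketched in Python as below. The abstract specifies neither the fusion rule nor the form of the machine vision distortion, so the pixel-wise maximum in `fuse` and the saliency-weighted squared error in `machine_distortion` are assumptions chosen only to make the mechanism concrete.

```python
from typing import Dict, List, Tuple

Map2D = List[List[float]]


def fuse(saliency: Map2D, edges: Map2D) -> Map2D:
    """Fuse a saliency map with an edge map (assumed rule: pixel-wise max)."""
    return [[max(s, e) for s, e in zip(s_row, e_row)]
            for s_row, e_row in zip(saliency, edges)]


def machine_distortion(orig: Map2D, recon: Map2D, weight: Map2D) -> float:
    """Assumed machine vision distortion: saliency-weighted squared error."""
    return sum(w * (o - r) ** 2
               for o_row, r_row, w_row in zip(orig, recon, weight)
               for o, r, w in zip(o_row, r_row, w_row))


def rd_cost(orig: Map2D, recon: Map2D, weight: Map2D,
            bits: float, lam: float) -> float:
    """Lagrangian cost J = D_mv + lambda * R."""
    return machine_distortion(orig, recon, weight) + lam * bits


def best_mode(orig: Map2D, candidates: Dict[str, Tuple[Map2D, float]],
              weight: Map2D, lam: float) -> str:
    """Pick the candidate (reconstruction, bits) with the lowest cost."""
    return min(candidates,
               key=lambda m: rd_cost(orig, candidates[m][0], weight,
                                     candidates[m][1], lam))
```

Under this weighting, errors in regions with zero fused saliency cost nothing, so the encoder would tend to spend bits only where the task model looks; a practical scheme would likely keep a non-zero floor on the weights.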
Funding: Supported by the National Natural Science Foundation of China (61170147), the Major Cooperation Project of Production and College in Fujian Province (2012H61010016), and the Natural Science Foundation of Fujian Province (2013J01234).
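Abstract: To address the increase in encoding complexity caused by the multi-type tree partitioning structure that versatile video coding (VVC) introduces for coding unit (CU) partitioning, a fast partitioning algorithm based on the directional characteristics of CU sub-blocks and on spatial complexity is proposed. First, the overall texture complexity of the CU is used to classify the current CU and to screen out CUs that should not be split. Then, the differences in sub-block characteristics along different partition directions are used to decide the CU partition direction early. Finally, the difference in complexity between the central region and the edge regions of the CU is used to decide whether ternary tree (TT) partitioning can be skipped, further reducing the number of partition modes in the candidate list. Experimental results show that, compared with the official test platform VTM10.0, the encoding time is reduced by 40.25% at the cost of a 1.12% increase in average output bit rate, which indicates that the algorithm achieves a shorter encoding time in VVC with a small loss in quality.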
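The three-stage pruning described in this abstract might be sketched as follows in Python. The abstract does not give the complexity measure, the sub-block features or any thresholds, so pixel variance as the complexity measure, the half-block mean difference as the directional feature, and the parameters `t_flat`, `ratio` and `t_tt` are all hypothetical. The sketch assumes a block of at least 4×4 samples.

```python
from statistics import mean, pvariance
from typing import List

Block = List[List[int]]


def flat(block: Block) -> List[int]:
    return [p for row in block for p in row]


def transpose(block: Block) -> Block:
    return [list(col) for col in zip(*block)]


def halves_diff(block: Block) -> float:
    """Absolute difference between the mean of the top and bottom halves."""
    h = len(block)
    return abs(mean(flat(block[:h // 2])) - mean(flat(block[h // 2:])))


def center_edge_gap(block: Block) -> float:
    """Variance gap between the middle half and the outer quarters (rows)."""
    h = len(block)
    q = h // 4
    center = flat(block[q:h - q])
    outer = flat(block[:q] + block[h - q:])
    return abs(pvariance(center) - pvariance(outer))


def candidate_splits(block: Block, t_flat: float = 4.0,
                     ratio: float = 2.0, t_tt: float = 1.0) -> List[str]:
    # Stage 1: a smooth CU is not split at all.
    if pvariance(flat(block)) < t_flat:
        return ["NO_SPLIT"]
    # Stage 2: pick the split direction from sub-block differences.
    diff_h = halves_diff(block)             # top vs bottom
    diff_v = halves_diff(transpose(block))  # left vs right
    if diff_h > ratio * diff_v:
        dirs = ["H"]
    elif diff_v > ratio * diff_h:
        dirs = ["V"]
    else:
        dirs = ["H", "V"]
    modes = ["NO_SPLIT", "QT"]
    for d in dirs:
        modes.append("BT_" + d)
        oriented = block if d == "H" else transpose(block)
        # Stage 3: keep TT only if centre and edges differ in complexity.
        if center_edge_gap(oriented) >= t_tt:
            modes.append("TT_" + d)
    return modes
```

Keeping `NO_SPLIT` and `QT` in the candidate list for textured CUs is a conservative choice of this sketch; the abstract does not say whether the proposed algorithm also prunes them.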
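Abstract: To address the problem that current video recompression forensics methods ignore chroma-domain information and have low forensic accuracy, a surveillance video recompression forensics method that fuses chroma-domain and luma-domain information (CLF-SVRF) is proposed for the latest versatile video coding (VVC) standard. Based on the coding principles of the VVC standard, the basic bitstream features in a VVC bitstream that are closely related to the number of compressions are analyzed and identified from the chroma-domain and luma-domain dimensions of surveillance video. The basic bitstream features comprise the partition types and prediction modes of chroma-domain and luma-domain coding units (CUs). Lagrangian rate-distortion optimization is used to analyze how the chroma-domain and luma-domain CU partition types and prediction modes change as the number of compressions increases, which further confirms that they can serve as basic bitstream features for detecting the number of compressions. Then, considering the low-complexity requirement that video surveillance applications place on recompression forensics, low-complexity high-level bitstream features are constructed from the chroma-domain and luma-domain CU partition types and prediction modes. The high-level bitstream features are fed into a support vector machine to complete recompression forensics of surveillance video. Experimental results show that, compared with current state-of-the-art methods, CLF-SVRF improves the recompression forensics accuracy for surveillance video by 13.53% on average and substantially reduces the forensics time, by 47.42% on average.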
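One simple way to turn CU partition types and prediction modes into a fixed-length feature vector is a pair of normalized histograms per colour domain, sketched below in Python. The abstract does not define its "high-level bitstream features", so the histogram construction, the label vocabularies `SPLIT_TYPES` and `PRED_MODES`, and the input format (a list of `(split_type, pred_mode)` tuples already parsed from the bitstream) are illustrative assumptions; the bitstream parsing itself is out of scope here.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical label vocabularies for parsed CU information.
SPLIT_TYPES = ["NONE", "QT", "BT_H", "BT_V", "TT_H", "TT_V"]
PRED_MODES = ["INTRA", "INTER", "SKIP"]

CU = Tuple[str, str]  # (split_type, pred_mode)


def histogram(labels: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Normalized frequency of each vocabulary entry among the labels."""
    counts = Counter(labels)
    total = sum(counts[v] for v in vocabulary) or 1  # avoid division by zero
    return [counts[v] / total for v in vocabulary]


def bitstream_features(luma_cus: Sequence[CU],
                       chroma_cus: Sequence[CU]) -> List[float]:
    """Concatenate split-type and prediction-mode histograms, luma first."""
    features: List[float] = []
    for cus in (luma_cus, chroma_cus):
        features += histogram([s for s, _ in cus], SPLIT_TYPES)
        features += histogram([m for _, m in cus], PRED_MODES)
    return features
```

The resulting 18-dimensional vectors, one per video or per frame, could then be passed to any support vector machine implementation, for instance scikit-learn's `sklearn.svm.SVC`, with the number of compressions as the class label.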