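Self-supervised monocular depth estimation has attracted wide attention from researchers in China and abroad. Existing deep-learning-based self-supervised monocular depth estimation methods mainly adopt an encoder-decoder structure. However, these methods downsample the input image during encoding, which discards part of the image information, especially boundary information, and in turn degrades the accuracy of the depth map. To address this problem, a self-supervised monocular depth estimation method based on the Laplacian pyramid is proposed (Self-supervised Monocular Depth Estimation Based on the Laplace Pyramid, LpDepth). The core ideas of the method are as follows. First, Laplacian residual maps are used to enrich the encoded features and compensate for the feature information lost during downsampling. Second, max-pooling layers are used during downsampling to highlight and amplify feature information, making it easier for the encoder to extract the features needed to train the model. Finally, residual modules are used to address overfitting and to improve how efficiently the decoder uses the features. The proposed method was tested on datasets including KITTI and Make3D and compared with existing classical methods. The experimental results demonstrate the effectiveness of the proposed method.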
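To illustrate why a Laplacian residual can restore what downsampling discards, here is a minimal NumPy sketch of a Laplacian pyramid. It is not the LpDepth implementation: the abstract does not specify the filters, so 2x2 average pooling and nearest-neighbour upsampling are assumed here as stand-ins, and the function names (`downsample`, `upsample`, `laplacian_pyramid`, `reconstruct`) are hypothetical. The image height and width are assumed to be divisible by 2 at every level.

```python
import numpy as np

def downsample(img):
    """Halve resolution with 2x2 average pooling (assumed stand-in filter)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """Return per-level residuals (fine to coarse) and the coarsest image."""
    residuals = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # The residual holds exactly the detail that downsampling removed.
        residuals.append(current - upsample(smaller))
        current = smaller
    return residuals, current

def reconstruct(residuals, coarsest):
    """Invert the pyramid by adding each residual back, coarse to fine."""
    current = coarsest
    for res in reversed(residuals):
        current = upsample(current) + res
    return current
```

Because each residual is the difference between a level and its down-then-upsampled copy, adding the residuals back recovers the original image exactly, which is the property that makes residual maps a plausible way to re-inject boundary detail into encoder features.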
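Acquiring image depth information is one of the active research topics in machine vision. The image depth estimation problem is cast as a pattern recognition problem: monocular image depths serve as the pattern classes, absolute and relative depth features are extracted from image patches at multiple scales, and the DRF (Discriminative Random Field), which captures contextual relationships, is chosen to describe the relationship between the depth of an image patch and the depths of its neighbors. On this basis, a monocular image depth estimation model based on DRF-MAP (Maximum a posteriori) is constructed. Experiments produced depth images for a class of monocular images, demonstrating the effectiveness of the improved algorithm associated with the monocular image depth estimation model.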
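The idea of choosing each patch's depth jointly with its neighbors can be sketched as an energy minimization over a patch grid. The example below is only an illustration under stated assumptions: depths are quantized into K labels, the pairwise term is a simple absolute label difference, and inference uses iterated conditional modes (ICM), a basic approximate MAP procedure that the abstract does not name. The function `icm_map` and its parameters are hypothetical.

```python
import numpy as np

def icm_map(unary, smooth_weight=1.0, iterations=5):
    """Approximate MAP labelling on a 4-connected patch grid.

    unary: (H, W, K) array, cost of giving each of K depth labels to each patch.
    smooth_weight: strength of the assumed |label_i - label_j| neighbor penalty.
    """
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)       # start from the per-patch best label
    candidates = np.arange(k)
    for _ in range(iterations):
        for i in range(h):
            for j in range(w):
                cost = unary[i, j].astype(np.float64)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        # Penalize disagreement with the neighbor's current label.
                        cost = cost + smooth_weight * np.abs(candidates - labels[ni, nj])
                labels[i, j] = cost.argmin()
    return labels
```

With `smooth_weight=0` every patch keeps its locally best label; raising the weight lets the neighborhood override an isolated, weakly supported depth, which is the contextual effect a random-field model is meant to provide.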
Real-time hand gesture recognition technology significantly improves the user experience in virtual reality/augmented reality (VR/AR) applications, and it relies on identifying the orientation of the hand in captured images or videos. A new three-stage pipeline for fast and accurate hand segmentation from a single depth image is proposed. First, a depth frame is segmented into several regions by a histogram-based threshold selection algorithm and by tracing the exterior boundaries of objects after thresholding. Second, each segmentation proposal is evaluated by a shallow three-layer convolutional neural network (CNN) to determine whether or not the boundary is associated with the hand. Finally, all hand components are merged to form the hand segmentation result. Compared with algorithms based on random decision forests (RDF), the experimental results demonstrate that the approach achieves better performance, with high accuracy (88.34% mean intersection over union, mIoU) and a shorter processing time (≤8 ms).
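The first stage, histogram-based threshold selection on a depth frame, might look like the sketch below. The abstract does not say which selection rule is used, so Otsu's between-class variance criterion is assumed here as a common choice; the function name `otsu_threshold` and the bin count are hypothetical, and boundary tracing and the CNN stage are not shown.

```python
import numpy as np

def otsu_threshold(depth, bins=256):
    """Pick a depth threshold that maximizes between-class variance (Otsu)."""
    hist, edges = np.histogram(depth, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_lo = np.cumsum(hist)                 # pixel count up to each bin
    weight_hi = weight_lo[-1] - weight_lo       # pixel count above each bin
    cum_sum = np.cumsum(hist * centers)
    valid = (weight_lo > 0) & (weight_hi > 0)   # both classes must be non-empty
    mean_lo = cum_sum[valid] / weight_lo[valid]
    mean_hi = (cum_sum[-1] - cum_sum[valid]) / weight_hi[valid]
    between = weight_lo[valid] * weight_hi[valid] * (mean_lo - mean_hi) ** 2
    best = np.flatnonzero(valid)[between.argmax()]
    return edges[best + 1]                      # upper edge of the chosen bin
```

A near-object mask then follows from `depth < otsu_threshold(depth)`; in a pipeline like the one described, each connected region of that mask would become a proposal for the classifier.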