Traditional deep-learning object detectors rely on large numbers of labeled samples, which are expensive to obtain. Few-shot object detection (FSOD) attempts to solve this problem by learning to detect objects from a few labeled samples, but performance is often unsatisfactory because samples are scarce. We believe that two factors mainly restrict the performance of few-shot detectors: (1) positive samples are scarce, and (2) the quality of positive samples is low. We therefore put forward a novel few-shot object detector based on YOLOv4 that improves both the quantity and the quality of positive samples. First, we design a hybrid multivariate positive sample augmentation (HMPSA) module to increase the quantity and diversity of positive samples while suppressing negative samples. Then, we design a selective non-local fusion attention (SNFA) module to help the detector learn target features better and improve the feature quality of positive samples. Finally, we optimize the loss function to make it more suitable for FSOD. Experimental results on PASCAL VOC and MS COCO demonstrate that the proposed few-shot object detector is competitive with other state-of-the-art detectors.
In oriented bounding box (OBB) object detection for high-resolution remote sensing images, small targets are often missed or wrongly detected because they are very small and vary in orientation. Existing OBB detection methods for remote sensing images have made good progress but focus mainly on orientation modeling and give less consideration to object size and missed detections. This study proposes a method based on an improved YOLOv8 for detecting oriented objects in remote sensing images, which improves detection precision. First, a ResCBAMG module was designed to better extract channel and spatial correlation information. Second, a top-down feature fusion layer structure was proposed in conjunction with the Efficient Channel Attention (ECA) module, which helps to capture local cross-channel interaction information. Finally, a ResCBAMG module was introduced between the C2f modules and the detection heads of the bottom-up feature fusion layer. This structure helps the model focus on the target area and improves the precision and robustness of oriented target detection. Experimental results on the DOTA-v1.5 dataset show that the improved model outperforms the original model in Precision, mAP@0.5, and mAP@0.5:0.95. The improvement is effective for small targets and complex scenes.
Many visual simultaneous localization and mapping (VSLAM) systems assume that features in the environment are static. However, moving objects can severely impair the performance of a VSLAM system that relies on this static-world assumption. To address this problem, a real-time and robust VSLAM system based on ORB-SLAM2 is proposed for dynamic environments. To reduce the influence of dynamic content, we incorporate a deep-learning-based object detection method into the visual odometry, and then add a dynamic object probability model to raise the efficiency of the object detection network and enhance the real-time performance of the system. Experiments on the TUM and KITTI benchmark datasets, as well as in a real-world environment, show that our method significantly reduces tracking error or drift and enhances the robustness, accuracy, and stability of the VSLAM system in dynamic scenes.
Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between detection rate and accuracy. A fast infrared small target detection method tailored to resource-constrained conditions is proposed for the YOLOv5s model. The method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with the Normalized Wasserstein Distance (NWD), taking into account both the detection accuracy and the detection speed for infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU and a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
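To illustrate why a distance such as NWD can suit very small targets better than IoU, the following is a minimal sketch, not the paper's implementation. It assumes the commonly cited formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the Wasserstein distance between the two Gaussians is mapped to (0, 1] by an exponential. The function names and the normalizing constant `c=12.8` are illustrative assumptions; in practice the constant depends on the dataset.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). The constant c is an assumed, dataset-dependent
    normalizer, not a value taken from the abstract.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


# Two 4x4 boxes whose centers are 6 pixels apart do not overlap at all.
box_a = (10.0, 10.0, 4.0, 4.0)
box_b = (16.0, 10.0, 4.0, 4.0)
print(iou(box_a, box_b), nwd(box_a, box_b))
```

For the two non-overlapping boxes, IoU is zero and carries no information about how far apart they are, whereas NWD still varies smoothly with the offset. That is the property that makes it attractive for targets only a few pixels wide.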
Pine wood nematode infection is a devastating disease. Unmanned aerial vehicle (UAV) remote sensing enables timely and precise monitoring. However, UAV aerial images are challenged by small target size and complex surface backgrounds, which hinder their effectiveness in monitoring. To address these challenges, based on the analysis and optimization of UAV remote sensing images, this study developed a spatio-temporal multi-scale fusion algorithm for disease detection. A multi-head self-attention mechanism is incorporated to address the excessive features generated by complex surface backgrounds in UAV images. This enables adaptive feature control that suppresses redundant information and boosts the model's feature extraction capabilities. The SPD-Conv module was introduced to address the loss of small target feature information during feature extraction, enhancing the preservation of key features. Additionally, the gather-and-distribute mechanism was implemented to strengthen the model's multi-scale feature fusion capacity, preventing the loss of local details during fusion and enriching small target feature information. This study established a dataset of pine wood nematode disease in the Huangshan area using DJI (DJ-Innovations) UAVs. The results show that the accuracy of the proposed model with spatio-temporal multi-scale fusion reached 78.5%, 6.6% higher than that of the benchmark model. Building on the timeliness and flexibility of UAV remote sensing, the proposed model effectively addressed the challenges of detecting small and medium-size targets in complex backgrounds, thereby enhancing detection efficiency for pine wood nematode disease. This facilitates early preemptive preservation of diseased trees, improves overall monitoring of pine wood nematode disease, and provides technical support for effective monitoring.
Correct identification of traffic signs plays an important role in autonomous driving and road safety. To address the misdetections and omissions in traffic sign detection caused by the variety of sign types, significant size differences, and complex background information, this study proposes an improved traffic sign detection model based on RT-DETR. First, the HiLo attention mechanism was added to the Attention-based Intra-scale Feature Interaction module, which further enhanced the feature extraction capability of the network and improved detection efficiency on high-resolution images. Second, the CAFMFusion feature fusion mechanism was designed, which enabled the network to attend to features in different regions of each channel. On this basis, the model could better capture long-range dependencies and neighborhood feature correlations, improving its feature fusion capability. Finally, MPDIoU was used as the loss function of the improved model to achieve faster convergence and more accurate regression. Experimental results on the TT100k-2021 traffic sign dataset show that the improved model achieves a precision of 90.2%, a recall of 88.1%, and an mAP@0.5 of 91.6%, which are 4.6%, 5.8%, and 4.4% better than the original RT-DETR model, respectively. The model effectively alleviates poor traffic sign detection and has considerable practical value.
Almost all existing deep learning methods rely on a large amount of annotated data, so they are unsuitable for forest fire smoke detection with limited data. In this paper, a novel hybrid attention-based few-shot learning method, named Attention-Based Prototypical Network, is proposed for forest fire smoke detection. Specifically, the feature extraction network, which incorporates a convolutional block attention module, extracts high-level, discriminative features and further decreases the false alarm rate caused by suspected smoke areas. Moreover, we design a meta-learning module to alleviate the overfitting caused by limited smoke images; the meta-learning network achieves effective detection by comparing the distance between the class prototypes of support images and the features of query images. A series of experiments on forest fire smoke datasets and the miniImageNet dataset show that the proposed method is superior to state-of-the-art few-shot learning approaches.
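The prototype comparison described above can be sketched in a few lines. This is a minimal illustration of the general prototypical-network decision rule, assuming that embeddings have already been produced by some feature extractor. It uses Euclidean distance and toy 2D vectors, and the function names and labels are hypothetical rather than taken from the paper.

```python
import math


def prototype(embeddings):
    """Class prototype: the element-wise mean of the support embeddings."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def classify(query, support):
    """Return the label whose prototype is closest to the query embedding.

    `support` maps each class label to a list of embedding vectors.
    """
    prototypes = {label: prototype(vecs) for label, vecs in support.items()}
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2D embeddings standing in for extracted features (illustrative values).
support_set = {
    "smoke": [[1.0, 1.0], [3.0, 1.0]],
    "background": [[-2.0, 0.0], [-2.0, 2.0]],
}
print(classify([1.5, 0.5], support_set))
```

Because each class is summarized by a single mean vector, only a handful of support images per class are needed at test time, which is the reason this family of methods is used when labeled data are limited.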
Object detectors can precisely detect camouflaged objects beyond human perception. Investigations reveal that detectors based on convolutional neural networks (CNNs) are vulnerable to adversarial attacks. Some works fool detectors by crafting adversarial camouflage attached to the object, leading to wrong predictions. Existing adversarial camouflage is hard to use in military operations because of its conspicuous appearance. Motivated by this, this paper proposes Dual Attribute Adversarial Camouflage (DAAC) for evading detection by both detectors and humans. Generating DAAC involves two steps: (1) extracting features from a specific type of scene to generate individual soldier digital camouflage; (2) attaching an adversarial patch, constrained by scene features, to the individual soldier digital camouflage to generate the adversarial attribute of DAAC. The visual effects of the individual soldier digital camouflage and the adversarial patch are improved after integration with the scene features. Experimental results show that objects camouflaged by DAAC blend well with the background and achieve visual concealment while remaining effective in fooling object detectors, thus evading detection by both detectors and humans in the digital domain. This work can serve as a reference for crafting adversarial camouflage in the physical world.
Inspired by the eagle's visual system, this paper proposes an eagle-vision-based object detection method for unmanned aerial vehicle (UAV) formations in hazy weather. To restore the hazy image, the values of atmospheric light and transmission are estimated on the basis of the signal processing mechanism of the ON and OFF channels in the eagle's retina. Local features of the dehazed image are calculated according to the color antagonism mechanism and contrast sensitivity function of the eagle's visual system. A center-surround operation is performed to simulate the response of the receptive field. The final saliency map is generated by the Random Forest algorithm. Experimental results verify that the proposed method is capable of detecting UAVs in hazy images and outperforms traditional methods.
The airport apron scene contains rich contextual information about spatial position relationships. Traditional object detectors consider only visual appearance and ignore this contextual information. In addition, detection accuracy for some categories in the apron dataset is low. Therefore, an improved object detection method using spatial-aware features in apron scenes, called SA-FRCNN, is presented. The method uses graph convolutional networks to capture the relative spatial relationships between objects in the apron scene, incorporating this spatial context into feature learning. Moreover, an attention mechanism is introduced into the feature extraction process to focus on spatial position and key features, and distance-IoU loss is used to achieve more accurate regression. Experimental results show that the mean average precision of apron object detection based on SA-FRCNN reaches 95.75%, and detection of some hard-to-detect categories is significantly improved. The proposed method effectively improves detection accuracy on the apron dataset and holds a leading advantage over other methods.
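As a reference for the distance-IoU loss mentioned above, here is a minimal sketch of its standard definition. It is not code from the paper. It assumes boxes in (x1, y1, x2, y2) corner format and computes 1 - IoU plus the squared distance between box centers divided by the squared diagonal of the smallest enclosing box. The function name is illustrative.

```python
def diou_loss(a, b):
    """Distance-IoU loss for two boxes in (x1, y1, x2, y2) format."""
    # Intersection and union areas.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    center_dist_sq = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 + (
        (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    ) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    diagonal_sq = enclose_w ** 2 + enclose_h ** 2

    penalty = center_dist_sq / diagonal_sq if diagonal_sq > 0 else 0.0
    return 1.0 - iou + penalty


print(diou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Unlike a plain 1 - IoU loss, the center-distance term still produces a useful gradient when the predicted and ground-truth boxes do not overlap, which is the usual argument for more accurate regression.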
A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve this problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is dynamically adjusted according to the scene change measured from the frame difference. Experiments show that the proposed method with an adaptive GMM learning rate gives good results compared with the GMM method with a fixed learning rate. The method was tested on a certain dataset, and tests under sudden natural light changes show that it has better accuracy and a lower false alarm rate.
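The abstract does not state how the frame difference is mapped to a learning rate, so the following is only a sketch under stated assumptions. It uses the mean absolute gray-level difference between consecutive frames and a linear interpolation between a slow and a fast rate, and it shows the effect on a single running background mean rather than a full per-pixel mixture. All function names and the `base`, `peak`, and `scale` values are hypothetical.

```python
def frame_difference(prev_frame, curr_frame):
    """Mean absolute gray-level difference between two equally sized frames."""
    total = 0.0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += abs(c - p)
            count += 1
    return total / count


def dynamic_learning_rate(diff, base=0.005, peak=0.5, scale=50.0):
    """Assumed rule: grow linearly from `base` to `peak` as `diff` nears `scale`."""
    t = min(1.0, diff / scale)
    return base + (peak - base) * t


def update_mean(mean, pixel, alpha):
    """Running-average update of one Gaussian's mean with learning rate alpha."""
    return (1.0 - alpha) * mean + alpha * pixel


# A sudden global brightening of 100 gray levels between two tiny frames.
dark = [[10, 10], [10, 10]]
bright = [[110, 110], [110, 110]]
alpha = dynamic_learning_rate(frame_difference(dark, bright))
print(alpha, update_mean(10.0, 110.0, alpha))
```

With a fixed small rate the background model would need many frames to absorb such a jump, and most pixels would be flagged as foreground in the meantime. Raising the rate when the whole frame changes shortens that period.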
Region completeness in object detection is crucial to video surveillance tasks such as pedestrian and vehicle identification. However, many conventional object detection approaches cannot guarantee object region completeness because detection can be influenced by illumination variations and cluttered backgrounds. To overcome this problem, we propose the iterative superpixels grouping (ISPG) method to extract precise object boundaries and generate object regions with high completeness after object detection. First, by extending the superpixel segmentation method, the proposed ISPG method alleviates inaccurate segmentation and guarantees region completeness for the object regions. Second, a multi-resolution superpixel-based region completeness enhancement method is proposed to extract the object region with high precision and completeness. Simulation results show that the proposed method outperforms conventional object detection methods in terms of object completeness evaluation.
Characterizing the integrity and fineness of non-connected regions and contours is a major challenge for existing salient object detection. The key to addressing it is how to make full use of the subjective and objective structural information obtained in different steps. Therefore, by simulating the human visual mechanism, this paper proposes a novel multi-decoder matching correction network and a subjective structural loss. Specifically, the loss pays different amounts of attention to the foreground, boundary, and background of the ground truth map in a top-down structure. The perceived saliency is mapped to the corresponding objective structure of the prediction map, which is extracted in a bottom-up manner. Thus, multi-level salient features can be effectively detected with the loss as a constraint. Then, through the mapping of an improved binary cross entropy loss, the differences between salient regions and objects are checked so that attention is paid to error-prone regions, achieving excellent error sensitivity. Finally, by tracking the identifying features horizontally and vertically, the subjective and objective interaction is maximized. Extensive experiments on five benchmark datasets demonstrate that, compared with 12 state-of-the-art methods, the algorithm has higher recall and precision, lower error, and strong robustness and generalization ability, and can predict complete and refined saliency maps.
With the increasing number of vehicles, manual security inspections at road checkpoints are becoming more laborious. To address this, a specialized Road Checkpoints Robot (RCRo) system is proposed that incorporates an enhanced You Only Look Once (YOLO) detector and a 6-degree-of-freedom (DOF) manipulator for autonomous identity verification and vehicle inspection. The modified YOLO, named "LF-YOLO", is characterized by sensitivity to large objects and faster detection speed. Both are achieved by a Dense-module-based backbone network connected to a two-scale detection network, along with optimized anchor boxes and an improved loss function. During manipulator motion, an Octree-aided motion control scheme is adopted for collision-free motion through the Robot Operating System (ROS). The proposed LF-YOLO, which uses a continuous optimization strategy and a residual technique, provides a promising detector design. In experiments, it decreased average detection time by 68.25% and 60.60% and increased average Intersection over Union (IoU) by 20.74% and 6.79% compared with YOLOv3 and YOLOv4, respectively. Comprehensive functional tests of the RCRo system demonstrate the feasibility and competency of the multiple unmanned inspections in practice.
In this paper, a novel one-stage object detector called BPMFPN Det, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), is proposed for real-time detection of multi-scale ground targets by swarm unmanned aerial vehicles (UAVs). First, bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid, enhancing the feature expression ability of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, it achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of multi-scale ground targets in aerial images from swarm UAVs.
In recent years, the number of incidents involving unmanned aerial vehicles (UAVs) has increased conspicuously, resulting in an increasingly urgent demand for anti-UAV systems. This places high demands on detection accuracy for low-altitude UAVs. Deep-learning-based UAV detection methods have great potential for low-altitude UAV detection. However, such methods need high-quality datasets to cope with the high false alarm rate (FAR) and high missing alarm rate (MAR) in low-altitude UAV detection, and a dedicated high-quality low-altitude UAV detection dataset is still lacking. A handful of known UAV detection datasets have been denied authorization for use by their proposers and are of poor quality. In this paper, a comprehensive enhanced dataset containing UAVs and jamming objects is proposed. A large number of high-definition UAV images are obtained through real-world shooting, web crawling, and data enhancement. Moreover, to cope with the challenge of low-altitude UAV detection against complex backgrounds and at long distance, as well as the confusion caused by jamming objects, noise with jamming characteristics is added to the dataset. Finally, four mainstream deep learning models are trained, validated, and tested on the dataset. The results indicate that data enhancement, together with added noise containing jamming objects and images of UAVs against complex backgrounds and at long distance, can significantly improve the accuracy of UAV detection. This work will promote the development of anti-UAV systems and provides more convincing evaluation criteria for optimizing UAV detection models.
This research presents an algorithm for face detection in color images using three main components: skin color characteristics, hair color characteristics, and a decision structure that converts the information obtained from skin and hair regions into labels for identifying object dependencies and rejecting many incorrect decisions. We use face color characteristics that are robust to face rotations and expressions. The algorithm can also be combined with other face recognition methods at each stage to improve detection.
Unexploded ordnance (UXO) poses a threat to soldiers operating in mission areas, but current UXO detection systems do not necessarily provide the safety and efficiency required to protect soldiers from this hazard. Recent technological advancements in artificial intelligence (AI) and small unmanned aerial systems (sUAS) present an opportunity to explore a novel concept for UXO detection. The new UXO detection system proposed in this study employs an AI-trained multi-spectral (MS) sensor on sUAS. This paper explores the feasibility of AI-based UXO detection using sUAS equipped with a single (visible) spectrum (SS) or MS digital electro-optical (EO) sensor. Specifically, it describes the design of the deep learning convolutional neural network for UXO detection and the development of an AI-based algorithm for reliable UXO detection, and compares the performance of the proposed system on SS and MS sensor imagery.
Electricity plays a vital role in daily life and economic development. The status of the indicator lights in a power plant needs to be checked regularly to ensure a normal supply of electricity. To address the large amount of data and the varying sizes of targets in indicator light detection, we propose an improved You Only Look Once version 5 (YOLOv5) algorithm for power plant indicator light detection. The algorithm improves feature extraction on the basis of YOLOv5s. First, it enhances the network's ability to perceive small objects by combining attention modules for multi-scale feature extraction. Second, we adjust the loss function to stabilize the object box during regression and improve convergence accuracy. Finally, transfer learning is used to augment the dataset to improve the robustness of the algorithm. Experimental results show that the average accuracy of the proposed squeeze-and-excitation YOLOv5s (SE-YOLOv5s) algorithm is increased by 4.39% to 95.31% compared with the YOLOv5s algorithm. The proposed algorithm can better meet the engineering needs of power plant indicator light detection.
Light detection and ranging (LiDAR) sensors play a vital role in acquiring 3D point cloud data and extracting valuable information about objects for tasks such as autonomous driving, robotics, and virtual reality (VR). However, the sparse and disordered nature of 3D point clouds poses significant challenges to feature extraction, and overcoming these limitations is critical for 3D point cloud processing. 3D point cloud object detection is a challenging and crucial task in which point cloud processing and feature extraction methods have a significant impact on subsequent detection performance. In this overview of outstanding work on object detection from 3D point clouds, we focus on summarizing the methods employed in 3D point cloud processing. We introduce the way point clouds are processed in classical 3D object detection algorithms, along with the improvements made to solve the problems in point cloud processing. Different voxelization methods and point cloud sampling strategies influence the extracted features and thereby the final detection performance.
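To make the role of voxelization concrete, the following is a minimal sketch of the simplest variant: a uniform grid in which each point is assigned to a cell by flooring its coordinates, and each occupied cell is summarized by its centroid. It is a generic illustration, not a method from any specific surveyed paper. The function names and the sample points are assumptions.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y, z) points by the integer index of the grid cell they fall in."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_centroids(voxels):
    """Summarize each occupied voxel by the mean of the points inside it."""
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in voxels.items()
    }


cloud = [(0.2, 0.4, 0.6), (0.8, 0.6, 0.4), (2.4, 0.2, 0.0), (-0.2, 0.0, 0.0)]
grid = voxelize(cloud, voxel_size=1.0)
print(sorted(grid))
print(voxel_centroids(grid))
```

The voxel size controls the trade-off noted in the abstract. Larger cells give a more regular and cheaper representation but merge nearby points, so fine geometric detail is lost before any features are extracted.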
Funding: Supported by the China National Key Research and Development Program (Grant No. 2016YFC0802904), the National Natural Science Foundation of China (Grant No. 61671470), and the 62nd batch of funded projects of the China Postdoctoral Science Foundation (Grant No. 2017M623423), which provided funds for conducting experiments.
Funding: Supported by the National Natural Science Foundation of China (No. 61671470).
文摘A great number of visual simultaneous localization and mapping(VSLAM)systems need to assume static features in the environment.However,moving objects can vastly impair the performance of a VSLAM system which relies on the static-world assumption.To cope with this challenging topic,a real-time and robust VSLAM system based on ORB-SLAM2 for dynamic environments was proposed.To reduce the influence of dynamic content,we incorporate the deep-learning-based object detection method in the visual odometry,then the dynamic object probability model is added to raise the efficiency of object detection deep neural network and enhance the real-time performance of our system.Experiment with both on the TUM and KITTI benchmark dataset,as well as in a real-world environment,the results clarify that our method can significantly reduce the tracking error or drift,enhance the robustness,accuracy and stability of the VSLAM system in dynamic scenes.
Abstract: Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between detection rate and accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed based on the YOLOv5s model. The method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with the Normalized Wasserstein Distance (NWD), balancing detection accuracy and detection speed for infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
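To make the IoU-to-NWD substitution concrete, here is a minimal sketch of the Normalized Wasserstein Distance as it is commonly defined for tiny-object detection: each box is modeled as a 2D Gaussian and the similarity decays exponentially with the Wasserstein distance between the two Gaussians. The function name `nwd`, the `(cx, cy, w, h)` box layout, and the default value of the constant `c` are assumptions for illustration; the abstract does not state how the authors configure it.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a 2D Gaussian with mean (cx, cy) and standard
    deviations (w / 2, h / 2). `c` is a dataset-dependent normalizing
    constant; the default here is an assumed placeholder.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.0.
    return math.exp(-math.sqrt(w2_squared) / c)
```

Unlike IoU, this similarity stays positive and smooth when two small boxes do not overlap at all, which is the usual motivation for using it on tiny targets.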
Funding: funded by the National Natural Science Foundation of China (32271865), the Fundamental Research Funds for Central Universities (2572023CT16), and the Fundamental Research Funds for Natural Science Foundation of Heilongjiang for Distinguished Young Scientists (JQ2023F002).
Abstract: Pine wood nematode infection is a devastating disease. Unmanned aerial vehicle (UAV) remote sensing enables timely and precise monitoring. However, UAV aerial images are challenged by small target size and complex surface backgrounds, which hinder their effectiveness in monitoring. To address these challenges, based on the analysis and optimization of UAV remote sensing images, this study developed a spatio-temporal multi-scale fusion algorithm for disease detection. A multi-head self-attention mechanism is incorporated to address the excessive features generated by complex surface backgrounds in UAV images. This enables adaptive feature control to suppress redundant information and boost the model's feature extraction capabilities. The SPD-Conv module was introduced to address the loss of small target feature information during feature extraction, enhancing the preservation of key features. Additionally, the gather-and-distribute mechanism was implemented to augment the model's multi-scale feature fusion capacity, preventing the loss of local details during fusion and enriching small target feature information. This study established a dataset of pine wood nematode disease in the Huangshan area using DJI (DJ-Innovations) UAVs. The results show that the accuracy of the proposed model with spatio-temporal multi-scale fusion reached 78.5%, 6.6% higher than that of the benchmark model. Building upon the timeliness and flexibility of UAV remote sensing, the proposed model effectively addressed the challenges of detecting small and medium-size targets in complex backgrounds, thereby enhancing the detection efficiency for pine wood nematode disease. This facilitates early preemptive preservation of diseased trees, augments the overall monitoring proficiency for pine wood nematode disease, and supplies technical support for proficient monitoring.
Abstract: The correct identification of traffic signs plays an important role in autonomous driving technology and road safety. To address misdetection and omission in traffic sign detection caused by the variety of sign types, significant size differences, and complex background information, this study proposes an improved traffic sign detection model based on RT-DETR. First, the HiLo attention mechanism was added to the Attention-based Intra-scale Feature Interaction, which further enhanced the feature extraction capability of the network and improved detection efficiency on high-resolution images. Second, the CAFMFusion feature fusion mechanism was designed, enabling the network to attend to features in different regions of each channel. On this basis, the model could better capture long-range dependencies and neighborhood feature correlations, improving its feature fusion capability. Finally, MPDIoU was used as the loss function of the improved model to achieve faster convergence and more accurate regression results. Experimental results on the TT100k-2021 traffic sign dataset show that the improved model achieves a precision of 90.2%, a recall of 88.1%, and an mAP@0.5 of 91.6%, which are 4.6%, 5.8%, and 4.4% better than the original RT-DETR model, respectively. The model effectively alleviates poor traffic sign detection and has practical value.
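As an illustration of the MPDIoU loss mentioned above, the sketch below follows the commonly cited formulation: IoU is penalized by the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. The function names and the `(x1, y1, x2, y2)` box layout are assumptions for this example, not details taken from the abstract.

```python
def box_iou(a, b):
    """Plain IoU of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU for a predicted and a ground-truth box."""
    # Squared distance between the top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    # Both penalties are normalized by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = box_iou(pred, target) - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou
```

Because the corner distances are nonzero whenever the boxes differ, the loss still gives a useful signal when two boxes share the same IoU but differ in position or shape.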
Funding: The work was supported by the National Key R&D Program of China (Grant No. 2020YFC1511601) and the Fundamental Research Funds for the Central Universities (Grant No. 2019SHFWLC01).
Abstract: Almost all existing deep learning methods rely on a large amount of annotated data, so they are ill-suited to forest fire smoke detection with limited data. In this paper, a novel hybrid attention-based few-shot learning method, named Attention-Based Prototypical Network, is proposed for forest fire smoke detection. Specifically, the feature extraction network, which incorporates the convolutional block attention module, extracts high-level, discriminative features and further decreases the false alarm rate caused by suspected smoke areas. Moreover, we design a meta-learning module to alleviate the overfitting caused by limited smoke images; the meta-learning network achieves effective detection by comparing the distance between the class prototypes of the support images and the features of the query images. A series of experiments on forest fire smoke datasets and the miniImageNet dataset shows that the proposed method is superior to state-of-the-art few-shot learning approaches.
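The prototype-distance comparison described above can be sketched in a few lines. This is the generic prototypical-network classification rule, not the authors' implementation: each class prototype is the mean of its support embeddings, and a query takes the label of the nearest prototype. The function names, the Euclidean metric, and the toy 2-D embeddings in the usage note are assumptions; in practice the embeddings would come from the attention-based feature extractor.

```python
import math


def class_prototype(embeddings):
    """Mean of the support embeddings of one class."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def classify_query(query, support):
    """Return the label whose prototype is closest to the query embedding.

    `support` maps each class label to a list of embedding vectors.
    """
    prototypes = {label: class_prototype(vecs) for label, vecs in support.items()}
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))
```

For example, with `support = {"smoke": [[1, 1], [3, 1]], "background": [[-2, 0], [-4, 0]]}`, the call `classify_query([1.5, 0.5], support)` selects `"smoke"` because the query lies closer to the smoke prototype `[2, 1]` than to the background prototype `[-3, 0]`.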
Funding: The National Natural Science Foundation of China (grant numbers 61801512 and 62071484) and the Natural Science Foundation of Jiangsu Province (grant number BK20180080) provided funds for conducting the experiments.
Abstract: Object detectors can precisely detect camouflaged objects that are beyond human perception. Investigations reveal that CNN-based (convolutional neural network) detectors are vulnerable to adversarial attacks. Some works fool detectors by crafting adversarial camouflage attached to the object, leading to wrong predictions. Existing adversarial camouflage is hard to use in military operations because of its conspicuous appearance. Motivated by this, this paper proposes Dual Attribute Adversarial Camouflage (DAAC) for evading detection by both detectors and humans. Generating DAAC includes two steps: (1) extracting features from a specific type of scene to generate individual soldier digital camouflage; (2) attaching an adversarial patch, constrained by the scene features, to the individual soldier digital camouflage to generate the adversarial attribute of DAAC. The visual effects of the individual soldier digital camouflage and the adversarial patch are improved after integration with the scene features. Experimental results show that objects camouflaged by DAAC blend well with the background and achieve visual concealment while remaining effective in fooling object detectors, thus evading detection by both detectors and humans in the digital domain. This work can serve as a reference for crafting adversarial camouflage in the physical world.
Funding: the Science and Technology Innovation 2030-Key Projects (Nos. 2018AAA0102303, 2018AAA0102403), the Aeronautical Science Foundation of China (No. 20175851033), and the National Natural Science Foundation of China (Nos. U1913602, U19B2033, 91648205, 61803011).
Abstract: Inspired by the eagle's visual system, an eagle-vision-based object detection method for unmanned aerial vehicle (UAV) formations in hazy weather is proposed in this paper. To restore the hazy image, the values of atmospheric light and transmission are estimated on the basis of the signal processing mechanism of the ON and OFF channels in the eagle's retina. Local features of the dehazed image are calculated according to the color antagonism mechanism and contrast sensitivity function of the eagle's visual system. A center-surround operation is performed to simulate the response of the receptive field. The final saliency map is generated by the Random Forest algorithm. Experimental results verify that the proposed method is capable of detecting UAVs in hazy images and outperforms traditional methods.
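Once atmospheric light and transmission have been estimated, restoring the image usually amounts to inverting the standard atmospheric scattering model I = J·t + A·(1 − t). The sketch below shows only that inversion step under the assumption that the paper uses this standard model; it does not reproduce the eagle-retina-based estimation itself. The function name, the flat list of grayscale intensities in [0, 1], and the lower bound `t_min` are illustrative choices.

```python
def recover_scene_radiance(hazy, transmission, atmospheric_light, t_min=0.1):
    """Invert I = J * t + A * (1 - t) for each pixel.

    `hazy` and `transmission` are equal-length sequences of per-pixel values
    in [0, 1]; `atmospheric_light` is a single scalar A.
    """
    restored = []
    for intensity, t in zip(hazy, transmission):
        # Clamp the transmission so that noise is not amplified where t is near 0.
        t = max(t, t_min)
        radiance = (intensity - atmospheric_light) / t + atmospheric_light
        # Keep the result inside the valid intensity range.
        restored.append(min(1.0, max(0.0, radiance)))
    return restored
```

For instance, a pixel with true radiance 0.2 observed through transmission 0.5 under atmospheric light 0.9 appears as 0.55, and the function maps 0.55 back to 0.2.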
Funding: supported by the Fundamental Research Funds for Central Universities of the Civil Aviation University of China (No. 3122021088).
Abstract: The airport apron scene contains rich contextual information about spatial position relationships. Traditional object detectors consider only visual appearance and ignore this contextual information. In addition, the detection accuracy for some categories in the apron dataset is low. Therefore, an improved object detection method for apron scenes that uses spatial-aware features, called SA-FRCNN, is presented. The method uses graph convolutional networks to capture the relative spatial relationships between objects in the apron scene, incorporating this spatial context into feature learning. Moreover, an attention mechanism is introduced into the feature extraction process to focus on spatial position and key features, and the distance-IoU loss is used to achieve more accurate regression. The experimental results show that the mean average precision of apron object detection based on SA-FRCNN reaches 95.75%, and detection of some hard-to-detect categories is significantly improved. The proposed method effectively improves detection accuracy on the apron dataset and holds a leading advantage over other methods.
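For reference, the distance-IoU (DIoU) loss used for regression can be sketched as follows. This follows the standard definition, 1 − IoU + ρ²/c², where ρ is the distance between the box centers and c is the diagonal of the smallest box enclosing both. The function name and the `(x1, y1, x2, y2)` box layout are assumptions for this example rather than details of SA-FRCNN.

```python
def diou_loss(pred, target):
    """Distance-IoU loss for two (x1, y1, x2, y2) boxes."""
    # Intersection over union.
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    center_dist_sq = (
        ((pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2) ** 2
        + ((pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2) ** 2
    )
    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(pred[2], target[2]) - min(pred[0], target[0])
    enclose_h = max(pred[3], target[3]) - min(pred[1], target[1])
    diagonal_sq = enclose_w ** 2 + enclose_h ** 2

    penalty = center_dist_sq / diagonal_sq if diagonal_sq > 0 else 0.0
    return 1.0 - iou + penalty
```

The center-distance penalty keeps the gradient informative even when the predicted and ground-truth boxes do not overlap, which is why DIoU tends to regress faster than plain IoU loss.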
Abstract: A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of the GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve this problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is dynamically adjusted according to the scene change measured from the frame difference. Experiments show that the proposed method, with its adaptive GMM learning rate, gives good results compared with a GMM using a fixed learning rate. The method was tested on a certain dataset, and tests under sudden natural light changes show that it achieves better accuracy and a lower false alarm rate.
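The idea of tying the learning rate to the frame difference can be illustrated with a small sketch. The abstract does not give the actual mapping, so everything below is a hypothetical choice: the fraction of changed pixels is used as the scene-change measure, the rate is interpolated linearly between `base_rate` and `max_rate`, and only the running-mean update of a single Gaussian component is shown rather than a full mixture.

```python
def changed_fraction(prev_frame, curr_frame, diff_threshold=25):
    """Fraction of pixels whose intensity changed by more than the threshold."""
    changed = sum(
        1 for before, after in zip(prev_frame, curr_frame)
        if abs(after - before) > diff_threshold
    )
    return changed / len(curr_frame)


def dynamic_learning_rate(fraction, base_rate=0.005, max_rate=0.2, scene_change=0.5):
    """Raise the learning rate as more of the frame changes (assumed linear rule)."""
    if fraction >= scene_change:
        # A global change such as a sudden illumination shift: adapt quickly.
        return max_rate
    return base_rate + (max_rate - base_rate) * fraction / scene_change


def update_mean(mean, pixel, rate):
    """Running-mean update of one Gaussian component for one pixel."""
    return (1.0 - rate) * mean + rate * pixel
```

With a fixed small rate, a background mean of 100 would need many frames to reach a new illumination level of 200; with the rate raised to 0.2 after a global change, a single update already moves it to 120.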
Funding: supported in part by the “MOST” under Grant No. 103-2221-E-216-012.
Abstract: Region completeness in object detection is crucial to video surveillance tasks such as pedestrian and vehicle identification. However, many conventional object detection approaches cannot guarantee object region completeness because detection can be affected by illumination variations and cluttered backgrounds. To overcome this problem, we propose the iterative superpixels grouping (ISPG) method to extract a precise object boundary and generate an object region with high completeness after object detection. First, by extending the superpixel segmentation method, the proposed ISPG method reduces inaccurate segmentation and guarantees the completeness of the object regions. Second, a multi-resolution superpixel-based region completeness enhancement method is proposed to extract the object region with high precision and completeness. Simulation results show that the proposed method outperforms conventional object detection methods in terms of object completeness evaluation.
Funding: supported by the National Natural Science Foundation of China (No. 52174021) and the Key Research and Development Project of Hainan Province (No. ZDYF2022GXJS 003).
Abstract: Characterizing the integrity and fineness of non-connected regions and contours is a major challenge for existing salient object detection. The key is how to make full use of the subjective and objective structural information obtained in different steps. Therefore, by simulating the human visual mechanism, this paper proposes a novel multi-decoder matching correction network and a subjective structural loss. Specifically, the loss pays different levels of attention to the foreground, boundary, and background of the ground truth map in a top-down structure. The perceived saliency is then mapped to the corresponding objective structure of the prediction map, which is extracted in a bottom-up manner. Thus, multi-level salient features can be effectively detected with the loss as a constraint. Next, through the mapping of an improved binary cross entropy loss, the differences between salient regions and objects are checked so that attention is paid to error-prone regions, achieving excellent error sensitivity. Finally, by tracking the identifying features horizontally and vertically, the interaction between subjective and objective information is maximized. Extensive experiments on five benchmark datasets demonstrate that, compared with 12 state-of-the-art methods, the algorithm has higher recall and precision, lower error, and strong robustness and generalization ability, and can predict complete and refined saliency maps.
Funding: supported by the National Key Research and Development Program of China (grant number: 2017YFC0806503).
Abstract: With the increasing number of vehicles, manual security inspections at road checkpoints are becoming more laborious. To address this, a specialized Road Checkpoints Robot (RCRo) system is proposed, which combines an enhanced You Only Look Once (YOLO) detector with a 6-degree-of-freedom (DOF) manipulator for autonomous identity verification and vehicle inspection. The modified YOLO, named “LF-YOLO”, is characterized by sensitivity to large objects and faster detection speed. These properties are achieved by a Dense-module-based backbone network connected to a two-scale detection network, along with optimized anchor boxes and an improved loss function. During manipulator motion, an Octree-aided motion control scheme is adopted for collision-free motion through the Robot Operating System (ROS). The proposed LF-YOLO, which utilizes a continuous optimization strategy and a residual technique, provides a promising detector design. In experiments, it proved more effective in actual object detection, decreasing average detection time by 68.25% and 60.60% and increasing average Intersection over Union (IoU) by 20.74% and 6.79% compared with YOLOv3 and YOLOv4, respectively. Comprehensive functional tests of the RCRo system demonstrate the feasibility and competency of multiple unmanned inspections in practice.
Abstract: In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid and enhance the feature expression abilities of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, it achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images from swarm UAVs.
Funding: supported by the National Natural Science Foundation of China (No. 62173237), the National Key R&D Program of China (No. 2018AAA0100804), the Zhejiang Key Laboratory of General Aviation Operation Technology (No. JDGA2020-7), the Talent Project of Revitalization Liaoning (No. XLYC1907022), the Key R&D Projects of Liaoning Province (No. 2020JH2/10100045), the Natural Science Foundation of Liaoning Province (No. 2019-MS-251), the Scientific Research Project of Liaoning Provincial Department of Education (No. JYT2020142), the High-Level Innovation Talent Project of Shenyang (No. RC190030), the Science and Technology Project of Beijing Municipal Commission of Education (No. KM201811417005), and the Academic Research Projects of Beijing Union University (No. ZB10202005).
Abstract: In recent years, the number of incidents involving unmanned aerial vehicles (UAVs) has increased conspicuously, resulting in an increasingly urgent demand for anti-UAV systems. High detection accuracy for low altitude UAVs is therefore widely required. Deep-learning-based UAV detection methods show great potential for low altitude UAV detection. However, such methods need high-quality datasets to cope with the high false alarm rate (FAR) and high missing alarm rate (MAR) in low altitude UAV detection, and a dedicated high-quality low altitude UAV detection dataset is still lacking. The handful of known UAV detection datasets are of poor quality, and their proposers have declined to authorize their use. In this paper, a comprehensive enhanced dataset containing UAVs and jamming objects is proposed. A large number of high-definition UAV images are obtained through real-world shooting, web crawling, and data enhancement. Moreover, to cope with the challenge of low altitude UAV detection against complex backgrounds and at long distance, as well as the confusion caused by jamming objects, noise with jamming characteristics is added to the dataset. Finally, the dataset is used to train, validate, and test four mainstream deep learning models. The results indicate that data enhancement, together with added noise containing jamming objects and images of UAVs against complex backgrounds and at long distance, can significantly improve the accuracy of UAV detection. This work will promote the development of anti-UAV systems and provides more convincing evaluation criteria for optimizing UAV detection models.
Abstract: This research presents an algorithm for face detection in color images using three main components: skin color characteristics, hair color characteristics, and a decision structure that converts the information obtained from skin and hair regions into labels for identifying object dependencies and rejecting many incorrect decisions. We use face color characteristics that are robust to face rotations and expressions. The algorithm can also be combined with other face recognition methods at each stage to improve detection.
Funding: the Office of Naval Research, for supporting this effort through the Consortium for Robotics and Unmanned Systems Education and Research.
Abstract: Unexploded ordnance (UXO) poses a threat to soldiers operating in mission areas, but current UXO detection systems do not necessarily provide the safety and efficiency required to protect soldiers from this hazard. Recent technological advancements in artificial intelligence (AI) and small unmanned aerial systems (sUAS) present an opportunity to explore a novel concept for UXO detection. The new UXO detection system proposed in this study employs an AI-trained multi-spectral (MS) sensor on an sUAS. This paper explores the feasibility of AI-based UXO detection using an sUAS equipped with a single (visible) spectrum (SS) or MS digital electro-optical (EO) sensor. Specifically, it describes the design of the Deep Learning Convolutional Neural Network for UXO detection and the development of an AI-based algorithm for reliable UXO detection, and it compares the performance of the proposed system on SS and MS sensor imagery.
Funding: supported by the National Natural Science Foundation of China (Nos. 61702347, 62027801), the Natural Science Foundation of Hebei Province (Nos. F2022210007, F2017210161), the Science and Technology Project of Hebei Education Department (Nos. ZD2022100, QN2017132), and the Central Guidance on Local Science and Technology Development Fund (No. 226Z0501G).
Abstract: Electricity plays a vital role in daily life and economic development. The status of power plant indicator lights needs to be checked regularly to ensure the normal supply of electricity. To address the large amount of data and the varying sizes of indicator lights in detection, we propose an improved You Only Look Once version 5 (YOLOv5) algorithm for power plant indicator light detection. The algorithm improves feature extraction ability on the basis of YOLOv5s. First, it enhances the network's ability to perceive small objects by combining attention modules for multi-scale feature extraction. Second, we adjust the loss function to ensure the stability of the object frame during regression and improve convergence accuracy. Finally, transfer learning is used to augment the dataset and improve the robustness of the algorithm. The experimental results show that the average accuracy of the proposed squeeze-and-excitation YOLOv5s (SE-YOLOv5s) algorithm is increased by 4.39% to 95.31% compared with the YOLOv5s algorithm. The proposed algorithm can better meet the engineering needs of power plant indicator light detection.
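The squeeze-and-excitation (SE) attention that gives SE-YOLOv5s its name can be sketched without any framework. The block below shows the generic SE computation only: global average pooling per channel (squeeze), a two-layer bottleneck with ReLU and sigmoid (excitation), then channel-wise rescaling. The function name, the nested-list tensor layout, and the bias-free weight matrices `w_reduce` and `w_expand` are assumptions for illustration; how the authors wire the block into YOLOv5s is not stated in the abstract.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Apply SE channel attention to a feature map.

    `feature_map` is a list of channels, each a flat list of activations.
    `w_reduce` has shape (hidden, channels); `w_expand` has shape
    (channels, hidden). Biases are omitted to keep the sketch short.
    """
    # Squeeze: one descriptor per channel via global average pooling.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation, step 1: dimensionality reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    # Excitation, step 2: expansion followed by a sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: reweight every activation by its channel gate.
    return [[value * gate for value in channel]
            for channel, gate in zip(feature_map, gates)]
```

In a trained network the two weight matrices are learned, so channels that carry useful evidence for small indicator lights can receive gates close to 1 while less informative channels are suppressed.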
Abstract: Light detection and ranging (LiDAR) sensors play a vital role in acquiring 3D point cloud data and extracting valuable information about objects for tasks such as autonomous driving, robotics, and virtual reality (VR). However, the sparse and disordered nature of 3D point clouds poses significant challenges to feature extraction, and overcoming these limitations is critical for 3D point cloud processing. 3D point cloud object detection is a challenging and crucial task in which point cloud processing and feature extraction methods have a significant impact on subsequent detection performance. In this overview of outstanding work on object detection from 3D point clouds, we focus on summarizing the methods employed in 3D point cloud processing. We introduce the ways point clouds are processed in classical 3D object detection algorithms and the improvements made to solve problems in point cloud processing. Different voxelization methods and point cloud sampling strategies influence the extracted features and thereby the final detection performance.
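As a concrete example of the voxelization step discussed above, the following sketch groups an unordered point cloud into a regular grid and keeps one centroid per occupied voxel. It is a generic illustration rather than any specific method from the survey; the function name, the tuple-based point format, and the centroid-based downsampling rule are assumptions.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y, z) points into cubic voxels and return one centroid per voxel.

    The result maps an integer voxel index (i, j, k) to the mean position of
    the points that fall inside that voxel.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct cell.
        index = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[index].append((x, y, z))
    return {
        index: tuple(sum(axis) / len(members) for axis in zip(*members))
        for index, members in buckets.items()
    }
```

A larger `voxel_size` yields fewer, coarser voxels and faster downstream processing, while a smaller one preserves fine geometry at higher cost, which is one reason the choice of voxelization affects detection performance.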