Top-view fisheye cameras are widely used in personnel surveillance for their broad field of view, but their unique imaging characteristics pose challenges such as distortion, complex scenes, scale variations, and small objects near image edges. To tackle these, we propose peripheral focus you only look once (PF-YOLO), an enhanced YOLOv8n-based method. First, we introduce a cutting-patch data augmentation strategy to mitigate the shortage of small-object samples across scenes. Second, to enhance the model's focus on small objects near the edges, we design the peripheral focus loss, which uses dynamic focus coefficients to provide greater gradient gains for these objects, improving their regression accuracy. Finally, we design the three-dimensional (3D) spatial-channel coordinate attention C2f module, which enhances spatial and channel perception, suppresses noise, and improves personnel detection. Experimental results demonstrate that PF-YOLO achieves strong performance on the challenging events for person detection from overhead fisheye images (CEPDTOF) and in-the-wild events for people detection and tracking from overhead fisheye cameras (WEPDTOF) datasets. Compared to the original YOLOv8n model, PF-YOLO achieves improvements on CEPDTOF with increases of 2.1%, 1.7% and 2.9% in mean average precision 50 (mAP 50), mAP 50-95, and [...], respectively. On WEPDTOF, PF-YOLO achieves substantial improvements with increases of 31.4%, 14.9%, 61.1% and 21.0% in [...] 91.2% and 57.2%, respectively.
Pine wood nematode infection is a devastating disease. Unmanned aerial vehicle (UAV) remote sensing enables timely and precise monitoring. However, UAV aerial images are challenged by small target size and complex surface backgrounds, which hinder their effectiveness in monitoring. To address these challenges, based on the analysis and optimization of UAV remote sensing images, this study developed a spatio-temporal multi-scale fusion algorithm for disease detection. A multi-head self-attention mechanism is incorporated to address the excessive features generated by complex surface backgrounds in UAV images. This enables adaptive feature control that suppresses redundant information and boosts the model's feature extraction capabilities. The SPD-Conv module was introduced to address the loss of small-target feature information during feature extraction, enhancing the preservation of key features. Additionally, the gather-and-distribute mechanism was implemented to augment the model's multi-scale feature fusion capacity, preventing the loss of local details during fusion and enriching small-target feature information. This study established a dataset of pine wood nematode disease in the Huangshan area using DJI (DJ-Innovations) UAVs. The results show that the accuracy of the proposed model with spatio-temporal multi-scale fusion reached 78.5%, 6.6% higher than that of the benchmark model. Building upon the timeliness and flexibility of UAV remote sensing, the proposed model effectively addresses the challenges of detecting small and medium-size targets in complex backgrounds, thereby enhancing the detection efficiency for pine wood nematode disease. This facilitates early preemptive preservation of diseased trees, improves the overall monitoring of pine wood nematode disease, and provides technical support for effective monitoring.
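The SPD-Conv idea mentioned in the abstract above is usually described as a space-to-depth rearrangement followed by a stride-1 convolution, so that downsampling moves pixels into channels instead of discarding them. The following is a minimal NumPy sketch of the space-to-depth step only; the function name `space_to_depth` and the `(C, H, W)` layout are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale).

    Hypothetical helper: every pixel is kept, it is only moved from the
    spatial axes to the channel axis.
    """
    _, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # One sub-sampled copy per (row offset, column offset) pair.
    slices = [x[:, i::scale, j::scale]
              for i in range(scale) for j in range(scale)]
    return np.concatenate(slices, axis=0)


x = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
y = space_to_depth(x)
print(y.shape)  # (4, 2, 2)
```

In a detector, a stride-1 convolution would normally follow this step to mix the stacked channels; that layer is omitted here because its configuration is not given in the abstract.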
In the study of oriented bounding box (OBB) object detection in high-resolution remote sensing images, small targets are often missed or wrongly detected because they are too small and have different orientations. Existing OBB object detection methods for remote sensing images, although making good progress, mainly focus on directional modeling and give less consideration to object size and missed detections. In this study, a method based on an improved YOLOv8 was proposed to improve the detection precision of oriented objects in remote sensing images. First, the ResCBAMG module was designed to better extract channel and spatial correlation information. Second, a top-down feature fusion layer network structure was proposed in conjunction with the Efficient Channel Attention (ECA) module, which helped to capture local cross-channel interaction information. Finally, we introduced a ResCBAMG module between the different C2f modules and detection heads of the bottom-up feature fusion layer. This structure helped the model to better focus on the target area and improved the precision and robustness of oriented target detection. Experimental results on the DOTA-v1.5 dataset showed that the Precision, mAP@0.5, and mAP@0.5:0.95 of the improved model are better than those of the original model. The improvement is effective for detecting small targets and for complex scenes.
Focusing on fast and accurate armored target detection on the ground battlefield, a detection method based on a multi-scale representation network (MS-RN) and a shape-fixed Guided Anchor (SF-GA) scheme is proposed. First, considering the large scale variation and camouflage of armored targets, a new MS-RN integrating contextual information in the battlefield environment is designed. The MS-RN extracts deep features from templates with different scales and strengthens the detection of small targets. Armored targets of different sizes are detected on different representation features. Second, to meet the accuracy and real-time requirements, an improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interest (ROIs). Unlike sliding or random anchors, the SF-GA can filter out 80% of the regions while still improving the recall. A dedicated detection dataset for armored targets, named Armored Target Dataset (ARTD), is constructed, on which comparative experiments with state-of-the-art detection methods are conducted. Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency, especially when small armored targets are involved.
This paper proposes a multi-scale self-recovery (MSSR) approach to protect images against content forgery. The main idea is to provide more resistance against image tampering while enabling recovery at multiple quality scales. In the proposed approach, the reference data is composed of several parts, and each part is protected by a channel coding rate according to its importance. The first part, which is used to reconstruct a rough approximation of the original image, is highly protected in order to resist higher tampering rates. Other parts are protected with lower rates according to their importance, leading to a lower tolerable tampering rate (TTR) but higher quality of the recovered images. The proposed MSSR approach is an efficient solution to the main disadvantage of current methods, which either recover a tampered image only at low tampering rates or fail when the tampering rate is above the TTR value. Simulation results on 10000 test images demonstrate the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with existing methods.
Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between the detection rate and the accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed, built on the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
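To illustrate why NWD suits tiny targets better than IoU, the sketch below models each box as a 2D Gaussian and maps the Wasserstein distance between the two Gaussians to a similarity in (0, 1]. It follows the commonly published NWD formulation; the function name `nwd` and the constant `c=12.8` are assumptions (the constant is dataset-dependent, and the abstract does not state the value used).

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between boxes given as (cx, cy, w, h).

    `c` is a dataset-dependent normalizer; 12.8 is only a placeholder.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)


same = nwd((10, 10, 4, 4), (10, 10, 4, 4))      # identical boxes
shifted = nwd((10, 10, 4, 4), (16, 10, 4, 4))   # disjoint 4x4 boxes
print(same, shifted)
```

The two 4x4 boxes in the second call do not overlap, so their IoU is zero, yet NWD still returns a smooth non-zero similarity that decreases with distance. This is the property that makes it attractive for very small targets.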
Many visual simultaneous localization and mapping (VSLAM) systems assume that features in the environment are static. However, moving objects can severely impair the performance of a VSLAM system that relies on the static-world assumption. To cope with this problem, a real-time and robust VSLAM system based on ORB-SLAM2 for dynamic environments was proposed. To reduce the influence of dynamic content, we incorporate a deep-learning-based object detection method into the visual odometry; a dynamic object probability model is then added to raise the efficiency of the object detection deep neural network and enhance the real-time performance of our system. Experiments on both the TUM and KITTI benchmark datasets, as well as in a real-world environment, show that our method can significantly reduce the tracking error or drift and enhance the robustness, accuracy and stability of the VSLAM system in dynamic scenes.
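One simple way a detector can reduce the influence of dynamic content in visual odometry is to discard keypoints that fall inside boxes of objects likely to move. The sketch below shows only that filtering step; the function name `filter_static_keypoints`, the box format, and the hard in/out rule are assumptions, and the abstract's probability model is not reproduced here.

```python
def filter_static_keypoints(keypoints, dynamic_boxes):
    """Keep keypoints (x, y) lying outside every box (x1, y1, x2, y2).

    Hypothetical helper: a real system would likely weight points by a
    dynamic-object probability instead of using a hard cut.
    """
    def inside(pt, box):
        x, y = pt
        x1, y1, x2, y2 = box
        return x1 <= x <= x2 and y1 <= y <= y2

    return [pt for pt in keypoints
            if not any(inside(pt, box) for box in dynamic_boxes)]


keypoints = [(5, 5), (50, 60), (120, 40)]
person_boxes = [(40, 30, 80, 100)]  # e.g. one detected pedestrian
static_points = filter_static_keypoints(keypoints, person_boxes)
print(static_points)  # [(5, 5), (120, 40)]
```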
Object detectors can precisely detect camouflaged objects beyond human perception. Investigations reveal that convolutional neural network (CNN)-based detectors are vulnerable to adversarial attacks. Some works fool detectors by crafting adversarial camouflage attached to the object, leading to wrong predictions. However, existing adversarial camouflage is hard to use in military operations because of its conspicuous appearance. Motivated by this, this paper proposes the Dual Attribute Adversarial Camouflage (DAAC) for evading detection by both detectors and humans. Generating DAAC includes two steps: (1) extracting features from a specific type of scene to generate individual soldier digital camouflage; (2) attaching an adversarial patch with a scene-feature constraint to the individual soldier digital camouflage to generate the adversarial attribute of DAAC. The visual effects of the individual soldier digital camouflage and the adversarial patch are improved after integration with the scene features. Experimental results show that objects camouflaged by DAAC blend well with the background and achieve visual concealment while remaining effective in fooling object detectors, thus evading detection by both detectors and humans in the digital domain. This work can serve as a reference for crafting adversarial camouflage in the physical world.
In response to the low detection accuracy and the susceptibility to missed and false detections of small targets in unmanned aerial vehicle (UAV) aerial images, an improved UAV image target detection algorithm based on YOLOv8 was proposed in this study. To begin with, the CoordAtt attention mechanism was employed to enhance the feature extraction capability of the backbone network, thereby reducing interference from backgrounds. Additionally, the BiFPN feature fusion network with an added small object detection layer was used to enhance the model's ability to perceive small objects. Furthermore, a multi-level fusion module was designed to effectively integrate shallow and deep information. The use of an enhanced MPDIoU loss function further improved detection performance. Experimental results on the publicly available VisDrone2019 dataset showed that the improved model outperformed the YOLOv8 baseline model, with mAP@0.5 improved by 20%, and that the improved method raised the model's detection accuracy for small targets.
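For orientation, the sketch below computes the standard MPDIoU score, which subtracts from IoU the squared distances between the two boxes' top-left corners and bottom-right corners, normalized by the squared image diagonal. The abstract refers to an "enhanced" MPDIoU whose details are not given, so this shows only the baseline formulation; the function names are assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union


def mpdiou(pred, gt, img_w, img_h):
    """Baseline MPDIoU: IoU penalized by corner-point distances."""
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, gt) - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, gt, img_w, img_h):
    return 1.0 - mpdiou(pred, gt, img_w, img_h)


score = mpdiou((0, 0, 10, 10), (5, 5, 15, 15), img_w=100, img_h=100)
print(round(score, 4))  # 0.1379
```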
Detection of floating garbage in inland rivers is crucial for water environmental protection, as it effectively reduces ecological damage and ensures the safety of water resources. To address the inefficiency of traditional cleanup methods and the challenges in detecting small targets, an improved YOLOv5 object detection model was proposed in this study. To enhance the model's sensitivity to small targets and mitigate the impact of redundant information on detection performance, a bi-level routing attention mechanism was introduced and embedded into the backbone network. Additionally, a multi-scale detection head was incorporated into the model, allowing more comprehensive coverage of floating garbage of various sizes through multi-scale feature extraction and detection. The Focal-EIoU loss function was also employed to optimize the model parameters, improving localization accuracy. Experimental results on the publicly available FloW_Img dataset demonstrated that the improved YOLOv5 model outperforms the original YOLOv5 model in terms of precision and recall, achieving a mean average precision (mAP) of 86.12%, with significant improvements and faster convergence.
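As a rough illustration of how a Focal-EIoU loss improves localization, the sketch below adds three penalties to the IoU loss (center distance, width difference, height difference, each normalized by the smallest enclosing box) and then weights the result by IoU raised to a power `gamma`. It follows the commonly published formulation; the function names, the `eps` guard, and `gamma=0.5` are assumptions, and the abstract does not state the settings used.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union


def focal_eiou_loss(pred, gt, gamma=0.5, eps=1e-9):
    """Focal-EIoU for axis-aligned boxes; gamma is an assumed setting."""
    iou_val = iou(pred, gt)
    # Width and height of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    # Offsets between centers, widths and heights.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    dw = (pred[2] - pred[0]) - (gt[2] - gt[0])
    dh = (pred[3] - pred[1]) - (gt[3] - gt[1])
    eiou = (1.0 - iou_val
            + (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
            + dw * dw / (cw * cw + eps)
            + dh * dh / (ch * ch + eps))
    # Focal weighting: higher-IoU (better-quality) boxes get more weight.
    return iou_val ** gamma * eiou


loss = focal_eiou_loss((0, 0, 10, 10), (5, 5, 15, 15))
print(round(loss, 3))  # 0.366
```

Note that with this weighting a pair of non-overlapping boxes receives zero loss, since IoU is zero; implementations typically combine the term with other losses or matching rules, which are outside the scope of this sketch.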
The correct identification of traffic signs plays an important role in autonomous driving technology and road safety. To address the problems of misdetection and omission in traffic sign detection caused by the variety of sign types, significant size differences and complex background information, an improved traffic sign detection model based on RT-DETR was proposed in this study. First, the HiLo attention mechanism was added to the Attention-based Intra-scale Feature Interaction module, which further enhanced the feature extraction capability of the network and improved the detection efficiency on high-resolution images. Second, the CAFMFusion feature fusion mechanism was designed, which enabled the network to pay attention to the features in different regions of each channel. Based on this, the model could better capture long-range dependencies and neighborhood feature correlation, improving its feature fusion capability. Finally, MPDIoU was used as the loss function of the improved model to achieve faster convergence and more accurate regression results. Experimental results on the TT100k-2021 traffic sign dataset showed that the improved model achieves a precision of 90.2%, a recall of 88.1% and an mAP@0.5 of 91.6%, which are 4.6%, 5.8%, and 4.4% better than the original RT-DETR model, respectively. The model effectively alleviates the problem of poor traffic sign detection and has greater practical value.
Inspired by the eagle's visual system, an eagle-vision-based object detection method for unmanned aerial vehicle (UAV) formation in hazy weather is proposed in this paper. To restore the hazy image, the values of atmospheric light and transmission are estimated on the basis of the signal processing mechanism of ON and OFF channels in the eagle's retina. Local features of the dehazed image are calculated according to the color antagonism mechanism and contrast sensitivity function of the eagle's visual system. A center-surround operation is performed to simulate the response of the receptive field. The final saliency map is generated by the Random Forest algorithm. Experimental results verify that the proposed method is capable of detecting UAVs in hazy images and has superior performance over traditional methods.
The airport apron scene contains rich contextual information about spatial position relationships. Traditional object detectors consider only visual appearance and ignore this contextual information. In addition, the detection accuracy of some categories in the apron dataset is low. Therefore, an improved object detection method using spatial-aware features in apron scenes, called SA-FRCNN, is presented. The method uses graph convolutional networks to capture the relative spatial relationships between objects in the apron scene, incorporating this spatial context into feature learning. Moreover, an attention mechanism is introduced into the feature extraction process to focus on the spatial position and key features, and distance-IoU loss is used to achieve a more accurate regression. Experimental results show that the mean average precision of apron object detection based on SA-FRCNN reaches 95.75%, and the detection of some hard-to-detect categories is significantly improved. The proposed method effectively improves the detection accuracy on the apron dataset and outperforms other methods.
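The distance-IoU (DIoU) loss mentioned above can be sketched as follows: it subtracts from IoU the squared distance between box centers, normalized by the squared diagonal of the smallest enclosing box, so non-overlapping boxes still receive a useful gradient. This is the standard DIoU formulation; the function names and the `(x1, y1, x2, y2)` box format are assumptions for illustration.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union


def diou_loss(pred, gt):
    """1 - DIoU, where DIoU = IoU - rho^2 / c^2."""
    # rho^2: squared distance between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    rho_sq = dx * dx + dy * dy
    # c^2: squared diagonal of the smallest enclosing box.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c_sq = cw * cw + ch * ch
    return 1.0 - (iou(pred, gt) - rho_sq / c_sq)


overlap_loss = diou_loss((0, 0, 10, 10), (5, 5, 15, 15))
disjoint_loss = diou_loss((0, 0, 10, 10), (20, 20, 30, 30))
print(round(overlap_loss, 4), round(disjoint_loss, 4))  # 0.9683 1.4444
```

With plain IoU loss both a near miss and a far miss would score 1.0; here the disjoint pair is penalized more the farther apart the centers are.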
This study presents an innovative approach to improving the performance of the YOLO-v8 model for small object detection in radar images. Initially, a local histogram equalization technique was applied to the original images, resulting in a notable enhancement in both contrast and detail representation. Subsequently, the YOLO-v8 backbone network was augmented by incorporating convolutional kernels based on a multidimensional attention mechanism and a parallel processing strategy, which facilitated more effective feature information fusion. At the model's head, an upsampling layer was added, along with the fusion of outputs from the shallow network, and a detection head specifically tailored for small object detection, thereby further improving accuracy. Additionally, the loss function was modified to incorporate focal-intersection over union (IoU) in conjunction with scaled-IoU, which enhanced the model's performance. A weighting strategy was also introduced, effectively improving detection accuracy for small targets. Experimental results demonstrate that the customized model outperforms traditional approaches across various evaluation metrics, including recall, precision, F1-score, and the receiver operating characteristic (ROC) curve, validating its efficacy and innovation in small object detection within radar imagery. The results indicate a substantial improvement in accuracy compared to conventional methods such as image segmentation and standard convolutional neural networks.
Traditional object detectors based on deep learning rely on plenty of labeled samples, which are expensive to obtain. Few-shot object detection (FSOD) attempts to solve this problem by learning to detect objects from a few labeled samples, but the performance is often unsatisfactory due to the scarcity of samples. We believe that the main factors restricting the performance of few-shot detectors are: (1) positive samples are scarce, and (2) the quality of positive samples is low. Therefore, we put forward a novel few-shot object detector based on YOLOv4 that improves both the quantity and the quality of positive samples. First, we design a hybrid multivariate positive sample augmentation (HMPSA) module to increase the quantity and diversity of positive samples while suppressing negative samples. Then, we design a selective non-local fusion attention (SNFA) module to help the detector better learn the target features and improve the feature quality of positive samples. Finally, we optimize the loss function to make it more suitable for FSOD. Experimental results on PASCAL VOC and MS COCO demonstrate that our few-shot object detector is competitive with other state-of-the-art detectors.
A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve this problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is dynamically adjusted according to the scene change measured from the frame difference. Experiments show that the proposed method with an adaptive GMM learning rate gives better results than the GMM method with a fixed learning rate. In tests with sudden natural light changes, our method achieves better accuracy and a lower false alarm rate.
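The idea of tying the GMM learning rate to the frame difference can be sketched as below: the fraction of pixels that changed strongly between consecutive frames is mapped linearly to a learning rate, so a sudden illumination change triggers fast background adaptation. The mapping, the function name `dynamic_learning_rate`, and all threshold and rate values are assumptions; the abstract does not give the actual rule.

```python
import numpy as np


def dynamic_learning_rate(prev_frame, frame, base_lr=0.005, max_lr=0.1,
                          diff_threshold=25):
    """Map the share of strongly changed pixels to a rate in [base_lr, max_lr].

    Hypothetical rule: a linear interpolation on the changed-pixel ratio.
    """
    # Widen the dtype so the subtraction of uint8 frames cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed_ratio = float(np.mean(diff > diff_threshold))
    return base_lr + (max_lr - base_lr) * changed_ratio


dark = np.zeros((4, 4), dtype=np.uint8)
bright = np.full((4, 4), 255, dtype=np.uint8)
steady_lr = dynamic_learning_rate(dark, dark)    # static scene
sudden_lr = dynamic_learning_rate(dark, bright)  # global light change
print(steady_lr, sudden_lr)
```

The returned value would then be passed to the per-pixel mixture update in place of a fixed learning rate.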
The region completeness of object detection is crucial to video surveillance tasks such as pedestrian and vehicle identification. However, many conventional object detection approaches cannot guarantee object region completeness because detection can be influenced by illumination variations and cluttered backgrounds. To overcome this problem, we propose the iterative superpixels grouping (ISPG) method to extract the precise object boundary and generate the object region with high completeness after object detection. First, by extending the superpixel segmentation method, the proposed ISPG method can reduce inaccurate segmentation and guarantee region completeness on the object regions. Second, a multi-resolution superpixel-based region completeness enhancement method is proposed to extract the object region with high precision and completeness. Simulation results show that the proposed method outperforms conventional object detection methods in terms of object completeness evaluation.
Characterizing the integrity and fineness of non-connected regions and contours is a major challenge for existing salient object detection. The key to addressing it is how to make full use of the subjective and objective structural information obtained in different steps. Therefore, by simulating the human visual mechanism, this paper proposes a novel multi-decoder matching correction network and a subjective structural loss. Specifically, the loss pays different attention to the foreground, boundary, and background of the ground truth map in a top-down structure. The perceived saliency is mapped to the corresponding objective structure of the prediction map, which is extracted in a bottom-up manner. Thus, multi-level salient features can be effectively detected with the loss as a constraint. Then, through the mapping of an improved binary cross entropy loss, the differences between salient regions and objects are checked so that attention is paid to error-prone regions, achieving excellent error sensitivity. Finally, by tracking the identifying features horizontally and vertically, the subjective and objective interaction is maximized. Extensive experiments on five benchmark datasets demonstrate that, compared with 12 state-of-the-art methods, the algorithm has higher recall and precision, less error, and strong robustness and generalization ability, and can predict complete and refined saliency maps.
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of different scale feature layers. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. For the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, the proposed algorithm achieves 73.9 mAP. For the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy of small land vehicle (slv) targets reaches 97.4 mAP. For the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images of swarm UAVs.
With the increasing number of vehicles, manual security inspections at road checkpoints are becoming more laborious. To address this, a specialized Road Checkpoints Robot (RCRo) system is proposed, which incorporates an enhanced You Only Look Once (YOLO) detector and a 6-degree-of-freedom (DOF) manipulator for autonomous identity verification and vehicle inspection. The modified YOLO, named "LF-YOLO", is characterized by sensitivity to large objects and faster detection speed. These are achieved by means of a Dense module-based backbone network connected to a two-scale detecting network, along with optimized anchor boxes and an improved loss function. During manipulator motion, an Octree-aided motion control scheme is adopted for collision-free motion through the Robot Operating System (ROS). The proposed LF-YOLO, which utilizes a continuous optimization strategy and residual technique, provides a promising detector design. In experiments it was found to be more effective in actual object detection, decreasing average detection time by 68.25% and 60.60% and increasing average Intersection over Union (IoU) by 20.74% and 6.79% compared to YOLOv3 and YOLOv4, respectively. Comprehensive functional tests of the RCRo system demonstrate the feasibility and competency of multiple unmanned inspections in practice.
Funding (PF-YOLO): Supported by the National Natural Science Foundation of China (Nos. 62171042, 62102033, U24A20331), the R&D Program of Beijing Municipal Education Commission (No. KZ202211417048), the Project of Construction and Support for High-Level Innovative Teams of Beijing Municipal Institutions (No. BPHR20220121), the Beijing Natural Science Foundation (Nos. 4232026, 4242020), and the Academic Research Projects of Beijing Union University (Nos. ZKZD202302, ZK20202403).
Funding (pine wood nematode detection): Funded by the National Natural Science Foundation of China (32271865), the Fundamental Research Funds for Central Universities (2572023CT16), and the Fundamental Research Funds for Natural Science Foundation of Heilongjiang for Distinguished Young Scientists (JQ2023F002).
Abstract: In oriented bounding box (OBB) object detection for high-resolution remote sensing images, small targets are often missed or falsely detected because they are very small and have different orientations. Existing OBB object detection methods for remote sensing images, although making good progress, mainly focus on directional modeling, while giving less consideration to object size and to missed detections. In this study, a method based on an improved YOLOv8 was proposed for detecting oriented objects in remote sensing images, which can improve the detection precision of oriented objects. Firstly, the ResCBAMG module was designed, which could better extract channel and spatial correlation information. Secondly, a top-down feature fusion layer network structure was proposed in conjunction with the Efficient Channel Attention (ECA) module, which helped to capture local cross-channel interaction information. Finally, we introduced a ResCBAMG module between the different C2f modules and detection heads of the bottom-up feature fusion layer. This structure helped the model to better focus on the target area, and the precision and robustness of oriented target detection were also improved. Experimental results on the DOTA-v1.5 dataset showed that the Precision, mAP@0.5, and mAP@0.5:0.95 of the improved model are better than those of the original model. This improvement is effective in detecting small targets and in complex scenes.
Funding: supported by the National Key Research and Development Program of China under grant 2016YFC0802904, the National Natural Science Foundation of China under grant 61671470, and the Postdoctoral Science Foundation Funded Project of China under grant 2017M623423.
Abstract: Focusing on the task of fast and accurate armored target detection on the ground battlefield, a detection method based on a multi-scale representation network (MS-RN) and a shape-fixed Guided Anchor (SF-GA) scheme is proposed. Firstly, considering the large scale variation and camouflage of armored targets, a new MS-RN integrating contextual information from the battlefield environment is designed. The MS-RN extracts deep features from templates with different scales and strengthens the detection ability for small targets. Armored targets of different sizes are detected on different representation features. Secondly, to meet the accuracy and real-time detection requirements, an improved shape-fixed Guided Anchor is used on feature maps of different scales to propose regions of interest (ROIs). Unlike sliding or random anchors, the SF-GA can filter out 80% of the regions while still improving the recall. A dedicated detection dataset for armored targets, named Armored Target Dataset (ARTD), is constructed, on which comparative experiments with state-of-the-art detection methods are conducted. Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency, especially when small armored targets are involved.
Abstract: This paper proposes a multi-scale self-recovery (MSSR) approach to protect images against content forgery. The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner. In the proposed approach, the reference data is composed of several parts, and each part is protected by a channel coding rate according to its importance. The first part, which is used to reconstruct a rough approximation of the original image, is highly protected in order to resist higher tampering rates. Other parts are protected with lower rates according to their importance, leading to a lower tolerable tampering rate (TTR) but higher quality of the recovered images. The proposed MSSR approach is an efficient solution to the main disadvantage of current methods, which either recover a tampered image only at low tampering rates or fail when the tampering rate is above the TTR value. The simulation results on 10000 test images demonstrate the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with existing methods.
Abstract: Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between the detection rate and the accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed based on the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
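The substitution of NWD for IoU mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the similarity is an exponential of the 2-Wasserstein distance. The function name `nwd` and the default normalizing constant `c` are illustrative assumptions, not values taken from the paper.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    `c` is a dataset-dependent normalizing constant; the default here is
    only a placeholder assumption.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Each box is modeled as a Gaussian with mean (cx, cy) and covariance
    # diag(w^2 / 4, h^2 / 4); the squared 2-Wasserstein distance between
    # two such Gaussians reduces to this closed form.
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)
```

Unlike IoU, this score stays positive and varies smoothly when two tiny boxes do not overlap at all, which is one plausible reason it is preferred for small targets.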
Funding: the National Natural Science Foundation of China (No. 61671470).
Abstract: A great number of visual simultaneous localization and mapping (VSLAM) systems need to assume static features in the environment. However, moving objects can vastly impair the performance of a VSLAM system that relies on the static-world assumption. To cope with this challenge, a real-time and robust VSLAM system based on ORB-SLAM2 for dynamic environments was proposed. To reduce the influence of dynamic content, we incorporate a deep-learning-based object detection method into the visual odometry; a dynamic object probability model is then added to raise the efficiency of the object detection deep neural network and enhance the real-time performance of our system. Experiments on both the TUM and KITTI benchmark datasets, as well as in a real-world environment, show that our method can significantly reduce the tracking error or drift and enhance the robustness, accuracy, and stability of the VSLAM system in dynamic scenes.
Funding: National Natural Science Foundation of China (grant numbers 61801512 and 62071484) and Natural Science Foundation of Jiangsu Province (grant number BK20180080), which provided funds for conducting experiments.
Abstract: Object detectors can precisely detect camouflaged objects beyond human perception. Investigations reveal that convolutional neural network (CNN)-based detectors are vulnerable to adversarial attacks. Some works can fool detectors by crafting adversarial camouflage attached to the object, leading to wrong predictions. It is hard for military operations to utilize existing adversarial camouflage due to its conspicuous appearance. Motivated by this, this paper proposes the Dual Attribute Adversarial Camouflage (DAAC) for evading detection by both detectors and humans. Generating DAAC includes two steps: (1) extracting features from a specific type of scene to generate individual soldier digital camouflage; (2) attaching the adversarial patch, under a scene-feature constraint, to the individual soldier digital camouflage to generate the adversarial attribute of DAAC. The visual effects of the individual soldier digital camouflage and the adversarial patch are improved after integration with the scene features. Experimental results show that objects camouflaged by DAAC are well integrated with the background and achieve visual concealment while remaining effective in fooling object detectors, thus evading detection by both detectors and humans in the digital domain. This work can serve as a reference for crafting adversarial camouflage in the physical world.
Abstract: In response to the challenge of low detection accuracy and susceptibility to missed and false detections of small targets in unmanned aerial vehicle (UAV) aerial images, an improved UAV image target detection algorithm based on YOLOv8 was proposed in this study. To begin with, the CoordAtt attention mechanism was employed to enhance the feature extraction capability of the backbone network, thereby reducing interference from backgrounds. Additionally, the BiFPN feature fusion network with an added small object detection layer was used to enhance the model's ability to perceive small objects. Furthermore, a multi-level fusion module was designed to effectively integrate shallow and deep information. The use of an enhanced MPDIoU loss function further improved detection performance. The experimental results on the publicly available VisDrone2019 dataset showed that the improved model outperformed the YOLOv8 baseline model, with mAP@0.5 improved by 20%, and that the improved method raised the detection accuracy of the model for small targets.
Abstract: Detection of floating garbage in inland rivers is crucial for water environmental protection, as it effectively reduces ecological damage and ensures the safety of water resources. To address the inefficiency of traditional cleanup methods and the challenges in detecting small targets, an improved YOLOv5 object detection model was proposed in this study. In order to enhance the model’s sensitivity to small targets and mitigate the impact of redundant information on detection performance, a bi-level routing attention mechanism was introduced and embedded into the backbone network. Additionally, a multi-scale detection head was incorporated into the model, allowing for more comprehensive coverage of floating garbage of various sizes through multi-scale feature extraction and detection. The Focal-EIoU loss function was also employed to optimize the model parameters, improving localization accuracy. Experimental results on the publicly available FloW_Img dataset demonstrated that the improved YOLOv5 model outperforms the original YOLOv5 model in terms of precision and recall, achieving a mean average precision (mAP) of 86.12%, with significant improvements and faster convergence.
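As a rough illustration of the Focal-EIoU loss named in this abstract, the sketch below follows the commonly cited formulation: the EIoU loss adds center-distance, width, and height penalties to 1 - IoU, and the result is weighted by IoU raised to a power gamma. The function name, the (x1, y1, x2, y2) box layout, and the default `gamma=0.5` are assumptions for illustration, not details confirmed by the paper.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # EIoU penalties: center distance, width gap, and height gap, each
    # normalized by the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)
    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting by IoU**gamma down-weights poorly overlapping boxes.
    return (iou ** gamma) * eiou
```

Note that with this weighting, a pair of boxes with zero overlap contributes zero loss, so in practice the term relies on other assignment or loss components to handle such cases.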
Abstract: The correct identification of traffic signs plays an important role in automatic driving technology and safe driving. Therefore, to address the problems of false and missed detections in traffic sign detection caused by the variety of sign types, significant size differences, and complex background information, an improved traffic sign detection model based on RT-DETR was proposed in this study. Firstly, the HiLo attention mechanism was added to the Attention-based Intra-scale Feature Interaction, which further enhanced the feature extraction capability of the network and improved the detection efficiency on high-resolution images. Secondly, the CAFMFusion feature fusion mechanism was designed, which enabled the network to pay attention to the features in different regions in each channel. Based on this, the model could better capture long-range dependencies and neighborhood feature correlation, improving its feature fusion capability. Finally, MPDIoU was used as the loss function of the improved model to achieve faster convergence and more accurate regression results. The experimental results on the TT100k-2021 traffic sign dataset showed that the improved model achieves a precision of 90.2%, a recall of 88.1%, and an mAP@0.5 of 91.6%, which are 4.6%, 5.8%, and 4.4% better than the original RT-DETR model, respectively. The model effectively alleviates the problem of poor traffic sign detection and has greater practical value.
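The MPDIoU loss used in this abstract can be sketched as follows, assuming the commonly cited definition: IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared diagonal of the input image. The function name and the (x1, y1, x2, y2) box layout are illustrative assumptions.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """MPDIoU loss for two (x1, y1, x2, y2) boxes in an img_w x img_h image."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Both distances are normalized by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou
```

Because the corner distances are nonzero whenever the boxes differ, the loss still distinguishes predictions that share the same IoU with the target.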
Funding: the Science and Technology Innovation 2030-Key Projects (Nos. 2018AAA0102303, 2018AAA0102403), the Aeronautical Science Foundation of China (No. 20175851033), and the National Natural Science Foundation of China (Nos. U1913602, U19B2033, 91648205, 61803011).
Abstract: Inspired by the eagle’s visual system, an eagle-vision-based object detection method for unmanned aerial vehicle (UAV) formation in hazy weather is proposed in this paper. To restore the hazy image, the values of atmospheric light and transmission are estimated on the basis of the signal processing mechanism of the ON and OFF channels in the eagle’s retina. Local features of the dehazed image are calculated according to the color antagonism mechanism and contrast sensitivity function of the eagle’s visual system. A center-surround operation is performed to simulate the response of the receptive field. The final saliency map is generated by the Random Forest algorithm. Experimental results verify that the proposed method is capable of detecting UAVs in hazy images and outperforms traditional methods.
Funding: supported by the Fundamental Research Funds for Central Universities of the Civil Aviation University of China (No. 3122021088).
Abstract: The airport apron scene contains rich contextual information about spatial position relationships. Traditional object detectors consider only visual appearance and ignore this contextual information. In addition, the detection accuracy of some categories in the apron dataset is low. Therefore, an improved object detection method using spatial-aware features in apron scenes, called SA-FRCNN, is presented. The method uses graph convolutional networks to capture the relative spatial relationships between objects in the apron scene, incorporating this spatial context into feature learning. Moreover, an attention mechanism is introduced into the feature extraction process to focus on the spatial position and key features, and distance-IoU loss is used to achieve a more accurate regression. The experimental results show that the mean average precision of apron object detection based on SA-FRCNN can reach 95.75%, and the detection of some hard-to-detect categories is significantly improved. The proposed method effectively improves the detection accuracy on the apron dataset and has a leading advantage over other methods.
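The distance-IoU loss mentioned in this abstract can be sketched under its usual definition: 1 - IoU plus the squared distance between box centers divided by the squared diagonal of the smallest enclosing box. The function name and the (x1, y1, x2, y2) box layout below are illustrative assumptions.

```python
def diou_loss(pred, target, eps=1e-9):
    """Distance-IoU loss for two axis-aligned (x1, y1, x2, y2) boxes."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    diag_sq = cw * cw + ch * ch

    return 1.0 - iou + center_sq / (diag_sq + eps)
```

The center-distance term keeps the loss informative even when the boxes do not overlap, since a farther prediction is penalized more than a nearer one.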
Funding: supported by the National Natural Science Foundation of China Joint Fund (No. U21B2028), the National Key R&D Program of China (No. 2021YFC 2100100), and the Shanghai Science and Technology Project (Nos. 21JC1403400, 23JC1402300).
Abstract: This study presents an approach to improving the performance of the YOLO-v8 model for small object detection in radar images. Initially, a local histogram equalization technique was applied to the original images, resulting in a notable enhancement in both contrast and detail representation. Subsequently, the YOLO-v8 backbone network was augmented by incorporating convolutional kernels based on a multidimensional attention mechanism and a parallel processing strategy, which facilitated more effective feature information fusion. At the model’s head, an upsampling layer was added, along with the fusion of outputs from the shallow network and a detection head specifically tailored for small object detection, thereby further improving accuracy. Additionally, the loss function was modified to incorporate focal-intersection over union (IoU) in conjunction with scaled-IoU, which enhanced the model’s performance. A weighting strategy was also introduced, effectively improving detection accuracy for small targets. Experimental results demonstrate that the customized model outperforms traditional approaches across various evaluation metrics, including recall, precision, F1-score, and the receiver operating characteristic (ROC) curve, validating its efficacy in small object detection within radar imagery. The results indicate a substantial improvement in accuracy compared to conventional methods such as image segmentation and standard convolutional neural networks.
Funding: the China National Key Research and Development Program (Grant No. 2016YFC0802904), the National Natural Science Foundation of China (Grant No. 61671470), and the 62nd batch of funded projects of the China Postdoctoral Science Foundation (Grant No. 2017M623423), which provided funds for conducting experiments.
Abstract: Traditional object detectors based on deep learning rely on plenty of labeled samples, which are expensive to obtain. Few-shot object detection (FSOD) attempts to solve this problem by learning to detect objects from a few labeled samples, but the performance is often unsatisfactory due to the scarcity of samples. We believe that the main reasons that restrict the performance of few-shot detectors are: (1) positive samples are scarce, and (2) the quality of positive samples is low. Therefore, we put forward a novel few-shot object detector based on YOLOv4 that improves both the quantity and the quality of positive samples. First, we design a hybrid multivariate positive sample augmentation (HMPSA) module to amplify the quantity of positive samples and increase positive sample diversity while suppressing negative samples. Then, we design a selective non-local fusion attention (SNFA) module to help the detector better learn the target features and improve the feature quality of positive samples. Finally, we optimize the loss function to make it more suitable for the task of FSOD. Experimental results on PASCAL VOC and MS COCO demonstrate that our few-shot object detector is competitive with other state-of-the-art detectors.
Abstract: A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to deal with the slow adaptation of GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve the slow-adaptation problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is dynamically adjusted according to the scene change measured from the frame difference. The experiments show that the proposed method with an adaptive GMM learning rate gives good results compared with a GMM method with a fixed learning rate. The method was tested on a certain dataset, and tests under sudden natural light changes show that our method has better accuracy and a lower false alarm rate.
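The idea of tying the learning rate to the frame difference can be sketched as below. The abstract does not give the actual mapping, so the linear rule, the parameter names (`alpha_min`, `alpha_max`, `scale`), and their defaults are hypothetical. For brevity the update is shown on a single running mean per pixel rather than a full mixture of Gaussians, and frames are flat lists of pixel intensities.

```python
def dynamic_learning_rate(prev_frame, frame,
                          alpha_min=0.005, alpha_max=0.5, scale=50.0):
    """Map the mean absolute frame difference to a learning rate.

    The linear, clamped mapping is a hypothetical rule, not the paper's.
    """
    diff = sum(abs(a - b) for a, b in zip(prev_frame, frame)) / len(frame)
    # A larger global change (e.g. a sudden illumination shift) gives a
    # larger learning rate, so the background model adapts faster.
    return alpha_min + (alpha_max - alpha_min) * min(1.0, diff / scale)

def update_background(mean, frame, alpha):
    """Blend each pixel's background mean toward the new frame by alpha."""
    return [(1 - alpha) * m + alpha * x for m, x in zip(mean, frame)]
```

For example, a static scene keeps the rate at `alpha_min`, while a frame in which every pixel jumps by more than `scale` intensity levels pushes it to `alpha_max`.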
Funding: supported in part by the “MOST” under Grant No. 103-2221-E-216-012.
Abstract: The region completeness of object detection is crucial to video surveillance tasks such as pedestrian and vehicle identification. However, many conventional object detection approaches cannot guarantee object region completeness because the detection can be affected by illumination variations and cluttered backgrounds. To overcome this problem, we propose the iterative superpixels grouping (ISPG) method to extract the precise object boundary and generate the object region with high completeness after object detection. First, by extending the superpixel segmentation method, the proposed ISPG method can reduce inaccurate segmentation and guarantee region completeness on the object regions. Second, a multi-resolution superpixel-based region completeness enhancement method is proposed to extract the object region with high precision and completeness. The simulation results show that the proposed method outperforms conventional object detection methods in terms of object completeness evaluation.
Funding: supported by the National Natural Science Foundation of China (No. 52174021) and the Key Research and Development Project of Hainan Province (No. ZDYF2022GXJS 003).
Abstract: Characterizing the integrity and fineness of non-connected regions and contours is a major challenge for existing salient object detection. The key to addressing it is how to make full use of the subjective and objective structural information obtained in different steps. Therefore, by simulating the human visual mechanism, this paper proposes a novel multi-decoder matching correction network and a subjective structural loss. Specifically, the loss pays different levels of attention to the foreground, boundary, and background of the ground truth map in a top-down structure. The perceived saliency is mapped to the corresponding objective structure of the prediction map, which is extracted in a bottom-up manner. Thus, multi-level salient features can be effectively detected with the loss as a constraint. Then, through the mapping of an improved binary cross entropy loss, the differences between salient regions and objects are checked to focus on error-prone regions and achieve excellent error sensitivity. Finally, by tracking the identifying features horizontally and vertically, the subjective and objective interaction is maximized. Extensive experiments on five benchmark datasets demonstrate that, compared with 12 state-of-the-art methods, the algorithm has higher recall and precision, lower error, and strong robustness and generalization ability, and can predict complete and refined saliency maps.
Abstract: In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, the proposed algorithm achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images from swarm UAVs.
Funding: supported by the National Key Research and Development Program of China (grant number: 2017YFC0806503).
Abstract: With the increasing number of vehicles, manual security inspections are becoming more laborious at road checkpoints. To address this, a specialized Road Checkpoints Robot (RCRo) system is proposed, which incorporates an enhanced You Only Look Once (YOLO) detector and a 6-degree-of-freedom (DOF) manipulator for autonomous identity verification and vehicle inspection. The modified YOLO, named “LF-YOLO”, is characterized by sensitivity to large objects and faster detection speed. These are achieved by means of a Dense module-based backbone network connected to a two-scale detection network, along with optimized anchor boxes and an improved loss function. During manipulator motion, an Octree-aided motion control scheme is adopted for collision-free motion through the Robot Operating System (ROS). The proposed LF-YOLO, which utilizes a continuous optimization strategy and a residual technique, provides a promising detector design. In experiments, it decreased the average detection time by 68.25% and 60.60% and increased the average Intersection over Union (IoU) by 20.74% and 6.79% compared to YOLOv3 and YOLOv4, respectively. Comprehensive functional tests of the RCRo system demonstrate the feasibility and competency of multiple unmanned inspections in practice.