Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the National Natural Science Foundation of China (NSFC) Young Scientists Fund (62405131).
Abstract: This article proposes Infrared NeRF, a three-dimensional light field reconstruction method based on neural radiance fields (NeRF) for low-resolution thermal infrared scenes. Several optimizations tailored to the characteristics of low-resolution thermal infrared imaging improve the speed and accuracy of thermal infrared 3D reconstruction. First, inspired by Boltzmann's law of thermal radiation, distance is incorporated into the NeRF model for the first time, resulting in nonlinear propagation along a single ray and a more accurate description of the physical property that infrared radiation intensity decreases with increasing distance. Second, to improve speed, and based on the observation that foreground and background in infrared images differ in their high- and low-frequency content, a multi-ray non-uniform light synthesis strategy is proposed that makes the model pay more attention to foreground objects in the scene, allocates fewer rays to the background, and significantly reduces training time without reducing accuracy. In addition, compared with visible-light scenes, infrared images have only a single channel, so fewer network parameters are required. Experiments using the same training data and data filtering method showed that, compared with the original NeRF, the improved network achieved average improvements of 13.8% and 4.62% in PSNR and SSIM, respectively, and an average decrease of 46% in LPIPS. Thanks to the optimization of network layers and data filtering methods, training takes only about 25% of the original method's time to converge. Finally, for scenes with weak backgrounds, limiting the model's query interval makes inference 4 to 6 times faster than with the original NeRF.
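The non-uniform ray allocation described in this abstract can be illustrated with a minimal sketch. The abstract does not say how foreground pixels are identified, so the code below assumes local gradient magnitude as a stand-in for high-frequency content. The function names, the `floor` parameter, and the forward-difference gradient are all hypothetical choices made for illustration, not the authors' method.

```python
import random


def sampling_weights(image, floor=0.1):
    """Per-pixel ray-sampling weights in row-major order, summing to 1.

    image: list of rows of floats (a single-channel thermal frame).
    floor: assumed share of probability spread uniformly, so that
           low-frequency background pixels are still sampled sometimes.
    """
    h, w = len(image), len(image[0])
    grads = []
    for r in range(h):
        for c in range(w):
            # Forward differences, clamped at the image border.
            dx = image[r][min(c + 1, w - 1)] - image[r][c]
            dy = image[min(r + 1, h - 1)][c] - image[r][c]
            grads.append((dx * dx + dy * dy) ** 0.5)
    total = sum(grads)
    n = h * w
    if total == 0:
        # Perfectly flat frame: fall back to uniform sampling.
        return [1.0 / n] * n
    return [floor / n + (1.0 - floor) * g / total for g in grads]


def sample_ray_pixels(image, n_rays, seed=0, floor=0.1):
    """Draw (row, col) pixel coordinates for training rays."""
    w = len(image[0])
    weights = sampling_weights(image, floor)
    rng = random.Random(seed)
    flat = rng.choices(range(len(weights)), weights=weights, k=n_rays)
    return [divmod(i, w) for i in flat]
```

With these weights, pixels on the boundary of a warm object receive most of the rays, while a flat background shares only the small uniform `floor` portion.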
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the Natural Science Foundation of Jiangsu Province (BK20241224).
Abstract: This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After the region of interest (ROI) is obtained in the infrared (IR) image, the thermal infrared face detection results guide the RGB camera to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Finally, a low-cost, robust, real-time infrared temperature measurement system was integrated, achieving a temperature measurement accuracy of 0.3 °C.
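The IR-guided RGB positioning step can be sketched as a coordinate mapping. The abstract gives no calibration details, so this sketch assumes the two sensors share the same field of view and are already aligned, which lets an IR bounding box be scaled directly into RGB pixel coordinates. The function name, the `margin` parameter, and the example resolutions are hypothetical.

```python
import math


def ir_box_to_rgb_roi(box, ir_size, rgb_size, margin=0.25):
    """Scale an IR face box into a padded search window in the RGB frame.

    box:      (x0, y0, x1, y1) in IR pixels.
    ir_size:  (width, height) of the IR frame.
    rgb_size: (width, height) of the RGB frame.
    margin:   assumed padding, as a fraction of the box size, that
              absorbs small misalignment between the two sensors.
    """
    x0, y0, x1, y1 = box
    sx = rgb_size[0] / ir_size[0]
    sy = rgb_size[1] / ir_size[1]
    pad_x = margin * (x1 - x0)
    pad_y = margin * (y1 - y0)
    # Expand the box, scale it, then clip it to the RGB frame.
    left = max(0, math.floor((x0 - pad_x) * sx))
    top = max(0, math.floor((y0 - pad_y) * sy))
    right = min(rgb_size[0], math.ceil((x1 + pad_x) * sx))
    bottom = min(rgb_size[1], math.ceil((y1 + pad_y) * sy))
    return left, top, right, bottom
```

A fine face detector would then only need to search the returned window of the RGB frame instead of the full image. A real dual-sensor device would normally replace the plain scaling with a calibrated mapping that accounts for parallax.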
Funding: Supported by the Anhui University Natural Science Research Project, China (KJ2015A153); the Initial Research Fund from Chuzhou University, China (2014qd024); the Higher Education Excellent Youth Talents Foundation of Anhui Province (gxyqZD2016329); and the Anhui Provincial Natural Science Foundation of China under Grant 1708085MF149.
Abstract: A p-i-i-n type AlGaN heterostructure avalanche photodiode (APD) is proposed to decrease the avalanche breakdown voltage and to realize higher gain by using a high-Al-content AlGaN layer as the multiplication layer and a low-Al-content AlGaN layer as the absorption layer. The calculated results show that the designed APD reduces the breakdown voltage by almost 30% and increases the maximum gain about sevenfold compared with the conventional AlGaN APD. The noise in the designed APD is also lower than that in the conventional APD owing to its low dark current at the breakdown voltage point. Moreover, a one-dimensional (1D) dual-periodic photonic crystal (PC) filter with an anti-reflection coating is designed to achieve the solar-blind characteristic, and a cutoff wavelength of 282 nm is obtained.
Funding: Project (08Y29-7) supported by the Transportation Science and Research Program of Jiangsu Province, China; Project (201103051) supported by the Major Infrastructure Program of the Health Monitoring System Hardware Platform Based on Sensor Network Node, China; Project (61100111) supported by the National Natural Science Foundation of China; Project (BE2011169) supported by the Scientific and Technical Supporting Program of Jiangsu Province, China.
Abstract: Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders the practical application of MVC. An efficient fast mode decision method using mode complexity is proposed to reduce this computational complexity. In the proposed method, mode complexity is first computed from the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is very likely to be the optimal mode, the mode complexity is always checked in advance against a predefined threshold, which provides an efficient early termination opportunity. If this early termination condition is not met, MBs are classified into three mode types according to the value of mode complexity, i.e., simple mode, medium mode and complex mode, to speed up the encoding process by reducing the number of variable block modes that must be checked. Furthermore, for simple and medium mode regions, the rate distortion (RD) cost of mode 16×16 in the temporal prediction direction is compared with that in the disparity prediction direction to determine in advance whether the optimal prediction direction is temporal, so that unnecessary disparity estimation can be skipped. Experimental results show that the proposed method reduces the computational load by 78.79% and the total bit rate by 0.07% on average, while incurring only a negligible loss of PSNR (about 0.04 dB on average), compared with the full mode decision (FMD) in the reference software of MVC.
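The decision flow described in this abstract can be sketched as follows. The abstract does not give the threshold values, the formula for mode complexity, or the exact mode sets checked per class, so the constants and candidate lists below are hypothetical placeholders, and `plan_mode_search` is an illustrative name rather than a function from the MVC reference software.

```python
# Hypothetical thresholds on mode complexity (not taken from the paper).
T_EARLY = 1.0    # below this, stop early and pick direct mode
T_SIMPLE = 2.0   # upper bound of the "simple" class
T_MEDIUM = 4.0   # upper bound of the "medium" class

# Hypothetical candidate block modes per macroblock class.
CANDIDATE_MODES = {
    "simple": ["16x16"],
    "medium": ["16x16", "16x8", "8x16"],
    "complex": ["16x16", "16x8", "8x16", "8x8"],
}


def plan_mode_search(mode_complexity, rd_temporal_16x16, rd_disparity_16x16):
    """Decide which modes to test and whether to run disparity estimation."""
    # Early termination: a low-complexity MB is assumed to be direct mode.
    if mode_complexity < T_EARLY:
        return {"mb_class": "direct", "modes": ["DIRECT"],
                "run_disparity_estimation": False}

    if mode_complexity < T_SIMPLE:
        mb_class = "simple"
    elif mode_complexity < T_MEDIUM:
        mb_class = "medium"
    else:
        mb_class = "complex"

    # For simple and medium MBs, compare the 16x16 RD costs of the two
    # prediction directions; skip DE when temporal prediction is no worse.
    run_de = True
    if mb_class != "complex":
        run_de = rd_disparity_16x16 < rd_temporal_16x16

    return {"mb_class": mb_class, "modes": CANDIDATE_MODES[mb_class],
            "run_disparity_estimation": run_de}
```

For example, under these assumed thresholds `plan_mode_search(1.5, 10.0, 12.0)` classifies the MB as simple, tests only the 16×16 mode, and skips disparity estimation because temporal prediction already has the lower RD cost.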