Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for...Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.展开更多
As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories ...As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.展开更多
SRAM(Static Random Access Memory)型FPGA凭借其动态结构调整的灵活性等特点,被广泛应用于工业领域。针对动态可重构功能单元的布局问题,分析了模拟退火解决方案的局限性,提出了基于电路分层划分和时延驱动的在线布局算法。算法首先按...SRAM(Static Random Access Memory)型FPGA凭借其动态结构调整的灵活性等特点,被广泛应用于工业领域。针对动态可重构功能单元的布局问题,分析了模拟退火解决方案的局限性,提出了基于电路分层划分和时延驱动的在线布局算法。算法首先按最小分割原则将电路划分为一定数目的层,然后按自顶向下的原则在芯片的每一层中布局划分出的层,同时保证电路关键路径的延时最小。实验结果表明,所述算法在时延、线长和运行时间方面均优于VPR算法。展开更多
The modelling, design and implementation of a high-speed programmable polyphase finite impulse response (FIR) filter with field programmable gate array (FPGA) technology are described. This FIR filter can run automati...The modelling, design and implementation of a high-speed programmable polyphase finite impulse response (FIR) filter with field programmable gate array (FPGA) technology are described. This FIR filter can run automatically according to the programmable configuration word including symmetry/asymmetry, odd/even taps, from 32 taps up to 256 taps. The filter with 12 bit signal and 12 bit coefficient word-length has been realized on a Xilinx VirtexⅡ-v1500 device and operates at the maximum sampling frequency of (160 MHz.)展开更多
Combing with the generalized Hamiltonian system theory,by introducing a special form of sinusoidal function,a class of n-dimensional(n=1,2,3)controllable multi-scroll conservative chaos with complicated dynamics is co...Combing with the generalized Hamiltonian system theory,by introducing a special form of sinusoidal function,a class of n-dimensional(n=1,2,3)controllable multi-scroll conservative chaos with complicated dynamics is constructed.The dynamics characteristics including bifurcation behavior and coexistence of the system are analyzed in detail,the latter reveals abundant coexisting flows.Furthermore,the proposed system passes the NIST tests and has been implemented physically by FPGA.Compared to the multi-scroll dissipative chaos,the experimental portraits of the proposed system show better ergodicity,which have potential application value in secure communication and image encryption.展开更多
A flexible field programmable gate array based radar signal processor is presented. The radar signal processor mainly consists of five functional modules: radar system timer, binary phase coded pulse compression(PC...A flexible field programmable gate array based radar signal processor is presented. The radar signal processor mainly consists of five functional modules: radar system timer, binary phase coded pulse compression(PC), moving target detection (MTD), constant false alarm rate (CFAR) and target dots processing. Preliminary target dots information is obtained in PC, MTD, and CFAR modules and Nios I! CPU is used for target dots combination and false sidelobe target removing. Sys- tem on programmable chip (SOPC) technique is adopted in the system in which SDRAM is used to cache data. Finally, a FPGA-based binary phase coded radar signal processor is realized and simula- tion result is given.展开更多
In this paper,a fast-speed and real-time online rail inspection method based on half-cycle orthogonal power demodulation algorithm is proposed.For this method,the power characters of de-tection signal which represent ...In this paper,a fast-speed and real-time online rail inspection method based on half-cycle orthogonal power demodulation algorithm is proposed.For this method,the power characters of de-tection signal which represent the degree of rail track can be calculated using only half-cycle detec-tion signal because of the symmetry characteristic of detected sine signal and reference signal.The theoretical analysis,simulation results and experiment results show that the demodulation preci-sion of proposed method is almost equal to fast Fourier transform(FFT)demodulation method and orthogonal demodulation method,but has high demodulation efficiency and less FPGA resources cost.A high-speed experiment system based on three coils structured sensor is built for rail inspec-tion experiment at a moving speed of 200 km/h.The experiment results show that proposed meth-od is more effective for rail inspection and the time resolution of proposed method is double of clas-sic method that based on FFT and orthogonal.展开更多
A binary tree representation is designed in this paper for optimization of wave digital filter(WDF)implementation.To achieve this,an equivalent WDF model of the original circuit is converted into abinary tree repres...A binary tree representation is designed in this paper for optimization of wave digital filter(WDF)implementation.To achieve this,an equivalent WDF model of the original circuit is converted into abinary tree representation at first.This WDF binary tree can then be transformed to several topologies with the same implication,since the WDF adaptors have a symmetrical behavior on their ports.Because the WDF implementation is related to field programmable gate array(FPGA)resource usage and the cycle time of emulation,choosing aproper binary tree topology for WDF implementation can help balance the complexity and performance quality of an emulation system.Both WDF-FPGA emulation and HSpice simulation on the same circuit are tested.There is no significant difference between these two simulations.However,in terms of time consumption,the WDF-FPGA emulation has an advantage over the other.Our experiment also demonstrates that the optimized WDF-FPGA emulation has an acceptable accuracy and feasibility.展开更多
An internal single event upset(SEU)mitigation technique is proposed,which reads back the configuration frames from the static random access memory(SRAM)-based field programmable gate array(FPGA)through an intern...An internal single event upset(SEU)mitigation technique is proposed,which reads back the configuration frames from the static random access memory(SRAM)-based field programmable gate array(FPGA)through an internal port and compares them with those stored in the radiationhardened memory to detect and correct SEUs.Triple modular redundancy(TMR),which triplicates the circuit of the technique and uses majority voters to isolate any single upset within it,is used to enhance the reliability.Performance analysis shows that the proposed technique can satisfy the requirement of ordinary aerospace missions with less power dissipation,size and weight.The fault injection experiment validates that the proposed technique is capable of correcting most errors to protect spaceborne facilities from SEUs.展开更多
With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system ac...With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system achieved by fieldprogrammable gate array-based charge-to-digital converter(FPGA-QDC)technology was built and developed.The FEE consists of an analog board and FPGA board.The analog board incorporates commercial amplifiers,resistors,and capacitors.The FPGA board is composed of a low-cost FPGA.The electronics performance of the FEE was evaluated in terms of noise,linearity,and uniformity.A positron emission tomography(PET)detector with three different readout configurations was designed to validate the readout capability of the FEE for SiPM-based detectors.The PET detector was made of a 15×15 lutetium–yttrium oxyorthosilicate(LYSO)crystal array directly coupled with a SiPM array detector.The experimental results show that FEE can process dual-polarity charge signals from the SiPM detectors.In addition,it shows a good energy resolution for 511-keV gamma photons under the dual-end readout for the LYSO crystal array irradiated by a Na-22 source.Overall,the FEE based on FPGA-QDC shows promise for application in SiPM-based radiation detectors.展开更多
A bunch arrival-time monitor(BAM) system,based on electro-optical intensity modulation scheme, is under study at Shanghai Soft X-ray Free Electron Laser.The aim of the study is to achieve high-precision time measureme...A bunch arrival-time monitor(BAM) system,based on electro-optical intensity modulation scheme, is under study at Shanghai Soft X-ray Free Electron Laser.The aim of the study is to achieve high-precision time measurement for minimizing bunch fluctuations. A readout electronics is developed to fulfill the requirements of the BAM system. The readout electronics is mainly composed of a signal conditioning circuit, field-programmable gate array(FPGA), mezzanine card(FMC150), and powerful FPGA carrier board. The signal conditioning circuit converts the laser pulses into electrical pulse signals using a photodiode. Thereafter, it performs splitting and low-noise amplification to achieve the best voltage sampling performance of the dual-channel analog-to-digital converter(ADC) in FMC150. The FMC150 ADC daughter card includes a 14-bit 250 Msps dual-channel high-speed ADC,a clock configuration, and a management module. The powerful FPGA carrier board is a commercial high-performance Xilinx Kintex-7 FPGA evaluation board. To achieve clock and data alignment for ADC data capture at a high sampling rate, we used ISERDES, IDELAY, and dedicated carry-in resources in the Kintex-7 FPGA. This paper presents a detailed development of the readout electronics in the BAM system and its performance.展开更多
基金financially supported by the National Council for Scientific and Technological Development(CNPq,Brazil),Swedish-Brazilian Research and Innovation Centre(CISB),and Saab AB under Grant No.CNPq:200053/2022-1the National Council for Scientific and Technological Development(CNPq,Brazil)under Grants No.CNPq:312924/2017-8 and No.CNPq:314660/2020-8.
文摘Unmanned aerial vehicles(UAVs)have been widely used in military,medical,wireless communications,aerial surveillance,etc.One key topic involving UAVs is pose estimation in autonomous navigation.A standard procedure for this process is to combine inertial navigation system sensor information with the global navigation satellite system(GNSS)signal.However,some factors can interfere with the GNSS signal,such as ionospheric scintillation,jamming,or spoofing.One alternative method to avoid using the GNSS signal is to apply an image processing approach by matching UAV images with georeferenced images.But a high effort is required for image edge extraction.Here a support vector regression(SVR)model is proposed to reduce this computational load and processing time.The dynamic partial reconfiguration(DPR)of part of the SVR datapath is implemented to accelerate the process,reduce the area,and analyze its granularity by increasing the grain size of the reconfigurable region.Results show that the implementation in hardware is 68 times faster than that in software.This architecture with DPR also facilitates the low power consumption of 4 mW,leading to a reduction of 57%than that without DPR.This is also the lowest power consumption in current machine learning hardware implementations.Besides,the circuitry area is 41 times smaller.SVR with Gaussian kernel shows a success rate of 99.18%and minimum square error of 0.0146 for testing with the planning trajectory.This system is useful for adaptive applications where the user/designer can modify/reconfigure the hardware layout during its application,thus contributing to lower power consumption,smaller hardware area,and shorter execution time.
基金supported by the Natural Science Foundation of Sichuan Province of China under Grant No.2022NSFSC0500the National Natural Science Foundation of China under Grant No.62072076.
文摘As a core component in intelligent edge computing,deep neural networks(DNNs)will increasingly play a critically important role in addressing the intelligence-related issues in the industry domain,like smart factories and autonomous driving.Due to the requirement for a large amount of storage space and computing resources,DNNs are unfavorable for resource-constrained edge computing devices,especially for mobile terminals with scarce energy supply.Binarization of DNN has become a promising technology to achieve a high performance with low resource consumption in edge computing.Field-programmable gate array(FPGA)-based acceleration can further improve the computation efficiency to several times higher compared with the central processing unit(CPU)and graphics processing unit(GPU).This paper gives a brief overview of binary neural networks(BNNs)and the corresponding hardware accelerator designs on edge computing environments,and analyzes some significant studies in detail.The performances of some methods are evaluated through the experiment results,and the latest binarization technologies and hardware acceleration methods are tracked.We first give the background of designing BNNs and present the typical types of BNNs.The FPGA implementation technologies of BNNs are then reviewed.Detailed comparison with experimental evaluation on typical BNNs and their FPGA implementation is further conducted.Finally,certain interesting directions are also illustrated as future work.
文摘SRAM(Static Random Access Memory)型FPGA凭借其动态结构调整的灵活性等特点,被广泛应用于工业领域。针对动态可重构功能单元的布局问题,分析了模拟退火解决方案的局限性,提出了基于电路分层划分和时延驱动的在线布局算法。算法首先按最小分割原则将电路划分为一定数目的层,然后按自顶向下的原则在芯片的每一层中布局划分出的层,同时保证电路关键路径的延时最小。实验结果表明,所述算法在时延、线长和运行时间方面均优于VPR算法。
文摘The modelling, design and implementation of a high-speed programmable polyphase finite impulse response (FIR) filter with field programmable gate array (FPGA) technology are described. This FIR filter can run automatically according to the programmable configuration word including symmetry/asymmetry, odd/even taps, from 32 taps up to 256 taps. The filter with 12 bit signal and 12 bit coefficient word-length has been realized on a Xilinx VirtexⅡ-v1500 device and operates at the maximum sampling frequency of (160 MHz.)
基金Project supported by the Natural Science Foundation of Tianjin,China(Grant No.18JCYBJC87700)the Natural Science Foundation of China(Grant No.61603274)。
文摘Combing with the generalized Hamiltonian system theory,by introducing a special form of sinusoidal function,a class of n-dimensional(n=1,2,3)controllable multi-scroll conservative chaos with complicated dynamics is constructed.The dynamics characteristics including bifurcation behavior and coexistence of the system are analyzed in detail,the latter reveals abundant coexisting flows.Furthermore,the proposed system passes the NIST tests and has been implemented physically by FPGA.Compared to the multi-scroll dissipative chaos,the experimental portraits of the proposed system show better ergodicity,which have potential application value in secure communication and image encryption.
基金Supported by the Ministerial Level Advanced Research Foundation (SP240012)
文摘A flexible field programmable gate array based radar signal processor is presented. The radar signal processor mainly consists of five functional modules: radar system timer, binary phase coded pulse compression(PC), moving target detection (MTD), constant false alarm rate (CFAR) and target dots processing. Preliminary target dots information is obtained in PC, MTD, and CFAR modules and Nios I! CPU is used for target dots combination and false sidelobe target removing. Sys- tem on programmable chip (SOPC) technique is adopted in the system in which SDRAM is used to cache data. Finally, a FPGA-based binary phase coded radar signal processor is realized and simula- tion result is given.
基金supported by the National Natural Science Found-ation of China(No.61771041).
文摘In this paper,a fast-speed and real-time online rail inspection method based on half-cycle orthogonal power demodulation algorithm is proposed.For this method,the power characters of de-tection signal which represent the degree of rail track can be calculated using only half-cycle detec-tion signal because of the symmetry characteristic of detected sine signal and reference signal.The theoretical analysis,simulation results and experiment results show that the demodulation preci-sion of proposed method is almost equal to fast Fourier transform(FFT)demodulation method and orthogonal demodulation method,but has high demodulation efficiency and less FPGA resources cost.A high-speed experiment system based on three coils structured sensor is built for rail inspec-tion experiment at a moving speed of 200 km/h.The experiment results show that proposed meth-od is more effective for rail inspection and the time resolution of proposed method is double of clas-sic method that based on FFT and orthogonal.
基金Supported by the National Natural Science Foundation of China(61271113)
文摘A binary tree representation is designed in this paper for optimization of wave digital filter(WDF)implementation.To achieve this,an equivalent WDF model of the original circuit is converted into abinary tree representation at first.This WDF binary tree can then be transformed to several topologies with the same implication,since the WDF adaptors have a symmetrical behavior on their ports.Because the WDF implementation is related to field programmable gate array(FPGA)resource usage and the cycle time of emulation,choosing aproper binary tree topology for WDF implementation can help balance the complexity and performance quality of an emulation system.Both WDF-FPGA emulation and HSpice simulation on the same circuit are tested.There is no significant difference between these two simulations.However,in terms of time consumption,the WDF-FPGA emulation has an advantage over the other.Our experiment also demonstrates that the optimized WDF-FPGA emulation has an acceptable accuracy and feasibility.
基金Supported by the National High Technology and Development Program of China(2013AA1548)
文摘An internal single event upset(SEU)mitigation technique is proposed,which reads back the configuration frames from the static random access memory(SRAM)-based field programmable gate array(FPGA)through an internal port and compares them with those stored in the radiationhardened memory to detect and correct SEUs.Triple modular redundancy(TMR),which triplicates the circuit of the technique and uses majority voters to isolate any single upset within it,is used to enhance the reliability.Performance analysis shows that the proposed technique can satisfy the requirement of ordinary aerospace missions with less power dissipation,size and weight.The fault injection experiment validates that the proposed technique is capable of correcting most errors to protect spaceborne facilities from SEUs.
基金supported by the Natural Science Foundation of Shandong Province (No. ZR2022QA039)the Program of Qilu Young Scholars of Shandong University
文摘With the development of silicon photomultiplier(SiPM)technology,front-end electronics for SiPM signal processing have been highly sought after in various fields.A compact 64-channel front-end electronics(FEE)system achieved by fieldprogrammable gate array-based charge-to-digital converter(FPGA-QDC)technology was built and developed.The FEE consists of an analog board and FPGA board.The analog board incorporates commercial amplifiers,resistors,and capacitors.The FPGA board is composed of a low-cost FPGA.The electronics performance of the FEE was evaluated in terms of noise,linearity,and uniformity.A positron emission tomography(PET)detector with three different readout configurations was designed to validate the readout capability of the FEE for SiPM-based detectors.The PET detector was made of a 15×15 lutetium–yttrium oxyorthosilicate(LYSO)crystal array directly coupled with a SiPM array detector.The experimental results show that FEE can process dual-polarity charge signals from the SiPM detectors.In addition,it shows a good energy resolution for 511-keV gamma photons under the dual-end readout for the LYSO crystal array irradiated by a Na-22 source.Overall,the FEE based on FPGA-QDC shows promise for application in SiPM-based radiation detectors.
基金supported by the National Key R&D Plan(No.2016YFA0401900)
文摘A bunch arrival-time monitor(BAM) system,based on electro-optical intensity modulation scheme, is under study at Shanghai Soft X-ray Free Electron Laser.The aim of the study is to achieve high-precision time measurement for minimizing bunch fluctuations. A readout electronics is developed to fulfill the requirements of the BAM system. The readout electronics is mainly composed of a signal conditioning circuit, field-programmable gate array(FPGA), mezzanine card(FMC150), and powerful FPGA carrier board. The signal conditioning circuit converts the laser pulses into electrical pulse signals using a photodiode. Thereafter, it performs splitting and low-noise amplification to achieve the best voltage sampling performance of the dual-channel analog-to-digital converter(ADC) in FMC150. The FMC150 ADC daughter card includes a 14-bit 250 Msps dual-channel high-speed ADC,a clock configuration, and a management module. The powerful FPGA carrier board is a commercial high-performance Xilinx Kintex-7 FPGA evaluation board. To achieve clock and data alignment for ADC data capture at a high sampling rate, we used ISERDES, IDELAY, and dedicated carry-in resources in the Kintex-7 FPGA. This paper presents a detailed development of the readout electronics in the BAM system and its performance.