Abstract
To address the memory bandwidth bottleneck in FPGA hardware accelerators for convolutional neural network algorithms, this paper proposes a Secondary Cache-Row Recombination (SC-RR) scheduling strategy built on a secondary cache. The secondary cache mechanism is established by analyzing SDRAM performance, the principles of FPGA hardware acceleration, and the memory bandwidth bottleneck. The mechanism serves the access requests that accumulate during acceleration and merges requests that target the same Bank/Row, which reduces the extra overhead of Active and Precharge operations. Experimental results show that under the SC-RR scheduling strategy, memory access time is reduced by 32.87%, power consumption is reduced by 31.71%, and effective bandwidth utilization rises to 91.3%. At comparable performance, hardware resource consumption is reduced by 83.8%, which meets the design requirements.
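The benefit of merging same Bank/Row requests can be illustrated with a small simulation. The sketch below is not the authors' SC-RR implementation. It assumes a simplified open-page SDRAM model in which each bank keeps one row open, requests are hypothetical `(bank, row, column)` tuples, and reordering inside a buffer window is always legal (read/write dependencies are ignored). The function names `count_row_commands` and `regroup_by_bank_row` are made up for this illustration.

```python
def count_row_commands(requests):
    """Count ACTIVE and PRECHARGE commands when requests are served in order.

    Assumed open-page model: every bank keeps its last row open. Accessing a
    different row in the same bank costs one PRECHARGE (close the old row)
    plus one ACTIVE (open the new row). A row hit costs neither.
    """
    open_row = {}  # bank -> row currently open in that bank
    activates = 0
    precharges = 0
    for bank, row, _column in requests:
        current = open_row.get(bank)
        if current == row:
            continue  # row hit: no extra command
        if current is not None:
            precharges += 1  # a different row is open and must be closed
        activates += 1
        open_row[bank] = row
    return activates, precharges


def regroup_by_bank_row(requests, window):
    """Reorder requests inside fixed-size windows so that requests to the
    same (bank, row) become adjacent. Groups keep first-arrival order."""
    reordered = []
    for start in range(0, len(requests), window):
        groups = {}  # (bank, row) -> requests, in insertion order
        for request in requests[start:start + window]:
            bank, row, _column = request
            groups.setdefault((bank, row), []).append(request)
        for group in groups.values():
            reordered.extend(group)
    return reordered


# Hypothetical access pattern that alternates between two rows per bank.
requests = [
    (0, 1, 0), (0, 2, 0), (0, 1, 1), (0, 2, 1),
    (1, 5, 0), (1, 6, 0), (1, 5, 1), (1, 6, 1),
]

in_order = count_row_commands(requests)
merged = count_row_commands(regroup_by_bank_row(requests, window=8))
print("in order:", in_order)  # (8, 6): 8 ACTIVE, 6 PRECHARGE
print("merged:  ", merged)    # (4, 2): 4 ACTIVE, 2 PRECHARGE
```

In this toy pattern, grouping halves the ACTIVE commands and cuts the PRECHARGE commands from 6 to 2. The figures reported in the abstract come from the authors' hardware experiments, not from a model like this one.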
Authors
Du Zhongwen (杜忠文)
Li Genglin (李庚霖)
Jiang Han (蒋菡)
Chu Jiangheng (褚江恒)
Wu Jun (伍俊)
Du Zhongwen; Li Genglin; Jiang Han; Chu Jiangheng; Wu Jun (School of Optoelectronic Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Chongqing Key Laboratory of Cross-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400722, China; Chongqing College, University of Chinese Academy of Sciences, Chongqing 400714, China)
Source
Electronic Measurement Technology (《电子测量技术》)
Peking University Core Journal (北大核心)
2023, Issue 14, pp. 37-42 (6 pages)
Funding
Chongqing Talents Program, Innovation Leading Talent Project (CQYC201903020)
Chongqing Science Fund for Distinguished Young Scholars (cstc2019jcyjjqX0017)
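About the authors
Du Zhongwen, master's student; main research interest: design of FPGA-based QSPI communication systems. E-mail: 942333266@qq.com. Corresponding author: Wu Jun, doctoral student and senior engineer; main research interests: infrared image processing system design, infrared target detection, and applied research on infrared image enhancement algorithms. E-mail: wujun@cigit.ac.cn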