A neural network acceleration system based on PYNQ (cited by: 1)
Abstract: Traditional convolutional neural networks are computationally expensive and slow, which makes them hard to deploy on embedded mobile devices. To address this, the paper proposes a neural network acceleration system that combines an FPGA with an ARM processor, using a ZYNQ chip as the main controller. The PL (programmable logic) part of the system is developed in pure RTL: the input and output layers of the convolutional layer are fully parallelized and the convolution window is fully unrolled, so that 81 multiplications complete in a single clock cycle, while the pooling and fully connected layers are optimized with pipelining. Compared with the common approach of optimizing with high-level synthesis tools, the system designs each module of the convolutional neural network from scratch in RTL and applies fine-grained optimization, which avoids generating redundant logic and makes full use of on-chip resources. For a network model for MNIST handwritten digit recognition, the system reaches 95% DSP utilization; at a clock frequency of 100 MHz, the hardware processes a single image frame in only 0.81 ms, with a power consumption of only 1.601 W.
Authors: LAI Jiawei; WEI Hongjian; SUN Kexue; WANG Yan (College of Electronic and Optical Engineering & College of Flexible Electronics (Future Technology), Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Nation-Local Joint Project Engineering Lab of RF Integration & Micropackage, Nanjing 210023, China)
Source: Electronic Design Engineering, 2024, No. 17, pp. 16-21 (6 pages)
Funding: Jiangsu Province Graduate Research Innovation Program (SJCX22_0255).
Keywords: PYNQ; ARM processor; neural network; Field Programmable Gate Array (FPGA); hardware accelerator
About the authors: LAI Jiawei (b. 1999), male, from Pingxiang, Jiangxi; master's student; research interest: intelligent signal processing. Corresponding author: SUN Kexue, sunkx@njupt.edu.cn.
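
The figures quoted in the abstract (100 MHz clock, 0.81 ms per frame, 1.601 W, 81 multiplications per cycle) can be cross-checked with some simple arithmetic. The Python sketch below is not taken from the paper; the function name `derived_figures` and the dictionary keys are hypothetical. It also rests on two assumptions the abstract does not state: that the reported power is drawn constantly for the whole frame, and, for the multiplication bound only, that the multiplier array is busy in every cycle.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 100e6         # 100 MHz clock frequency
FRAME_TIME_S = 0.81e-3   # 0.81 ms hardware processing time per frame
POWER_W = 1.601          # reported power consumption
MULTS_PER_CYCLE = 81     # multiplications completed per clock cycle


def derived_figures(clock_hz, frame_time_s, power_w, mults_per_cycle):
    """Derive per-frame quantities from the reported numbers (hypothetical helper)."""
    cycles = round(frame_time_s * clock_hz)   # clock cycles spent on one frame
    return {
        "cycles_per_frame": cycles,
        "frames_per_second": 1.0 / frame_time_s,
        # Assumption: the reported power is constant over the whole frame.
        "energy_per_frame_mj": power_w * frame_time_s * 1e3,
        # Upper bound only: assumes all multipliers are busy in every cycle,
        # which the abstract does not claim.
        "max_mults_per_frame": cycles * mults_per_cycle,
    }


figures = derived_figures(CLOCK_HZ, FRAME_TIME_S, POWER_W, MULTS_PER_CYCLE)
for name, value in figures.items():
    print(f"{name}: {value}")
```

Under these assumptions, one frame takes 81,000 clock cycles, which caps the work at 6,561,000 multiplications per frame; the real count for the MNIST model would be lower whenever the array idles during pooling, fully connected layers, or data movement.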