Abstract
Optimization strategies for porting applications to CPU-GPU heterogeneous parallel systems are scattered and lack a global guiding principle. To address this problem, a profiling-based global performance optimization method is proposed. The method consists of an optimization strategy library, a profiling tool library, and a strategy configuration module. The optimization strategy library divides the performance optimization process into three levels: memory access, kernel speedup, and data partitioning. The profiling tool library provides a matching three-level profiling mechanism that collects profile information at run time, and the strategy configuration module uses this information to guide the user in choosing a suitable strategy at each level. Experimental results show that the method gives clear guidance for the global optimization of applications ported to CPU-GPU heterogeneous parallel systems. For matrix multiplication and the fast Fourier transform, performance improves markedly, and the final performance is up to about 30% higher relative to memory-access-level optimization.
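The three-level structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the metric names, thresholds, and strategy names below are all hypothetical, chosen only to show how profile information gathered at each level might drive the choice of a strategy at that level.

```python
from dataclasses import dataclass


# All fields, thresholds, and strategy names are hypothetical illustrations;
# the abstract does not specify them.
@dataclass
class Profile:
    uncoalesced_ratio: float  # memory-access level: fraction of uncoalesced loads
    occupancy: float          # kernel-speedup level: achieved GPU occupancy, 0..1
    cpu_time: float           # data-partition level: CPU time per unit of work
    gpu_time: float           # data-partition level: GPU time per unit of work


def choose_strategies(p: Profile) -> dict:
    """Pick one strategy per optimization level from profiled metrics."""
    plan = {}
    # Level 1: memory access (assumed threshold of 0.3).
    plan["memory_access"] = (
        "stage_in_shared_memory" if p.uncoalesced_ratio > 0.3 else "keep_layout"
    )
    # Level 2: kernel speedup (assumed threshold of 0.5).
    plan["kernel_speedup"] = (
        "retune_block_size" if p.occupancy < 0.5 else "keep_launch_config"
    )
    # Level 3: data partition. Give each device a share of the data
    # proportional to its measured speed (the inverse of its time per unit).
    gpu_speed = 1.0 / p.gpu_time
    cpu_speed = 1.0 / p.cpu_time
    plan["gpu_share"] = gpu_speed / (gpu_speed + cpu_speed)
    return plan
```

For example, a profile in which the GPU is four times faster than the CPU per unit of work would assign the GPU 80% of the data under this proportional rule.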
Source
Journal of Xi'an Jiaotong University (《西安交通大学学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2012, Issue 2, pp. 17-23 (7 pages)
Funding
National High Technology Research and Development Program of China (2009AA01A135, 2009AA01Z108)
Fundamental Research Funds for the Central Universities (08142007)
Keywords
CPU-GPU heterogeneous parallel system
global optimization
three-level optimization
three-level profiling
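About the authors
Zhang Bao (张保, born 1987), male, master's student;
Dong Xiaoshe (董小社, corresponding author), male, professor and doctoral supervisor.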