Detailed and clock-driven simulation for HPC interconnection network

Abstract The performance and energy consumption of high performance computing (HPC) interconnection networks are of great significance to the supercomputer as a whole, and building an HPC interconnection network simulation platform is very important for research on HPC software and hardware technologies. To effectively evaluate the performance and energy consumption of HPC interconnection networks, this article designs and implements a detailed, clock-driven HPC interconnection network simulation platform called HPC-NetSim. HPC-NetSim uses application-driven workloads and inherits the characteristics of a detailed and flexible cycle-accurate network simulator. In addition, it offers a large set of configurable network parameters for topology and routing, and supports router on/off states. We compare the simulated execution time with the real execution time on a Tianhe-2 subsystem; the mean error is only 2.7%. We also simulate network behavior under different network structures and low-power modes, and the results are consistent with the theoretical analyses.

Authors Wenhao ZHOU, Juan CHEN, Chen CUI, Qian WANG, Dezun DONG, Yuhua TANG

Affiliation State Key Laboratory of High Performance Computing Science and Technology on Parallel and Distributed Processing Laboratory

Source Frontiers of Computer Science (中国计算机科学前沿, English edition; indexed in SCIE, EI, CSCD), 2016, Issue 5, pp. 797-811, 15 pages

Keywords high performance computing, clock-driven simulation, interconnection network, BookSim

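Classification code TP393 [Automation and Computer Technology: Computer Application Technology]; TP393.01 [Automation and Computer Technology: Computer Application Technology]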

Author biographies
Wenhao Zhou received the BS and MS degrees from the School of Computer, National University of Defense Technology, China in 2013 and 2015. His research interests focus on energy-aware HPC interconnection networks and parallel software frameworks.
Juan Chen received the PhD degree from the Computer Department, National University of Defense Technology (NUDT), China in 2007. She is now an associate professor in the Key Laboratory of High Performance Computing at NUDT. Her research interests focus on supercomputer systems, energy-aware interconnection network design, and parallel software frameworks.
Chen Cui received the BS degree from the School of Electronics Engineering and Computer Science at Peking University, China in 2015, and is now an MS student at National University of Defense Technology, China. His research interests focus on large-scale parallel numerical simulation and parallel software frameworks.
Qian Wang received the BS degree from the School of Computer at National University of Defense Technology (NUDT), China in 2011, and is now a PhD student at NUDT. Her research interests focus on large-scale parallel numerical simulation and parallel software frameworks.
Dezun Dong received the BS, MS, and PhD degrees from the National University of Defense Technology (NUDT), China in 2002, 2004, and 2010, respectively. He is an associate professor in the College of Computer, NUDT. His research interests range across high performance computer systems, high-speed interconnect networks, wireless networks, and distributed computing algorithms. Currently, he focuses on performance evaluation of high-performance interconnection networks for supercomputers and data centers. He is a member of the ACM, IEEE, and CCF.
Yuhua Tang received her BS and MS degrees from the Computer Department, National University of Defense Technology (NUDT), China in 1983 and 1986, respectively. She is now a professor in the National Laboratory for Parallel and Distributed Processing at NUDT. Her research interests include supercomputer architecture and core router design.
