
Detailed and clock-driven simulation for HPC interconnection network

Abstract: The performance and energy consumption of high performance computing (HPC) interconnection networks are of great significance to the supercomputer as a whole, and building an HPC interconnection network simulation platform is very important for research on HPC software and hardware technologies. To effectively evaluate the performance and energy consumption of HPC interconnection networks, this article designs and implements a detailed, clock-driven HPC interconnection network simulation platform called HPC-NetSim. HPC-NetSim uses application-driven workloads and inherits the characteristics of a detailed and flexible cycle-accurate network simulator. It also offers a large set of configurable network parameters for topology and routing, and supports on/off states for routers. We compare the simulated execution time with the real execution time on a Tianhe-2 subsystem; the mean error is only 2.7%. In addition, we simulate network behavior under different network structures and low-power modes, and the results are consistent with the theoretical analyses.
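To make the term "clock-driven" concrete, the following is a minimal sketch of a simulation loop that advances every router by one clock cycle per iteration, including a powered-off state that stalls traffic. It is not HPC-NetSim's or BookSim's code: the `Router` class, the `simulate` function, the linear chain topology, and the one-packet-per-cycle forwarding rule are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical toy router: forwards at most one packet per cycle when on."""
    powered_on: bool = True
    queue: deque = field(default_factory=deque)


def simulate(routers, injections, max_cycles=1000):
    """Advance a linear chain of routers one clock cycle at a time.

    injections maps a cycle number to the packet ids injected at router 0.
    Returns {packet_id: cycle in which the packet left the last router}.
    """
    delivered = {}
    total = sum(len(pkts) for pkts in injections.values())
    for cycle in range(max_cycles):
        # Inject the packets scheduled for this cycle.
        for pkt in injections.get(cycle, []):
            routers[0].queue.append(pkt)
        # Visit routers from last to first so a packet moves one hop per cycle.
        for i in reversed(range(len(routers))):
            router = routers[i]
            if not router.powered_on or not router.queue:
                continue  # a powered-off or idle router does nothing this cycle
            pkt = router.queue.popleft()
            if i + 1 < len(routers):
                routers[i + 1].queue.append(pkt)
            else:
                delivered[pkt] = cycle
        if len(delivered) == total:
            break  # stop early once every injected packet has arrived
    return delivered


if __name__ == "__main__":
    chain = [Router() for _ in range(3)]
    print(simulate(chain, {0: ["a", "b"]}))
```

With three powered-on routers and two packets injected at cycle 0, the packets leave the chain at cycles 2 and 3. If the middle router is constructed with `powered_on=False`, nothing is delivered before `max_cycles` is reached. This is the kind of effect a low-power mode has on simulated execution time, though a real simulator models wake-up latency, flow control, and routing in far more detail.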
Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2016, Issue 5, pp. 797-811 (15 pages). Chinese journal title: 中国计算机科学前沿(英文版)
Keywords: high performance computing, clock-driven simulation, interconnection network, BookSim
About the authors:

Wenhao Zhou received the BS and MS degrees from the School of Computer, National University of Defense Technology, China in 2013 and 2015, respectively. His research interests focus on energy-aware HPC interconnection networks and parallel software frameworks.

Juan Chen received the PhD degree from the Computer Department, National University of Defense Technology (NUDT), China in 2007. She is now an associate professor in the Key Laboratory of High Performance Computing at NUDT. Her research interests focus on supercomputer systems, energy-aware interconnection network design, and parallel software frameworks.

Chen Cui received the BS degree from the School of Electronics Engineering and Computer Science at Peking University, China in 2015, and is now an MS student at the National University of Defense Technology, China. His research interests focus on large-scale parallel numerical simulation and parallel software frameworks.

Qian Wang received the BS degree from the School of Computer at the National University of Defense Technology (NUDT), China in 2011, and is now a PhD student at NUDT. Her research interests focus on large-scale parallel numerical simulation and parallel software frameworks.

Dezun Dong received the BS, MS, and PhD degrees from the National University of Defense Technology (NUDT), China in 2002, 2004, and 2010, respectively. He is an associate professor in the College of Computer, NUDT. His research interests range across high performance computer systems, high-speed interconnection networks, wireless networks, and distributed computing algorithms. Currently, he focuses on performance evaluation of high-performance interconnection networks for supercomputers and data centers. He is a member of the ACM, IEEE, and CCF.

Yuhua Tang received her BS and MS degrees from the Computer Department, National University of Defense Technology (NUDT), China in 1983 and 1986, respectively. She is now a professor in the National Laboratory for Parallel and Distributed Processing at NUDT. Her research interests include supercomputer architecture and core router design.