Abstract
As one of the core technologies of the open-source cloud computing platform Hadoop, the MapReduce job processing framework and its job scheduling algorithm have a significant influence on the performance of the entire system, and data locality is an important criterion for judging the quality of a job scheduling algorithm. This paper first introduces and analyzes the basic principles of MapReduce, its job processing mechanism, and its job scheduling mechanism, including the advantages and disadvantages of that mechanism with respect to data locality. Second, to address the insufficient consideration of data locality in the native job scheduling algorithms, and building on the feasibility and advantages of data prefetching, a Hadoop MapReduce job scheduling algorithm based on resource prefetching is designed and implemented, so that jobs execute more efficiently.
Source
Software (《软件》)
2015, No. 2, pp. 64-68 (5 pages)
About the authors
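Chen Ruofei (陈若飞), male, master's student, Department of Engineering, School of Computer and Information Technology, Beijing Jiaotong University. Main research interest: cloud computing.
Corresponding author: Jiang Wenhong (姜文红), associate professor. Main research interests: grid computing, cloud computing.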