Funding: This work was supported by the National Key Research and Development Program of China (2021YFB2900603) and the National Natural Science Foundation of China (61831008).
Abstract: A dynamic multi-beam resource allocation algorithm for large low Earth orbit (LEO) constellations based on on-board distributed computing is proposed in this paper. The allocation is a combinatorial optimization process under a series of complex constraints, and it is important for improving the match between resources and requirements. A complex algorithm is not feasible because LEO on-board resources are limited. The proposed genetic algorithm (GA), based on a two-dimensional individual model and an uncorrelated single paternal inheritance method, is designed to support distributed computation and thus enhance the feasibility of on-board deployment. A distributed system composed of eight embedded devices is built to verify the algorithm. A typical scenario is constructed in this system to evaluate the resource allocation process, the algorithm's mathematical model, the trigger strategy, and the distributed computation architecture. According to the simulation and measurement results, the proposed algorithm can produce an allocation result for more than 1500 tasks within 14 s, with a success rate above 91% in a typical scene. The response time is reduced by 40% compared with the conventional GA.
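The abstract's "uncorrelated single paternal inheritance" suggests offspring derived from a single parent (mutation only, no crossover), which is what makes fully independent, distributed sub-populations possible. The sketch below illustrates that reproduction scheme on a toy task-to-beam assignment problem; the individual encoding, fitness function, and all parameters are illustrative assumptions, not the paper's actual model.

```python
import random

def make_individual(tasks, beams, rng):
    # individual[t] = index of the beam assigned to task t
    return [rng.randrange(beams) for _ in range(tasks)]

def fitness(ind, beams):
    # toy objective: balance the load across beams (higher is better)
    load = [0] * beams
    for b in ind:
        load[b] += 1
    return -max(load)  # minimise the most-loaded beam

def reproduce(parent, beams, rate, rng):
    # single-parent inheritance: offspring copies one parent,
    # each gene mutating independently with probability `rate`
    return [rng.randrange(beams) if rng.random() < rate else g for g in parent]

def evolve(tasks=30, beams=4, pop=20, gens=50, rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [make_individual(tasks, beams, rng) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda i: fitness(i, beams), reverse=True)
        elite = population[: pop // 2]
        # no crossover: each child has exactly one parent, so islands
        # of the population can evolve on separate devices without exchange
        population = elite + [reproduce(rng.choice(elite), beams, rate, rng)
                              for _ in range(pop - len(elite))]
    return max(population, key=lambda i: fitness(i, beams))

best = evolve()
```

Because no step ever combines two individuals, each embedded device can evolve its own sub-population and only the best individuals need to be compared at the end.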
Abstract: Fuzzy technology is a newly developed discipline based on fuzzy mathematics. In recent years, it has been successfully applied in many areas, such as process control, diagnosis, evaluation, decision making, and scheduling, and especially in simulation, where accurate mathematical models cannot be established or are very hard to establish. In this paper, to meet the demands of fuzzy simulation, two fuzzy nets are first presented, which are well suited to modeling parallel or concurrent systems with fuzzy behavior. Then, the concept of active simulation is introduced, in which the simulation model not only exhibits fuzzy behavior but can also actively perform many useful actions through a unified approach when specified events occur, such as automatic warning, real-time monitoring, simulation result checking, simulation model self-adaptation, error recovery, simulation path tracing, system state inspection, and exception handling. The simulation model described by this modeling tool is driven concurrently by a network interpreter and an event monitor, both of which can be implemented in software or hardware. Some interesting applications are also given in the paper.
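One common way to realize a fuzzy net is as a fuzzy Petri net, where places carry truth degrees and a transition propagates the minimum of its input degrees, scaled by a certainty factor; an event monitor watching the marking then gives the "active" behavior (e.g. automatic warning). The sketch below assumes exactly this min-composition semantics and an illustrative warning threshold; the paper's actual net definitions may differ.

```python
def fire(marking, transition):
    # marking: {place: truth degree in [0, 1]}
    # transition: (input places, output place, certainty factor)
    inputs, output, cf = transition
    degree = min(marking[p] for p in inputs) * cf
    new = dict(marking)
    new[output] = max(new.get(output, 0.0), degree)
    return new

def monitor(marking, threshold=0.8):
    # "active simulation": report places whose degree exceeds a limit,
    # e.g. to trigger an automatic warning while the model runs
    return [p for p, d in marking.items() if d >= threshold]

m = {"temp_high": 0.9, "pressure_high": 0.85}
t = (("temp_high", "pressure_high"), "alarm", 1.0)
m = fire(m, t)          # alarm fires with degree min(0.9, 0.85) = 0.85
warnings = monitor(m)   # places over the warning threshold
```

The point of the split between `fire` (the network interpreter) and `monitor` (the event monitor) is that the two run concurrently over the same marking, which matches the driving mechanism the abstract describes.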
Abstract: Reinforcement learning with human feedback (RLHF) is the mainstream method for aligning large language models (LLMs), but its core optimization algorithm, proximal policy optimization (PPO), suffers from significant efficiency problems. PPO consists of three interrelated stages, generation, inference, and training, each with different computational characteristics. However, existing RLHF parallel frameworks execute all PPO stages sequentially under the same parallel strategy, which causes two problems: first, the generation stage cannot fully utilize the computing resources, degrading overall efficiency; second, strictly serial execution between stages fails to exploit the potential parallelism. To address these problems, a new RLHF parallel framework, Pipe-RLHF, is proposed. The framework adaptively determines the optimal parallel strategy for each stage according to its computational characteristics, breaks the existing serial-stage paradigm, and uses an asynchronous PPO algorithm to exploit inter-stage parallelism. Specifically, a delayed inter-batch pipeline parallel method is proposed for the PPO generation stage, significantly improving that stage's resource utilization; asynchronous PPO is then used to relax the dependencies between stages, applying inter-stage parallelism to accelerate PPO; finally, for the overall optimization of PPO, a hierarchical parallel strategy space is constructed, together with a set of optimization algorithms that search this space for the optimal solution. Performance evaluations on multiple large language models show that Pipe-RLHF achieves a speedup of up to 3.7x over existing methods, validating the effectiveness and superiority of the framework.
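The gain from inter-stage parallelism can be seen with a simple timing model: once generation of micro-batch i+1 is allowed to overlap with inference and training on micro-batch i (accepting a one-step-stale policy), the per-step cost drops from the sum of the stages to the maximum of the overlapped parts. The stage durations below are illustrative, not measurements from the paper.

```python
def serial_time(n, gen, infer, train):
    # existing frameworks: every micro-batch runs all three stages in order
    return n * (gen + infer + train)

def pipelined_time(n, gen, infer, train):
    # asynchronous PPO: generation of batch i+1 overlaps with
    # inference + training of batch i, so each steady-state step
    # costs the max of the two overlapping pipelines
    stage = max(gen, infer + train)
    return gen + (n - 1) * stage + infer + train

s = serial_time(8, gen=3.0, infer=1.0, train=2.0)     # 8 * 6 = 48
p = pipelined_time(8, gen=3.0, infer=1.0, train=2.0)  # 3 + 7*3 + 3 = 27
```

With these (made-up) stage times the overlap alone yields roughly a 1.8x speedup; Pipe-RLHF's larger reported gains additionally come from choosing a better parallel strategy per stage.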
Abstract: To study the potential of heterogeneous multi-processor systems-on-chip (multi-processor system on chip, MPSoC) for computation-intensive parallel workloads, this paper designs and implements a dynamic scheduling coprocessor for heterogeneous multi-core systems, targeting task-level parallel applications with coarse-grained data characteristics. The coprocessor employs mechanisms such as on-chip caching, multi-level write-back management of task outputs, automatic task mapping, and out-of-order execution of communication tasks. Experimental results show that the coprocessor not only achieves its basic design goals, such as task-level out-of-order execution, but also has very low scheduling overhead: compared with a scheduler based on a dynamic scoreboard algorithm, the runtime of multiple sub-aperture range compression algorithms is reduced by up to 17.13%. The results demonstrate that the proposed dynamic scheduling coprocessor can effectively improve task scheduling in the target scenarios.
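The core of task-level out-of-order execution is that a task may issue as soon as its data dependencies are complete, even if an older task in program order is still blocked. The sketch below shows that dispatch rule on a tiny, acyclic task graph of my own invention; the coprocessor's actual hardware scoreboard and mapping logic are of course far richer.

```python
def oo_dispatch(tasks, deps):
    # tasks: task ids in program order
    # deps:  {task: set of prerequisite tasks}; graph assumed acyclic
    done, order = set(), []
    pending = list(tasks)
    while pending:
        # a task is ready once all of its prerequisites have completed
        ready = [t for t in pending if deps.get(t, set()) <= done]
        t = ready[0]          # issue the oldest ready task
        order.append(t)
        done.add(t)
        pending.remove(t)
    return order

# C is older than B in program order but depends on B,
# so B bypasses C: execution order becomes A, B, C
order = oo_dispatch(["A", "C", "B"], {"C": {"B"}})
```

An in-order scheduler would stall at C and serialize the whole sequence; letting B bypass it is exactly the kind of reordering that hides communication latency in the coprocessor's target workloads.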
Abstract: To address the long runtimes and slow simulation of physically based distributed hydrological models on large basins and long time series, GPU-based parallel computing is introduced to parallelize the runoff-generation process of the distributed hydrological model WEP-L (water and energy transfer processes in large river basins). Taking the Poyang Lake basin as the experimental area, the algorithm's performance is tested on an NVIDIA RTX A4000 with compute capability 8.6. The results show that the proposed GPU-based parallel algorithm for the distributed hydrological model achieves a good speedup: the closer the total number of threads is to the number of partitioned sub-basins (the computational task count), the better the parallel performance. With 8712 sub-basin units in the WEP-L model of the experimental basin, the maximum speedup is about 2.5; as the task count grows, the speedup increases further, reaching 3.5 when the number of sub-basin units is increased to 24897. This indicates that GPU parallel algorithms have good potential for large-scale distributed hydrological model computation.
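The observation that speedup improves as the task count approaches (a multiple of) the thread count has a simple explanation: with a fixed thread pool, the sub-basins execute in rounds, and a partially filled final round leaves threads idle. The sketch below computes that utilisation effect; the thread-pool size is an illustrative assumption, not the A4000's actual configuration.

```python
from math import ceil

def utilisation(subbasins, threads):
    # sub-basins run in ceil(subbasins / threads) rounds;
    # utilisation is the fraction of thread-slots doing useful work
    rounds = ceil(subbasins / threads)
    return subbasins / (rounds * threads)

u1 = utilisation(8712, 8192)    # 2 rounds, second round mostly idle
u2 = utilisation(24897, 8192)   # 4 rounds, much better filled
```

Under this model, scaling the basin from 8712 to 24897 sub-basin units raises utilisation from about 53% to about 76%, which is consistent in direction with the measured speedup growing from 2.5 to 3.5.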
Abstract: To better fulfil cloud computing's aim of providing cheap, on-demand services, a task scheduling algorithm based on fuzzy clustering and a two-level scheduling model (FCTLBS, fuzzy clustering and two level based task scheduling) is proposed. The new algorithm defines two scheduling levels: user scheduling and task scheduling. Resources are first clustered by performance using fuzzy clustering; resource preferences are then computed from task parameters, so that tasks with different preferences select from different clusters. This narrows the selection range and better reflects task requirements. Simulation results show that the proposed algorithm outperforms comparable algorithms.
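The two-level idea can be sketched as: cluster resources by a fuzzy performance membership, then let each task pick only within the cluster nearest its computed preference, so the per-task search space shrinks from all resources to one cluster. The membership function, features, and numbers below are illustrative assumptions, not the FCTLBS formulation.

```python
def membership(x, centre, width=1.0):
    # simple triangular fuzzy membership in [0, 1]
    return max(0.0, 1.0 - abs(x - centre) / width)

def cluster_resources(resources, centres):
    # level 1: assign each resource to the centre with highest membership
    groups = {c: [] for c in centres}
    for name, perf in resources.items():
        best = max(centres, key=lambda c: membership(perf, c))
        groups[best].append(name)
    return groups

def schedule(task_pref, groups):
    # level 2: a task only searches the cluster nearest its preference
    centre = min(groups, key=lambda c: abs(c - task_pref))
    return groups[centre][0]

resources = {"r1": 0.2, "r2": 0.9, "r3": 0.8, "r4": 0.1}
groups = cluster_resources(resources, centres=[0.15, 0.85])
chosen = schedule(task_pref=0.95, groups=groups)   # a compute-hungry task
```

A task preferring high performance (0.95) is steered to the high-performance cluster and never even inspects the low-performance resources, which is the source of the claimed reduction in selection range.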