Decentralized robust stabilization of discrete-time fuzzy large-scale systems with parametric uncertainties is considered. The uncertain fuzzy large-scale system consists of N interconnected T-S fuzzy subsystems, and the parametric uncertainties are unknown but norm-bounded. Based on Lyapunov stability theory and the decentralized control theory of large-scale systems, a design scheme for decentralized parallel distributed compensation (DPDC) fuzzy controllers that ensures asymptotic stability of the whole fuzzy large-scale system is proposed. The existence conditions for these controllers take the form of LMIs. Finally, a numerical simulation example is given to show the utility of the proposed method.
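For concreteness, a generic formulation of the i-th uncertain discrete-time T-S fuzzy subsystem and its DPDC law is sketched below; the rule structure, interconnection term, and norm-bounded uncertainty decomposition shown here are standard in this literature and are illustrative assumptions, not reproduced from the paper itself.

```latex
\[
\begin{aligned}
x_i(k+1) &= \sum_{l=1}^{r_i} h_{il}\bigl(z_i(k)\bigr)
  \Bigl[ \bigl(A_{il} + \Delta A_{il}\bigr) x_i(k)
       + \bigl(B_{il} + \Delta B_{il}\bigr) u_i(k)
       + \sum_{j \ne i} \bar{A}_{ij}\, x_j(k) \Bigr],\\[2pt]
\bigl[\,\Delta A_{il} \;\; \Delta B_{il}\,\bigr]
  &= D_{il}\, F_{il}(k)\, \bigl[\,E_{1il} \;\; E_{2il}\,\bigr],
  \qquad F_{il}^{\mathsf T}(k)\, F_{il}(k) \le I
  \quad \text{(norm-bounded uncertainty)},\\[2pt]
u_i(k) &= \sum_{l=1}^{r_i} h_{il}\bigl(z_i(k)\bigr)\, K_{il}\, x_i(k)
  \qquad \text{(DPDC law; gains } K_{il} \text{ obtained from the LMI conditions)}.
\end{aligned}
\]
```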
In this paper, a class of real-time parallel combined methods (RTPCM) for the digital simulation of a partitioned large system is presented. By combining parallelism across the system with parallelism across the method, stiff and non-stiff subsystems are solved in parallel on a parallel computer by a parallel Rosenbrock method and a parallel Runge-Kutta (RK) method, respectively. Their construction, convergence, and numerical stability are discussed, and digital simulation experiments are conducted.
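To illustrate the partitioned idea (not the authors' RTPCM schemes themselves), the minimal sketch below integrates a two-equation test problem in which the stiff subsystem takes a linearly implicit Rosenbrock-Euler step and the non-stiff subsystem takes a classical RK4 step, with coupling states frozen at the start of each macro step so the two substeps can run concurrently. The test problem, step sizes, and one-stage methods are assumptions made purely for illustration.

```python
# Parallelism across the system for a partitioned model: the stiff subsystem takes a
# linearly implicit (Rosenbrock-Euler) step while the non-stiff subsystem takes an
# explicit RK4 step; coupling values are frozen at the start of each macro step.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

LAM = 1000.0                      # stiffness parameter of subsystem 1

def f_stiff(t, y1, y2):           # stiff subsystem: fast decay toward cos(t), weak coupling
    return -LAM * (y1 - np.cos(t)) + 0.1 * y2

def f_nonstiff(t, y2, y1):        # non-stiff subsystem driven by subsystem 1
    return -y2 + y1

def rosenbrock_euler_step(t, y1, y2_frozen, h):
    """One linearly implicit step: y+ = y + h (1 - h J)^(-1) f(y)."""
    J = -LAM                                  # Jacobian d f_stiff / d y1
    return y1 + h * f_stiff(t, y1, y2_frozen) / (1.0 - h * J)

def rk4_step(t, y2, y1_frozen, h):
    """Classical explicit RK4 step for the non-stiff subsystem."""
    k1 = f_nonstiff(t,         y2,              y1_frozen)
    k2 = f_nonstiff(t + h / 2, y2 + h / 2 * k1, y1_frozen)
    k3 = f_nonstiff(t + h / 2, y2 + h / 2 * k2, y1_frozen)
    k4 = f_nonstiff(t + h,     y2 + h * k3,     y1_frozen)
    return y2 + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(t_end=1.0, h=1e-3):
    t, y1, y2 = 0.0, 0.0, 1.0
    with ThreadPoolExecutor(max_workers=2) as pool:
        while t < t_end:
            # both substeps use the state frozen at time t, so they can run in parallel
            fut1 = pool.submit(rosenbrock_euler_step, t, y1, y2, h)
            fut2 = pool.submit(rk4_step, t, y2, y1, h)
            y1, y2 = fut1.result(), fut2.result()
            t += h
    return y1, y2

if __name__ == "__main__":
    print(simulate())
```

Freezing the coupling variables over one macro step is what decouples the subsystems and exposes the parallelism across the system; the coupling interpolation used in practical combined methods is omitted here.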
Reinforcement learning with human feedback (RLHF) is the mainstream approach to aligning large language models (LLMs), yet its core optimization algorithm, proximal policy optimization (PPO), suffers from a pronounced efficiency problem. PPO consists of three interrelated stages, namely generation, inference, and training, each with different computational characteristics. Existing RLHF parallel frameworks, however, execute all PPO stages sequentially under the same parallel strategy, which causes two problems: first, the generation stage cannot fully utilize the compute resources, hurting overall efficiency; second, the stages run strictly serially, so potential inter-stage parallelism goes unexploited. To address these problems, a new RLHF parallel framework, Pipe-RLHF, is proposed. The framework adaptively determines the optimal parallel strategy for each stage according to its computational characteristics, breaks the existing stage-serial paradigm, and adopts an asynchronous PPO algorithm to exploit parallelism between stages. Specifically, a delayed inter-batch pipeline parallel method is proposed for the PPO generation stage, which significantly improves resource utilization in that stage; asynchronous PPO is then used to relax the dependencies between stages and apply inter-stage parallelism to accelerating PPO; finally, for the end-to-end optimization of PPO, a hierarchical parallel strategy space is constructed together with an optimization algorithm that searches this space for the optimal solution. Performance evaluations on several large language models show that Pipe-RLHF achieves up to a 3.7x speedup over existing methods, confirming the effectiveness and superiority of the framework.
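The sketch below illustrates only the stage-overlap idea behind asynchronous PPO: a producer thread generates rollouts for the next batch while a consumer thread trains on the current one, so generation and training are no longer strictly serialized. The stand-in functions, batch count, and queue depth are assumptions; Pipe-RLHF's delayed inter-batch pipeline for generation and its hierarchical strategy search are not reproduced here.

```python
# Overlapping PPO's generation stage with the training stage across successive batches,
# in the spirit of asynchronous PPO: training on batch k consumes rollouts produced
# under a slightly stale policy while batch k+1 is being generated.
import queue
import threading
import time

NUM_BATCHES = 4
rollout_queue: "queue.Queue[tuple[int, list[str]]]" = queue.Queue(maxsize=2)

def generate(batch_id: int) -> list[str]:
    time.sleep(0.2)                           # stand-in for autoregressive rollout generation
    return [f"rollout-{batch_id}-{i}" for i in range(8)]

def train_step(batch_id: int, rollouts: list[str]) -> None:
    time.sleep(0.1)                           # stand-in for reward/critic inference + PPO update
    print(f"trained on batch {batch_id} ({len(rollouts)} rollouts)")

def generation_worker() -> None:
    for b in range(NUM_BATCHES):
        rollout_queue.put((b, generate(b)))   # generation of batch b+1 overlaps training of b
    rollout_queue.put((-1, []))               # sentinel: no more batches

def training_worker() -> None:
    while True:
        batch_id, rollouts = rollout_queue.get()
        if batch_id < 0:
            break
        train_step(batch_id, rollouts)        # consumes rollouts from a slightly stale policy

if __name__ == "__main__":
    t_gen = threading.Thread(target=generation_worker)
    t_train = threading.Thread(target=training_worker)
    t_gen.start(); t_train.start()
    t_gen.join(); t_train.join()
```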
Software-based multi-core system simulators execute compute-intensive and data-intensive tasks far too slowly, and they also suffer from limited simulation fidelity and inaccurate performance evaluation, which restricts their use in exploring multi-core architecture optimizations. This paper proposes a cycle-accurate hardware-software co-simulator for multi-core systems (CAHSCS). By introducing hardware compute and storage modules into the traditional simulator architecture, CAHSCS effectively improves full-system simulation speed and accuracy and makes performance evaluation more reliable. Experiments with complex real-world workloads show that CAHSCS improves the processing efficiency of large-scale complex data by a factor of 10 and significantly accelerates convergence of the system design.
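As a purely speculative illustration of the co-simulation idea (the class and method names below are hypothetical and not taken from CAHSCS), the sketch dispatches a compute-intensive kernel from the software simulator to a hardware compute model that returns both the result and a cycle cost, which is folded into the simulator's global cycle count.

```python
# Hypothetical co-simulation sketch: compute-intensive kernels are offloaded from the
# software simulator to a hardware compute model that also reports its cycle cost,
# so whole-system timing stays cycle-accurate while the heavy computation is accelerated.
import numpy as np

class HardwareComputeModel:
    """Stand-in for an accelerator-backed compute module with an assumed cycle-cost model."""
    def matmul(self, a: np.ndarray, b: np.ndarray) -> tuple[np.ndarray, int]:
        cycles = a.shape[0] * b.shape[1]          # assumption: one output element per cycle
        return a @ b, cycles

class CoSimulator:
    def __init__(self, hw: HardwareComputeModel):
        self.hw = hw
        self.cycle = 0

    def run_kernel(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
        result, hw_cycles = self.hw.matmul(a, b)  # offload instead of simulating in software
        self.cycle += hw_cycles                   # account for the hardware module's latency
        return result

if __name__ == "__main__":
    sim = CoSimulator(HardwareComputeModel())
    out = sim.run_kernel(np.ones((64, 64)), np.ones((64, 64)))
    print("cycles:", sim.cycle, "checksum:", out.sum())
```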
Funding: This project was supported by NSFC Projects (60474047, 60334010), the Guangdong Province Natural Science Foundation of China (31406), and the China Postdoctoral Science Foundation (20060390725).