Funding: Supported by the Defense Science and Technology Key Laboratory Fund of Luoyang Electro-optical Equipment Institute, Aviation Industry Corporation of China (6142504200108).
Abstract: The trajectory optimization of an unpowered reentry vehicle via artificial emotion memory optimization (AEMO) is discussed. First, reentry dynamics are established based on multiple constraints, and parameterized control variables with finite dimensions are designed. If a constraint is not satisfied, a distance measure and an adaptive penalty function are used to handle the violation. Second, AEMO is introduced to solve the trajectory optimization problem. Based on theories of biology and cognition, trial solutions based on emotional memory are established. Three search strategies are designed to realize the random search of trial solutions and to avoid becoming trapped in a local minimum. The states of the trial solutions are determined according to the rules of memory enhancement and forgetting. As the iterations proceed, trial solutions of poor quality are gradually forgotten. The number of trial solutions therefore decreases, and the convergence of the algorithm is accelerated. Finally, a numerical simulation is conducted, and the results demonstrate that the path and terminal constraints are satisfied and that the method achieves satisfactory performance.
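The enhancement-and-forgetting idea in this abstract can be illustrated with a small sketch. This is not the authors' AEMO: the toy cost function `sphere`, the single Gaussian perturbation (the abstract mentions three search strategies, which are not specified here), and the parameters `gain`, `decay`, and `forget_below` are all assumptions chosen for illustration. It only shows how a population can shrink as weak trial solutions are forgotten.

```python
import random

def sphere(x):
    """Toy cost function standing in for the trajectory cost (assumption)."""
    return sum(v * v for v in x)

def forgetting_search(cost, dim=2, pop_size=20, iters=50, seed=0,
                      gain=0.2, decay=0.1, forget_below=0.3):
    """Random search in which each trial solution has a memory strength.

    All parameter names and values are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    # Each trial solution carries a position, its cost, and a memory strength.
    pop = []
    for _ in range(pop_size):
        x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        pop.append({"x": x, "cost": cost(x), "memory": 1.0})
    sizes = [len(pop)]
    for _ in range(iters):
        for s in pop:
            # One random perturbation serves as the only search strategy here.
            cand = [v + rng.gauss(0.0, 0.5) for v in s["x"]]
            c = cost(cand)
            if c < s["cost"]:
                s["x"], s["cost"] = cand, c
                s["memory"] = min(1.0, s["memory"] + gain)  # enhancement
            else:
                s["memory"] -= decay  # forgetting
        best = min(pop, key=lambda s: s["cost"])
        # Drop weakly remembered solutions, but always keep the current best.
        pop = [s for s in pop if s is best or s["memory"] >= forget_below]
        sizes.append(len(pop))
    best = min(pop, key=lambda s: s["cost"])
    return best, sizes

best, sizes = forgetting_search(sphere)
```

Here `sizes` records the population size after each iteration; it can only stay equal or shrink, which mirrors the abstract's claim that fewer trial solutions remain as iterations proceed.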
Funding: Supported by the National Natural Science Foundation of China (U21A20519).
Abstract: As a large amount of data is increasingly generated by edge devices, such as smart homes, mobile phones, and wearable devices, it becomes crucial for many applications to deploy machine learning models across edge devices. The execution speed of the deployed model is a key element in ensuring service quality. For highly heterogeneous edge deployment scenarios, deep learning compiling is a novel approach that aims to solve this problem. It defines models using certain DSLs and generates efficient code implementations on different hardware devices. However, two aspects have not yet been thoroughly investigated. The first is the optimization of memory-intensive operations, and the second is the heterogeneity of the deployment target. To that end, in this work, we propose a system solution that optimizes memory-intensive operations, optimizes the subgraph distribution, and enables the compiling and deployment of DNN models on multiple targets. The evaluation results show the performance of our proposed system.
Funding: Project (2008AA01A201) supported by the National High-tech Research and Development Program of China; Projects (60833004, 60633050) supported by the National Natural Science Foundation of China.
Abstract: Developing parallel applications on heterogeneous processors faces the challenge of the "memory wall", caused by the limited capacity of local storage, limited bandwidth, and long latency of memory access. To address this problem, a parallelization approach with six memory optimization schemes for conjugate gradient (CG) was proposed, four of which target all kinds of sparse matrix-vector multiplication (SPMV) operations. On the IBM QS20, the parallelization approach reaches speedups of up to 21 and 133 times for sizes A and B, respectively, compared with a single Power Processor Element. Finally, it is concluded that the peak bandwidth of memory access on the Cell BE can be obtained in SPMV, that simple computation is more efficient on heterogeneous processors, and that loop unrolling can hide local storage access latency when executing scalar operations on SIMD cores.
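For readers unfamiliar with the SPMV operation at the center of this abstract, the following is a minimal reference sketch in plain Python. It assumes the common compressed sparse row (CSR) layout; the abstract does not say which storage format the authors used, and the function name `csr_spmv` and the example matrix are illustrative only. None of the paper's Cell BE memory optimizations are reproduced here.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries of A, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # The access x[col_idx[k]] is indirect and irregular, which is why
        # SPMV tends to be limited by memory bandwidth rather than arithmetic.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y

# Example matrix (assumed for illustration):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_idx, row_ptr, x)  # y is [7.0, 6.0, 17.0]
```

The inner loop performs one multiply-add per pair of loads, so the ratio of computation to memory traffic is low; this is the property that makes memory optimization the dominant concern for SPMV.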