Abstract
To meet the requirements that live-virtual-constructive (LVC) tactical confrontation (TC) places on the virtual entity (VE) decision model, namely graded combat capability, diversified actions, real-time decision-making, and generalization to the enemy, the confrontation process is modeled as a zero-sum stochastic game (ZSG). Introducing the theory of the dynamic relative power potential field addresses the problem of reward sparsity in the model, and reward shaping addresses the problem of credit assignment between agents. Based on the idea of meta-learning, an extensible multi-agent deep reinforcement learning (EMADRL) framework and solution method are proposed to improve the effectiveness and efficiency of solving the model. Experiments show that the model meets these requirements well and that the algorithm learns efficiently.
Funding
This work was supported by the Military Scientific Research Project (41405030302, 41401020301).
Author biographies
GAO Ang was born in 1988. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. He is a Ph.D. candidate in Army Academy of Armored Forces. His research interest is intelligent decision-making for computer-generated forces based on multi-agent deep reinforcement learning. E-mail: 15689783388@163.com
GUO Qisheng was born in 1962. He received his Ph.D. degree in science of military equipment from Tsinghua University. His research interests are equipment requirement demonstration and equipment test. E-mail: 236211566@qq.com
DONG Zhiming (corresponding author) was born in 1977. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. His research interests are equipment requirement demonstration and equipment test. E-mail: dong_zhiming@163.com
TANG Zaijiang was born in 1976. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. His research interest is battle simulation. E-mail: tangzaijiang@sina.com
ZHANG Ziwei was born in 1986. He received his Ph.D. degree in science of military equipment from Army Academy of Armored Forces. He is a Ph.D. candidate in Army Academy of Armored Forces. His research interest is equipment test evaluation. E-mail: gaoang370829@sohu.com
FENG Qiqi was born in 1992. She received her M.S. degree in science of military equipment from Army Academy of Armored Forces. She is pursuing her Ph.D. degree in Army Academy of Armored Forces. Her research interest is real-time research on live virtual constructive. E-mail: 594472717@qq.com