Funding: supported by the National Natural Science Foundation of China (No. 62101601).
Abstract: Peer-to-peer computation offloading is a promising approach that enables resource-limited Internet of Things (IoT) devices to offload their computation-intensive tasks to idle peer devices in proximity. Unlike dedicated servers, the spare computation resources offered by peer devices are random and intermittent, which affects the offloading performance. The mutual interference caused by multiple simultaneous offloading requestors sharing the same wireless channel further complicates the offloading decisions. In this work, we investigate the opportunistic peer-to-peer task offloading problem by jointly considering stochastic task arrivals, dynamic inter-user interference, and the opportunistic availability of peer devices. Each requestor decides both its local computation frequency and its offloading transmission power to minimize its own expected long-term task-completion cost, which accounts for energy consumption, task delay, and task loss due to buffer overflow. The dynamic decision process among multiple requestors is formulated as a stochastic game. By constructing post-decision states, a decentralized online offloading algorithm is proposed in which each requestor, as an independent learning agent, learns to approach the optimal strategy from its local observations. Simulation results under different system parameter configurations demonstrate that the proposed online algorithm outperforms several existing algorithms, especially in scenarios with a large task arrival probability or a small helper availability probability.
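The abstract does not spell out the learning rule itself; purely as a hedged illustration of the post-decision-state idea it mentions, the sketch below shows a toy single-requestor loop in which the queue length after the agent's action (but before the random task arrival) serves as the post-decision state whose value is learned from local observations. The action sets, cost terms, and probabilities are invented placeholders, not the authors' model.

```python
import numpy as np

# Toy sketch of post-decision-state (PDS) value learning for one offloading requestor.
# The queue/cost model, action sets, and probabilities are illustrative placeholders.

N_QUEUE = 10                          # task buffer capacity (tasks)
FREQS = [0.0, 0.5, 1.0]               # candidate local CPU frequencies (normalized)
POWERS = [0.0, 0.1, 0.2]              # candidate offloading transmit powers (W)
ACTIONS = [(f, p) for f in FREQS for p in POWERS]

V = np.zeros(N_QUEUE + 1)             # value of each post-decision queue length
alpha, gamma = 0.1, 0.9               # learning rate, per-slot discount factor
p_arrival, p_helper = 0.6, 0.5        # task arrival / helper availability probabilities

def act(queue_len, action, helper_up):
    """Immediate cost and post-decision queue length under a made-up cost model."""
    f, p = action
    served = int(f > 0) + int(p > 0 and helper_up)   # tasks finished this slot
    post = max(queue_len - served, 0)                # post-decision state
    cost = 0.5 * f ** 2 + p + 0.1 * post             # energy + tx power + delay proxy
    return cost, post

rng = np.random.default_rng(0)
q, prev_post, prev_loss = 0, None, 0.0
for t in range(20_000):
    helper_up = rng.random() < p_helper              # opportunistic helper availability
    # greedy action: immediate cost plus discounted value of the post-decision state
    cost, post = min((act(q, a, helper_up) for a in ACTIONS),
                     key=lambda cp: cp[0] + gamma * V[cp[1]])
    if prev_post is not None:
        # PDS target: overflow penalty observed after the previous action, plus the
        # one-step-lookahead value of the current pre-decision state
        V[prev_post] += alpha * (prev_loss + cost + gamma * V[post] - V[prev_post])
    arrival = int(rng.random() < p_arrival)          # stochastic task arrival
    prev_loss = 1.0 if post + arrival > N_QUEUE else 0.0   # buffer-overflow (task loss)
    q = min(post + arrival, N_QUEUE)
    prev_post = post
```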
Funding: the National Natural Science Foundation of China under Grants No. 61272506 and No. 61170014, the Foundation of Key Program of MOE of China under Grant No. 311007, and the Natural Science Foundation of Beijing under Grant No. 4102041.
Abstract: In this study, aiming at the characteristics of randomness and dynamics in Wearable Audio-oriented BodyNets (WA-BodyNets), stochastic differential game theory is applied to the problem of transmitted power control in consumer electronic devices. First, a stochastic differential game model is proposed for non-cooperative decentralized uplink power control with a wisdom regulation factor over WA-BodyNets with a one-hop star topology. This model aims to minimize the cost associated with a novel payoff function of a player, for which two cost functions are defined: functions of inherent power radiation and accumulated power radiation damage. Second, the feedback Nash equilibrium solution of the proposed model and the constraint imposed by the player's Quality of Service (QoS) requirement, based on the SIR threshold, are derived by solving the Fleming-Bellman-Isaacs partial differential equations. Furthermore, the Markov property of the optimal feedback strategies in this model is verified. The simulation results show that the proposed game model is effective and feasible for controlling the transmitted power of WA-BodyNets.
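For orientation only, the coupled Fleming-Bellman-Isaacs (HJB-type) system that characterizes a feedback Nash equilibrium of an N-player stochastic differential game can be sketched in its generic infinite-horizon discounted form as follows; the concrete payoff functions, radiation-damage dynamics, and SIR-based QoS constraint are specific to the paper and are not reproduced here.

```latex
% Generic coupled HJB (Fleming-Bellman-Isaacs) system for a feedback Nash equilibrium
% of an N-player stochastic differential game with state dynamics
% dx_t = f(x_t, u) dt + \sigma(x_t) dW_t and discount rate \rho.
\[
  \rho V_i(x) \;=\; \min_{u_i}\Big\{
      L_i\big(x, u_i, u_{-i}^{*}(x)\big)
      + f\big(x, u_i, u_{-i}^{*}(x)\big)^{\!\top} \nabla V_i(x)
      + \tfrac{1}{2}\,\mathrm{tr}\!\big[\sigma(x)\sigma(x)^{\!\top} \nabla^{2} V_i(x)\big]
  \Big\}, \qquad i = 1,\dots,N,
\]
\[
  u_i^{*}(x) \;\in\; \arg\min_{u_i}\Big\{
      L_i\big(x, u_i, u_{-i}^{*}(x)\big)
      + f\big(x, u_i, u_{-i}^{*}(x)\big)^{\!\top} \nabla V_i(x)
  \Big\}.
\]
```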
Funding: supported by the Doctoral Foundation of the University of Jinan (XBS1213) and the National Natural Science Foundation of China (11101242).
Abstract: A necessary maximum principle is given for nonzero-sum stochastic differential games with random jumps. The result is applied to solve the H2/H∞ control problem of stochastic systems with random jumps. A necessary and sufficient condition for the existence of a unique solution to the H2/H∞ control problem is derived. The resulting solution is given by the solution of an uncontrolled forward-backward stochastic differential equation with random jumps.
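As background only, a forward-backward stochastic differential equation with random jumps has the generic structure sketched below; the specific coefficients of the uncontrolled FBSDE that yields the H2/H∞ solution are given in the paper itself.

```latex
% Generic form of a forward-backward SDE with random (Poisson) jumps, shown only to
% fix notation: W is a Brownian motion and \tilde N a compensated Poisson random measure.
\[
\begin{aligned}
  dx_t  &= b(t, x_t)\,dt + \sigma(t, x_t)\,dW_t
           + \int_{E} c(t, x_{t^-}, e)\,\tilde N(dt, de), \\
  -dy_t &= g\big(t, x_t, y_t, z_t, r_t(\cdot)\big)\,dt - z_t\,dW_t
           - \int_{E} r_t(e)\,\tilde N(dt, de), \\
  x_0   &= a, \qquad y_T = \Phi(x_T).
\end{aligned}
\]
```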
Funding: supported by the Major Science and Technology Programs in Henan Province (No. 241100210100), the Project of Science and Technology in Henan Province (No. 242102211068, No. 232102210078), the Key Field Special Project of Guangdong Province (No. 2021ZDZX1098), the China University Research Innovation Fund (No. 2021FNB3001, No. 2022IT020), and the Shenzhen Science and Technology Innovation Commission Stable Support Plan (No. 20231128083944001).
Abstract: Existing research on cyber attack-defense analysis has typically adopted stochastic game theory to model and solve the problem, but modeling under the assumption of complete rationality ignores the information opacity of practical attack and defense scenarios, so the models and methods lack accuracy. To address this problem, we investigate network defense policy methods under bounded-rationality constraints and propose a network defense policy selection algorithm based on deep reinforcement learning. Using graph-theoretical methods, we transform the decision-making problem into a path optimization problem and use a service-node-based compression method to map the network state. On this basis, we improve the A3C algorithm and design the Defense-A3C policy selection algorithm with online learning capability. The experimental results show that the proposed model and method stably converge to a better network state after training, faster and more stably than the original A3C algorithm. Comparisons with existing typical approaches verify the advancement of Defense-A3C.
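For reference, the sketch below recalls the generic per-worker A3C objective (advantage-weighted policy gradient, value regression, and an entropy bonus) that Defense-A3C builds on; the network architecture, state compression, action space, and reward used here are hypothetical placeholders, not the paper's design.

```python
import torch
import torch.nn as nn

# Generic A3C-style actor-critic and per-worker loss; dimensions and hyperparameters
# are illustrative placeholders, not the Defense-A3C configuration.

class ActorCritic(nn.Module):
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_actions)    # action logits
        self.value = nn.Linear(hidden, 1)             # state-value estimate

    def forward(self, s):
        h = self.body(s)
        return self.policy(h), self.value(h).squeeze(-1)

def a3c_loss(model, states, actions, returns, value_coef=0.5, entropy_coef=0.01):
    """states: [T, state_dim]; actions: [T]; returns: [T] n-step discounted returns."""
    logits, values = model(states)
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()            # advantage baseline
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()     # critic regression term
    entropy = dist.entropy().mean()                   # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy

# Each asynchronous worker rolls out a short trajectory, computes n-step returns,
# and applies the resulting gradients to the shared parameters.
model = ActorCritic(state_dim=16, n_actions=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
s, a, R = torch.randn(8, 16), torch.randint(0, 4, (8,)), torch.randn(8)
loss = a3c_loss(model, s, a, R)
opt.zero_grad(); loss.backward(); opt.step()
```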
Funding: supported in part by the Natural Science Foundation of China under Grants 61801243, 61671144, and 61971238; by the China Postdoctoral Science Foundation under Grant 2019M651914; by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 18KJB510026; and by the Foundation of Nanjing University of Posts and Telecommunications under Grant NY218124.
Abstract: The application of unmanned aerial vehicle (UAV)-mounted base stations is emerging as an effective solution for providing wireless communication service to a target region containing smart objects (SOs) in the internet of things (IoT). This paper investigates the efficient deployment of multiple UAVs for IoT communication in a dynamic environment. We first define a measurement of UAV-to-SO communication performance in the target region, which serves as the optimization objective. The state of an SO is active when it needs to transmit or receive data, and silent otherwise; switching between the two states occurs with a certain probability, which results in a dynamic communication environment. In this dynamic environment, the active states of SOs cannot be known by the UAVs in advance, and only neighbouring UAVs can communicate with each other. To overcome these challenges in the deployment, we leverage a game-theoretic learning approach to solve the position selection problem. The problem is modeled as a stochastic game, which is proven to be an exact potential game in which the best Nash equilibria (NE) exist. Furthermore, a distributed position optimization algorithm is proposed that can converge to a pure-strategy NE. Numerical results demonstrate the excellent performance of the proposed algorithm.
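To make the role of the exact-potential-game property concrete, the toy sketch below shows asynchronous best-response updates in a small congestion-style game: each update strictly increases the potential, so the dynamics terminate at a pure-strategy NE. The utility is a made-up coverage-style payoff, not the paper's UAV-to-SO performance measure, and the paper's distributed algorithm itself is not reproduced here.

```python
import numpy as np

# Toy exact potential game: players pick positions and prefer ones shared with fewer
# others. Asynchronous best responses strictly increase the potential, so they stop
# at a pure-strategy Nash equilibrium after finitely many updates.

rng = np.random.default_rng(0)
N_PLAYERS, N_POSITIONS = 4, 6

def utility(i, positions):
    """Player i's payoff: 1 / (number of players sharing its position)."""
    occupancy = sum(int(positions[j] == positions[i]) for j in range(N_PLAYERS))
    return 1.0 / occupancy

def potential(positions):
    """Rosenthal potential for this congestion game: sum over positions of 1 + 1/2 + ..."""
    counts = np.bincount(positions, minlength=N_POSITIONS)
    return sum(sum(1.0 / k for k in range(1, c + 1)) for c in counts if c > 0)

positions = rng.integers(0, N_POSITIONS, size=N_PLAYERS)
changed = True
while changed:                               # asynchronous best-response dynamics
    changed = False
    for i in range(N_PLAYERS):
        best_pos, best_u = positions[i], utility(i, positions)
        for p in range(N_POSITIONS):
            trial = positions.copy(); trial[i] = p
            if utility(i, trial) > best_u + 1e-12:
                best_pos, best_u = p, utility(i, trial)
        if best_pos != positions[i]:
            positions[i] = best_pos          # potential strictly increases here
            changed = True

print("pure-strategy NE positions:", positions, "potential:", potential(positions))
```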