TY - GEN
T1 - Service function chaining in NFV-enabled edge networks with natural actor-critic deep reinforcement learning
AU - Wang, Ruijie
AU - Li, Junhuai
AU - Wang, Kan
AU - Liu, Xuan
AU - Li, Xuan
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/7/28
Y1 - 2021/7/28
N2 - In this paper, by exploiting the natural policy gradient-based actor-critic framework, we study service function chaining in network function virtualization (NFV)-enabled edge networks. First, a long-run function chaining problem is formulated to minimize the end-to-end service latency, involving not only the server and wired link resources but also the radio resources in wireless links; the Markov decision process (MDP) model is further leveraged to capture the dynamics in both server and radio resources, whereby the transition probability over the state space is explicitly derived. Second, a natural actor-critic framework is presented, which utilizes the natural policy gradient to train the deep neural network (DNN), thereby avoiding being trapped in a local optimum. In particular, to overcome the high-dimensionality issue in the action space, we further resort to an integer linear programming (ILP) formulation, reducing the space size from cubic to linear. Finally, simulations are conducted to demonstrate the effectiveness of the proposed approach, revealing that the latency minimization benefits from learning not only the service function chain (SFC) routing across edge servers but also the radio resource allocation in wireless links.
AB - In this paper, by exploiting the natural policy gradient-based actor-critic framework, we study service function chaining in network function virtualization (NFV)-enabled edge networks. First, a long-run function chaining problem is formulated to minimize the end-to-end service latency, involving not only the server and wired link resources but also the radio resources in wireless links; the Markov decision process (MDP) model is further leveraged to capture the dynamics in both server and radio resources, whereby the transition probability over the state space is explicitly derived. Second, a natural actor-critic framework is presented, which utilizes the natural policy gradient to train the deep neural network (DNN), thereby avoiding being trapped in a local optimum. In particular, to overcome the high-dimensionality issue in the action space, we further resort to an integer linear programming (ILP) formulation, reducing the space size from cubic to linear. Finally, simulations are conducted to demonstrate the effectiveness of the proposed approach, revealing that the latency minimization benefits from learning not only the service function chain (SFC) routing across edge servers but also the radio resource allocation in wireless links.
KW - Actor-critic
KW - Edge networks
KW - Natural policy gradient
KW - Radio resource
KW - Server resource
UR - https://www.scopus.com/pages/publications/85119370607
U2 - 10.1109/ICCC52777.2021.9580255
DO - 10.1109/ICCC52777.2021.9580255
M3 - Conference contribution
AN - SCOPUS:85119370607
T3 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
SP - 1095
EP - 1100
BT - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE/CIC International Conference on Communications in China, ICCC 2021
Y2 - 28 July 2021 through 30 July 2021
ER -