Graph reinforcement learning with relational priors for predictive power allocation

Research output: Contribution to journal › Article › peer-review

Abstract

Deep reinforcement learning for resource allocation has been investigated extensively owing to its ability to handle model-free and end-to-end problems. However, its slow convergence and high time complexity during online training hinder its practical use in dynamic wireless systems. To reduce the training complexity, we resort to graph reinforcement learning to leverage two kinds of relational priors inherent in many wireless communication problems: topology information and permutation properties. To harness the two priors, we first conceive a method to convert the state matrix into a state graph, and then propose a graph deep deterministic policy gradient (DDPG) algorithm with the desired permutation property. To demonstrate how to apply the proposed methods, we consider predictive power allocation, a representative application of reinforcement learning, which minimizes the energy consumption while ensuring the quality of service of each user requesting video streaming. We derive the per-time-step time complexity of training the proposed graph DDPG algorithm and the fully-connected neural network-based DDPG algorithm. Simulations show that the graph DDPG algorithm converges much faster and requires much lower time and space complexity than existing DDPG algorithms to achieve the same learning performance.
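
The permutation property mentioned in the abstract can be illustrated with a generic shared-weight layer. The sketch below is not the paper's graph DDPG; the abstract does not specify the architecture. It only assumes a state with one feature row per user, and the names `equivariant_layer`, `w_self`, `w_agg` and the sizes are hypothetical. It shows that reordering the users reorders the output rows in the same way, and that the number of weights does not depend on the number of users.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def equivariant_layer(x, w_self, w_agg):
    """Generic permutation-equivariant layer (illustrative assumption only).

    Every row is transformed by the same weights (w_self) and receives the
    same aggregated term (mean over all rows times w_agg), followed by ReLU.
    """
    n = len(x)
    mean_row = [sum(col) / n for col in zip(*x)]
    own = matmul(x, w_self)
    agg = matmul([mean_row], w_agg)[0]
    return [[max(0.0, o + a) for o, a in zip(row, agg)] for row in own]


random.seed(0)
num_users, in_dim, out_dim = 4, 3, 2  # hypothetical sizes


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


x = rand_matrix(num_users, in_dim)    # one feature row per user
w_self = rand_matrix(in_dim, out_dim)
w_agg = rand_matrix(in_dim, out_dim)

perm = [2, 0, 3, 1]                   # an arbitrary reordering of the users
y = equivariant_layer(x, w_self, w_agg)
y_perm = equivariant_layer([x[i] for i in perm], w_self, w_agg)

# Row k of y_perm should match row perm[k] of y, up to floating-point rounding.
max_gap = max(
    abs(a - b)
    for i, row in zip(perm, y_perm)
    for a, b in zip(y[i], row)
)
num_weights = 2 * in_dim * out_dim    # independent of num_users
print(max_gap, num_weights)
```

A fully-connected layer acting on the flattened state would instead need a weight count that grows with the number of users, which is one intuition for why exploiting the permutation prior can reduce training cost.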

Original language: English
Article number: 122302
Journal: Science China Information Sciences
Volume: 68
Issue number: 2
State: Published - Feb 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • graph neural network
  • reinforcement learning
  • relational priors
  • resource allocation
