TY - JOUR
T1 - CLlight
T2 - Enhancing representation of multi-agent reinforcement learning with contrastive learning for cooperative traffic signal control
AU - Fu, Xiang
AU - Ren, Yilong
AU - Jiang, Han
AU - Lv, Jiancheng
AU - Cui, Zhiyong
AU - Yu, Haiyang
N1 - Publisher Copyright:
© 2024 Elsevier Ltd
PY - 2025/3/1
Y1 - 2025/3/1
N2 - Multi-agent reinforcement learning has shown great potential for coordinating multi-intersection traffic signals due to its powerful adaptive capabilities, treating each intersection as an agent. However, in the real world, different intersections possess distinguishing characteristics such as unique vehicle distributions and traffic patterns. Most existing methods directly add neighboring intersection states to local intersections and optimize the cooperative policy network based on synthesized global features. This indirect optimization approach makes it difficult to thoroughly explore the mutual interactions among different intersection agents, preventing agents from truly learning features with cooperative awareness. To resolve these challenges, we introduce contrastive learning as a representation task into a multi-intersection traffic signal control approach, named CLlight, with two-stage policy network updating. In the first stage, we utilize policy-based or actor-critic-based reinforcement learning methods such as A2C, SAC, and PPO to train policy networks with certain representational capabilities. In the second stage, by extracting pre- and post-masked features and reconstructing the post-masked features, the agents are encouraged to learn the similarities and differences between different intersection policies, which in turn enhances the cooperative and individual representation capabilities of the policy network. To the best of our knowledge, this is the first application of contrastive learning in the field of traffic signal control. Experimental results on synthetic and real-world datasets demonstrate superior average travel time and average waiting time compared to other state-of-the-art traffic signal control methods under various scenarios.
AB - Multi-agent reinforcement learning has shown great potential for coordinating multi-intersection traffic signals due to its powerful adaptive capabilities, treating each intersection as an agent. However, in the real world, different intersections possess distinguishing characteristics such as unique vehicle distributions and traffic patterns. Most existing methods directly add neighboring intersection states to local intersections and optimize the cooperative policy network based on synthesized global features. This indirect optimization approach makes it difficult to thoroughly explore the mutual interactions among different intersection agents, preventing agents from truly learning features with cooperative awareness. To resolve these challenges, we introduce contrastive learning as a representation task into a multi-intersection traffic signal control approach, named CLlight, with two-stage policy network updating. In the first stage, we utilize policy-based or actor-critic-based reinforcement learning methods such as A2C, SAC, and PPO to train policy networks with certain representational capabilities. In the second stage, by extracting pre- and post-masked features and reconstructing the post-masked features, the agents are encouraged to learn the similarities and differences between different intersection policies, which in turn enhances the cooperative and individual representation capabilities of the policy network. To the best of our knowledge, this is the first application of contrastive learning in the field of traffic signal control. Experimental results on synthetic and real-world datasets demonstrate superior average travel time and average waiting time compared to other state-of-the-art traffic signal control methods under various scenarios.
KW - Contrastive learning
KW - Masking strategy
KW - Multi-agent reinforcement learning
KW - Traffic signal control
UR - https://www.scopus.com/pages/publications/85208107920
U2 - 10.1016/j.eswa.2024.125578
DO - 10.1016/j.eswa.2024.125578
M3 - Article
AN - SCOPUS:85208107920
SN - 0957-4174
VL - 262
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 125578
ER -