Understanding Extortion and Fairness in Iterated Prisoner's Dilemma Through Actor-Critic Learning Dynamics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The Iterated Prisoner's Dilemma (IPD) is widely used to investigate the emergence and stability of cooperative behavior in both social and biological systems. Through repeated interactions, cooperation can be cultivated, enhancing long-term mutual benefit, a principle fundamental to direct reciprocity. However, this mechanism also permits extortionate Zero-Determinant (ZD) strategies to exploit Always Cooperate (ALLC) and other cooperative strategies. In this study, we examine the learning dynamics of actor-critic agents when confronted with extortionate ZD strategies in the IPD. We analyze the learning process of actor-critic agents in both stochastic and deterministic settings, exploring the conditions under which cooperation can emerge and persist in the presence of extortionate ZD strategies. Furthermore, we scrutinize the balance between fairness and exploitation, examining how extortionate ZD strategies can maximize their rewards without destabilizing cooperative equilibria. Our results offer insights into reinforcement learning dynamics in the context of the IPD and illuminate the interplay between evolutionary game theory and reinforcement learning in understanding the emergence of cooperation, paving the way for developing resilient, cooperative multi-agent systems.
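To make the notion of an extortionate ZD strategy concrete, the following minimal sketch builds one using the standard Press and Dyson construction and checks the relation it enforces, s_X - P = chi * (s_Y - P), against ALLC. It is an illustration only, not the authors' code: the payoff values (T, R, P, S) = (5, 3, 1, 0), the extortion factor chi, the scale phi, and the function names `extortionate_zd` and `limiting_payoffs` are all assumptions chosen for the example.

```python
# Illustrative sketch; payoffs and parameters are assumptions, not taken from the paper.
T, R, P, S = 5.0, 3.0, 1.0, 0.0  # conventional Prisoner's Dilemma payoffs (assumed)


def extortionate_zd(chi, phi):
    """Return X's cooperation probabilities after outcomes (CC, CD, DC, DD).

    Outcomes are written from X's viewpoint (X's move first). The formulas
    follow the Press-Dyson extortionate construction with baseline payoff P.
    """
    p = [
        1.0 - phi * (chi - 1.0) * (R - P),
        1.0 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        0.0,
    ]
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("phi is out of the feasible range for this chi")
    return p


def limiting_payoffs(p, q, steps=5000):
    """Payoffs (s_X, s_Y) under the limiting state distribution.

    p and q are memory-one strategies, each indexed from its owner's
    viewpoint. The 4-state chain over (CC, CD, DC, DD) is iterated from
    mutual cooperation for a fixed number of steps.
    """
    q_view = [q[0], q[2], q[1], q[3]]  # Y sees X's CD as DC and vice versa
    dist = [1.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        nxt = [0.0, 0.0, 0.0, 0.0]
        for s in range(4):
            a, b = p[s], q_view[s]  # probabilities that X and Y cooperate
            nxt[0] += dist[s] * a * b
            nxt[1] += dist[s] * a * (1.0 - b)
            nxt[2] += dist[s] * (1.0 - a) * b
            nxt[3] += dist[s] * (1.0 - a) * (1.0 - b)
        dist = nxt
    s_x = sum(d * u for d, u in zip(dist, (R, S, T, P)))
    s_y = sum(d * u for d, u in zip(dist, (R, T, S, P)))
    return s_x, s_y


if __name__ == "__main__":
    chi = 3.0
    zd = extortionate_zd(chi, phi=1.0 / 26.0)
    allc = [1.0, 1.0, 1.0, 1.0]
    s_x, s_y = limiting_payoffs(zd, allc)
    print(zd, s_x, s_y, (s_x - P) / (s_y - P))
```

With these assumed values the ZD player's surplus over the mutual-defection payoff P is chi times the co-player's surplus. Because the relation holds for any co-player strategy, a learning agent can raise its own payoff only by cooperating more, which raises the extortioner's payoff even further.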

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 8410-8415
Number of pages: 6
ISBN (Electronic): 9789887581611
Publication status: Published - 2025
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 to 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 to 30/07/25
