TY - GEN
T1 - Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning
AU - Cai, Tianchi
AU - Jiang, Jiyan
AU - Zhang, Wenpeng
AU - Zhou, Shiji
AU - Song, Xierui
AU - Yu, Li
AU - Gu, Lihong
AU - Zeng, Xiaodong
AU - Gu, Jinjie
AU - Zhang, Guannan
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/2/27
Y1 - 2023/2/27
N2 - We study the budget allocation problem in online marketing campaigns that utilize previously collected offline data. We first discuss the long-term effect of optimizing marketing budget allocation decisions in the offline setting. To address this challenge, we propose a novel game-theoretic offline value-based reinforcement learning method using mixed policies. Whereas previous methods need to store infinitely many policies, the proposed method needs to store only a constant number, which achieves nearly optimal policy efficiency and makes it practical for industrial use. We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation. Our experiments on a large-scale marketing campaign with tens of millions of users and a budget of more than one billion verify the theoretical results and show that the proposed method outperforms various baseline methods. The proposed method has been successfully deployed to serve all the traffic of this marketing campaign.
AB - We study the budget allocation problem in online marketing campaigns that utilize previously collected offline data. We first discuss the long-term effect of optimizing marketing budget allocation decisions in the offline setting. To address this challenge, we propose a novel game-theoretic offline value-based reinforcement learning method using mixed policies. Whereas previous methods need to store infinitely many policies, the proposed method needs to store only a constant number, which achieves nearly optimal policy efficiency and makes it practical for industrial use. We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation. Our experiments on a large-scale marketing campaign with tens of millions of users and a budget of more than one billion verify the theoretical results and show that the proposed method outperforms various baseline methods. The proposed method has been successfully deployed to serve all the traffic of this marketing campaign.
KW - marketing budget allocation
KW - offline constrained deep RL
UR - https://www.scopus.com/pages/publications/85149651325
U2 - 10.1145/3539597.3570486
DO - 10.1145/3539597.3570486
M3 - Conference contribution
AN - SCOPUS:85149651325
T3 - WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining
SP - 186
EP - 194
BT - WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining
PB - Association for Computing Machinery, Inc
T2 - 16th ACM International Conference on Web Search and Data Mining, WSDM 2023
Y2 - 27 February 2023 through 3 March 2023
ER -