
Robust Neural Contextual Bandit against Adversarial Corruptions

  • Yunzhe Qi
  • Yikun Ban
  • Arindam Banerjee
  • Jingrui He
  • University of Illinois at Urbana-Champaign

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

Contextual bandit algorithms aim to identify the optimal arm, i.e., the one with the highest reward among a set of candidates, based on the accessible contextual information. Among these algorithms, neural contextual bandit methods have generally shown superior performance over linear and kernel ones, owing to the representation power of neural networks. However, like other neural network applications, neural bandit algorithms can be vulnerable to adversarial attacks or corruptions on the received labels (i.e., arm rewards), which can lead to unexpected performance degradation without proper treatment. It is therefore necessary to improve the robustness of neural bandit models against potential reward corruptions. In this work, we propose a neural contextual bandit algorithm named R-NeuralUCB, which uses a novel context-aware Gradient Descent (GD) training strategy to improve robustness against adversarial reward corruptions. Under over-parameterized neural network settings, we provide a regret analysis for R-NeuralUCB that quantifies the impact of reward corruption, without the arm separateness assumption commonly adopted in existing neural bandit works. We also conduct experiments against baselines on real data sets under different scenarios to demonstrate the effectiveness of the proposed R-NeuralUCB.
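To make the problem setting in the abstract concrete, the following is a minimal sketch of a contextual bandit loop with adversarially corrupted rewards. It is not R-NeuralUCB: it uses a plain linear UCB learner (a LinUCB-style baseline) as a stand-in, and the function name `run_linucb`, the `budget` parameter, the synthetic reward model, and the attack rule (penalize the observed reward whenever the learner picks the truly best arm, until the budget is spent) are all assumptions made for illustration only.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def run_linucb(rounds=500, n_arms=4, dim=3, alpha=1.0, budget=0.0, seed=0):
    """Run a linear UCB baseline and return its cumulative pseudo-regret.

    Regret is measured on the clean expected rewards. `budget` is a
    hypothetical cap on the total corruption the adversary may inject.
    """
    rng = random.Random(seed)
    # Hidden parameter of the (assumed) linear reward model.
    theta_star = unit([rng.uniform(0, 1) for _ in range(dim)])
    # A_inv tracks the inverse of (I + sum of x x^T); b accumulates r * x.
    A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    b = [0.0] * dim
    regret, spent = 0.0, 0.0

    for _ in range(rounds):
        # One unit-norm context vector per candidate arm.
        contexts = [unit([rng.uniform(0, 1) for _ in range(dim)])
                    for _ in range(n_arms)]
        theta_hat = matvec(A_inv, b)

        def ucb(x):
            width = math.sqrt(max(0.0, dot(x, matvec(A_inv, x))))
            return dot(theta_hat, x) + alpha * width

        chosen = max(contexts, key=ucb)
        means = [dot(theta_star, x) for x in contexts]
        chosen_mean = dot(theta_star, chosen)
        regret += max(means) - chosen_mean

        observed = chosen_mean + rng.gauss(0, 0.05)
        # Assumed attack: penalize the truly best arm while budget remains.
        if chosen_mean == max(means) and spent < budget:
            c = min(1.0, budget - spent)
            observed -= c
            spent += c

        # Sherman-Morrison rank-one update of A_inv, then update b.
        Ax = matvec(A_inv, chosen)
        denom = 1.0 + dot(chosen, Ax)
        for i in range(dim):
            for j in range(dim):
                A_inv[i][j] -= Ax[i] * Ax[j] / denom
            b[i] += observed * chosen[i]

    return regret
```

Calling `run_linucb(budget=0.0)` and `run_linucb(budget=50.0)` with the same seed gives one way to compare the regret of this non-robust baseline with and without corruption; the numbers depend entirely on the synthetic setup assumed here and say nothing about the results reported in the paper.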

Original language: English
Journal: Advances in Neural Information Processing Systems
Volume: 37
Publication status: Published - 2024
Externally published
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 2024 → 15 Dec 2024
