Bi-level Hierarchical Neural Contextual Bandits for Online Recommendation

  • Yunzhe Qi*
  • Yao Zhou*
  • Yikun Ban
  • Allan Stewart
  • Chuanwei Ruan*
  • Jiachuan He
  • Shishir Kumar Prasad
  • Haixun Wang
  • Jingrui He*

*Corresponding author for this work

Research output: Contribution to journal › Comment/debate

Abstract

Contextual bandit algorithms aim to identify the optimal choice among a set of candidate arms, based on their contextual information. Among others, neural contextual bandit algorithms have demonstrated generally superior performance compared to conventional linear and kernel-based methods. Nevertheless, neural methods can be inherently unsuitable for handling a large number of candidate arms due to their high computational cost when performing principled exploration. Motivated by the widespread availability of arm category information (e.g., movie genres, retailer types), we formulate contextual bandits as a bi-level online recommendation problem, and propose a novel neural bandit framework, named H2N-Bandit, which utilizes a bi-level hierarchical neural architecture to mitigate the substantial computational cost found in conventional neural bandit methods. To demonstrate its theoretical effectiveness, we provide regret analysis under general over-parameterization settings, along with a guarantee for category-level recommendation. To illustrate its effectiveness and efficiency, we conduct extensive experiments on multiple real-world data sets, highlighting that H2N-Bandit can significantly reduce the computational cost over existing strong non-linear baselines, while achieving better or comparable performance under online recommendation settings.
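
The bi-level idea in the abstract (choose a category first, then an arm within it) can be illustrated with a minimal sketch. This is not the authors' implementation: a toy linear scorer with a count-based UCB bonus is swapped in for the neural models, and every name here (`select_bilevel`, `ucb_score`, `cat_contexts`, `arms_by_cat`, the weight vectors) is a hypothetical chosen for illustration. The sketch only shows why the number of scored candidates drops from the total number of arms to the number of categories plus the arms in one category.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ucb_score(weights, context, pulls, t, c=1.0):
    """Toy linear reward estimate plus a count-based exploration bonus.

    Assumption: stands in for a neural reward estimate with principled
    exploration; it is not the scoring rule from the paper.
    """
    bonus = c * math.sqrt(math.log(t + 1) / (pulls + 1))
    return dot(weights, context) + bonus


def select_bilevel(cat_contexts, arms_by_cat, cat_weights, arm_weights, pulls, t):
    """Pick a category first, then an arm inside that category only.

    Returns (category, arm, number_of_scores_computed).
    """
    # Level 1: one score per category.
    best_cat = max(
        cat_contexts,
        key=lambda cat: ucb_score(cat_weights, cat_contexts[cat], pulls.get(cat, 0), t),
    )
    # Level 2: score only the arms that belong to the chosen category.
    arms = arms_by_cat[best_cat]
    best_arm = max(
        arms,
        key=lambda arm: ucb_score(arm_weights, arms[arm], pulls.get(arm, 0), t),
    )
    return best_cat, best_arm, len(cat_contexts) + len(arms)


# Hypothetical data: two categories holding five arms in total.
cat_contexts = {"comedy": [1.0, 0.0], "drama": [0.0, 1.0]}
arms_by_cat = {
    "comedy": {"c1": [0.9, 0.1], "c2": [0.5, 0.5], "c3": [0.2, 0.1]},
    "drama": {"d1": [0.1, 0.9], "d2": [0.3, 0.7]},
}
print(select_bilevel(cat_contexts, arms_by_cat, [0.2, 0.8], [0.2, 0.8], pulls={}, t=1))
# ('drama', 'd1', 4)
```

In this toy run only four scores are computed (two categories plus two arms) instead of five for a flat scan. The gap widens as the number of arms per category grows, which is the kind of saving the abstract attributes to the hierarchical design.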

Original language: English
Journal: Transactions on Machine Learning Research
2026 January
Publication status: Published - 2026
Externally published
