UCB-SC: A Fast Variant of KL-UCB-SC for Budgeted Multi-Armed Bandit Problem
Ryo WATANABE Junpei KOMIYAMA Atsuyoshi NAKAMURA Mineichi KUDO
Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Vol.E101-A
No.3
pp.662-667
Publication Date: 2018/03/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E101.A.662
Type of Manuscript: LETTER
Category: Mathematical Systems Science
Keyword: budgeted multi-armed bandits, regret analysis, upper confidence bound
Summary:
We propose UCB-SC, a policy for budgeted multi-armed bandits. The policy is a variant of the recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally expensive, UCB-SC runs very fast while preserving KL-UCB-SC's asymptotic optimality when the reward and cost distributions are Bernoulli with means around 0.5; we verify this both theoretically and empirically.
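The budgeted setting described in the summary can be illustrated with a minimal sketch: each pull of an arm yields both a reward and a cost, play continues until a budget is exhausted, and the policy picks the arm with the highest optimistic estimate of its reward-to-cost ratio. The index below (empirical ratio plus a UCB-style exploration bonus) is a generic construction for exposition only, not the exact UCB-SC index from the paper; the function name, parameters, and bonus form are illustrative assumptions.

```python
import math

def ucb_sc_sketch(arms, budget, alpha=2.0):
    """Illustrative UCB-style policy for a budgeted multi-armed bandit.

    `arms` is a list of (reward_sampler, cost_sampler) pairs whose
    samplers return values in (0, 1].  NOTE: the index used here is a
    generic optimistic reward-to-cost estimate, NOT the exact UCB-SC
    index proposed in the paper.
    """
    k = len(arms)
    pulls = [0] * k
    rew_sum = [0.0] * k
    cost_sum = [0.0] * k
    spent = 0.0
    total_reward = 0.0
    t = 0  # total number of pulls so far

    def play(i):
        nonlocal spent, total_reward, t
        r, c = arms[i][0](), arms[i][1]()
        pulls[i] += 1
        rew_sum[i] += r
        cost_sum[i] += c
        spent += c
        total_reward += r
        t += 1

    # Initialization: play each arm once (stop early if budget runs out).
    for i in range(k):
        if spent >= budget:
            return total_reward
        play(i)

    # Main loop: pull the arm maximizing the optimistic ratio index.
    while spent < budget:
        def index(i):
            bonus = math.sqrt(alpha * math.log(t) / pulls[i])
            return rew_sum[i] / cost_sum[i] + bonus
        play(max(range(k), key=index))
    return total_reward
```

As is typical in budgeted bandit formulations, the last pull may slightly overshoot the budget; stopping is checked only between pulls.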