[2411.05979] Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits

[Submitted on 8 Nov 2024 (v1), last revised 11 Mar 2025 (this version, v2)]

View a PDF of the paper titled Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits, by Ha Manh Bui and 2 other authors

Abstract: By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To better balance exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that uses $\sigma^2_t$, i.e., an upper bound on the reward noise variance at round $t$, to improve the uncertainty quantification of the UCB, which in turn improves regret. We provide an oracle version of our algorithm, characterized by an oracle variance upper bound $\sigma^2_t$, and a practical version with a novel estimator for this variance bound. Theoretically, we provide a rigorous regret analysis for both versions and prove that our oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandits setting. Empirically, our practical method has similar computational efficiency while outperforming state-of-the-art techniques, with better calibration and lower regret across multiple standard settings, including synthetic, UCI, MNIST, and CIFAR-10 datasets.
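
The abstract does not spell out the update rules, so the following is only a minimal sketch of the general idea of a variance-aware linear UCB, not the paper's algorithm. It assumes a fixed feature vector `x` standing in for the learned deep representation, and it uses inverse-variance weighted ridge regression, a standard technique swapped in here for illustration. The class name `VarianceWeightedLinUCB` and the parameters `lam` and `beta` are hypothetical, as is the choice of weight `1 / sigma2`.

```python
import math


class VarianceWeightedLinUCB:
    """Illustrative linear UCB with inverse-variance weighted ridge regression.

    Hypothetical sketch: features are fixed here, whereas the paper learns
    them with a neural network.
    """

    def __init__(self, dim, lam=1.0, beta=1.0):
        self.dim = dim
        self.beta = beta  # exploration multiplier (assumed constant)
        # A_inv is the inverse of lam * I + sum_t x_t x_t^T / sigma2_t.
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # sum_t reward_t * x_t / sigma2_t

    def _matvec(self, v):
        return [sum(row[j] * v[j] for j in range(self.dim)) for row in self.A_inv]

    def width(self, x):
        # Confidence width beta * sqrt(x^T A^{-1} x).
        Ax = self._matvec(x)
        return self.beta * math.sqrt(sum(xi * ai for xi, ai in zip(x, Ax)))

    def ucb(self, x):
        # Weighted ridge estimate theta = A^{-1} b, plus the confidence width.
        theta = self._matvec(self.b)
        mean = sum(t * xi for t, xi in zip(theta, x))
        return mean + self.width(x)

    def update(self, x, reward, sigma2):
        # Observations with a smaller variance bound get a larger weight.
        w = 1.0 / sigma2
        Ax = self._matvec(x)
        denom = 1.0 + w * sum(xi * ai for xi, ai in zip(x, Ax))
        # Sherman-Morrison rank-one update of the (symmetric) inverse.
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= w * Ax[i] * Ax[j] / denom
        for i in range(self.dim):
            self.b[i] += w * reward * x[i]
```

In this sketch, an observation with a small variance bound shrinks the confidence width along its feature direction more than a noisy one does, which is the intuition behind using $\sigma^2_t$ to tighten the UCB.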

Submission history

From: Ha Manh Bui
[v1]
Fri, 8 Nov 2024 21:24:14 UTC (13,768 KB)
[v2]
Tue, 11 Mar 2025 02:32:48 UTC (14,816 KB)
