Discrete Upper Confidence Bound Algorithm in HOL

Arjan Faber 📧

October 25, 2025

This is a development version of this entry. It might change over time and is not stable. Please refer to release versions for citations.

Abstract

This project formally verifies the Discrete Upper Confidence Bound (UCB) algorithm in Isabelle/Higher-order Logic (HOL), focusing on its probabilistic guarantees and regret bounds. The work extends Isabelle/HOLs probabilistic framework and explores verification of discrete-time bandit models following [1]. This research advances the formal verification of probabilistic algorithms in reinforcement learning.

License

BSD License

Topics

Session Discrete-UCB