Discrete Upper Confidence Bound Algorithm in HOL

This is a development version of this entry. It might change over time and is not stable. Please refer to release versions for citations.

Abstract

This project formally verifies the Discrete Upper Confidence Bound (UCB) algorithm in Isabelle/Higher-order Logic (HOL), focusing on its probabilistic guarantees and regret bounds. The work extends Isabelle/HOLs probabilistic framework and explores verification of discrete-time bandit models following [1]. This research advances the formal verification of probabilistic algorithms in reinforcement learning.

License

BSD License

Topics

Computer science/Machine learning

Session Discrete-UCB

MSc_Project_Discrete_Prop15_1
Discrete_UCB_Step1
Discrete_UCB_Step2
Discrete_UCB_Step3

Abstract

License

Topics

Session Discrete-UCB

Cite

Download