Testing coherence and identifying winners in dueling bandits: theory and algorithms / written by Björn Haddenhorst ; supervised by Prof. Dr. Eyke Hüllermeier. Paderborn, 2023
Inhalt
- List of Publications
- List of Symbols
- List of Abbreviations
- List of Algorithms
- Introduction
- The (Multi-)Dueling Bandits Setting
- Modeling Assumptions in (Multi-)Dueling Bandits
- Outline and Contribution of this Thesis
- Related Work
- Notation and Conventions
- Fundamentals
- Probabilistic Prerequisites
- Concentration Inequalities and Convergence Results
- Sequentially Testing for the Mode of a Biased Coin
- Sequentially Testing for the Mode of a Biased Die
- Empirical Evaluation
- A Change-of-Measure Argument
- Discussion and Related Work
- Graph-Theoretical Considerations
- Learning Problems in Dueling Bandits
- Testification for the Existence of a Condorcet Winner
- Lower Bounds
- Solutions for CW Testification
- Proof of Theorem 4.6
- Solutions to Other CW-Related Problems
- A Reduction to Pure Exploration Multi-Armed Bandits
- Empirical Evaluation
- Discussion and Related Work
- Testing for Stochastic Transitivity
- Impossibility Results for Several Types of Stochastic Transitivity
- Lower Bounds for WST Testing
- Approach I: Conducting Multiple Binomial Tests
- Enhanced Online WST Testing
- Instance-wise Upper Bounds and Exploiting Negligibility of Edges
- Proof of Thm. 5.8
- Approach II: A Likelihood-ratio Based Approach
- The Likelihood-ratio Test Statistics
- Passive WST Testing
- Proof of Thm. 5.13
- Active Online Testing
- Sequential Update Formulas for the LRT Statistics
- Asymptotic Behaviour of the LRT Statistics
- Proof of Thm. 5.17
- Proofs of Proposition 5.18 and Lemma 5.20
- Notes on Other Types of Stochastic Transitivity
- A Reduction to Pure Exploration Multi-Armed Bandits
- Empirical Evaluation
- Discussion and Related Work
- Learning Problems in Multi-Dueling Bandits
- Conclusion and Outlook
- Remaining Proofs for Chapter 2
- Remaining Proofs for Chapter 3
