de
en
Schliessen
Detailsuche
Bibliotheken
Projekt
Impressum
Datenschutz
Schliessen
Publizieren
Besondere Sammlungen
Digitalisierungsservice
Hilfe
Impressum
Datenschutz
zum Inhalt
Detailsuche
Schnellsuche:
OK
Ergebnisliste
Titel
Titel
Inhalt
Inhalt
Seite
Seite
Im Werk suchen
Testing coherence and identifying winners in dueling bandits: theory and algorithms / written by Björn Haddenhorst ; supervised by Prof. Dr. Eyke Hüllermeier. Paderborn, 2023
Inhalt
List of Publications
List of Symbols
List of Abbreviations
List of Algorithms
Introduction
The (Multi-)Dueling Bandits Setting
Modeling Assumptions in (Multi-)Dueling Bandits
Outline and Contribution of this Thesis
Related Work
Notation and Conventions
Fundamentals
Probabilistic Prerequisites
Concentration Inequalities and Convergence Results
Sequentially Testing for the Mode of a Biased Coin
Lower Bounds
Upper Bounds
Lower Bounds for Multiple Coin Problems
Sequentially Testing for the Mode of a Biased Die
Lower Bounds
Upper Bounds
Empirical Evaluation
Mode Identification of a Coin
A Change-of-Measure Argument
Application: Impossibility Results
Discussion and Related Work
Graph-Theoretical Considerations
Basic Terminology
Deterministic Sequential Testing Algorithms
Properties in Extension
CW in Extension
DSTAs for CW-related Problems
Acyclicity in Extension
DSTAs for Testing Acyclicity of Tournaments
Discussion and Related Work
Learning Problems in Dueling Bandits
Testification for the Existence of a Condorcet Winner
Lower Bounds
Solutions for CW Testification
Naive Approaches
Noisy Tournament Sampling
The Passive Scenario
The Active Scenario
Proof of Theorem 4.6
Solutions to Other CW-Related Problems
A Reduction to Pure Exploration Multi-Armed Bandits
Sticky Track-and-Stop for CW Checking
Sticky Track-and-Stop for CW Testification
Empirical Evaluation
The Active Scenario
The Passive Scenario
Discussion and Related Work
Testing for Stochastic Transitivity
Impossibility Results for Several Types of Stochastic Transitivity
Lower Bounds for WST Testing
Approach I: Conducting Multiple Binomial Tests
Enhanced Online WST Testing
Instance-wise Upper Bounds and Exploiting Negligibility of Edges
Proof of Thm. 5.8
Approach II: A Likelihood-ratio Based Approach
The Likelihood-ratio Test Statistics
Passive WST Testing
Proof of Thm. 5.13
Active Online Testing
Sequential Update Formulas for the LRT Statistics
Asymptotic Behaviour of the LRT Statistics
Proof of Thm. 5.17
Proofs of Proposition 5.18 and Lemma 5.20
Notes on Other Types of Stochastic Transitivity
A Reduction to Pure Exploration Multi-Armed Bandits
Empirical Evaluation
Discussion and Related Work
Learning Problems in Multi-Dueling Bandits
Identification of the GCW in Multi-Dueling Bandits
Lower Bounds on the GCW Identification Problem
Upper Bounds on the GCW Identification Problem
Empirical Evaluation
Comparison of DKWT and PPRT with PAC-Wrapper
DKWT and PPRT versus SELECT, SEEBS and Explore-then-Verify
Comparison of DKWT with Alg. 25
Discussion and Related Work
Conclusion and Outlook
Remaining Proofs for Chapter 2
Remaining Proofs for Chapter 3
Die detaillierte Suchanfrage erfordert aktiviertes Javascript.