Query-Ad-Clustering
Query-ad-clustering is the algorithm Lu, Pál, and Pál propose for Lipschitz contextual bandits: cluster similar contexts, discretise the action space, and run an upper-confidence bandit strategy per context cluster.
Key points
- The algorithm works in phases. At the start of each phase, it partitions the query metric space into clusters of bounded diameter and chooses a finite representative subset of ads that covers the ad metric space [src-020].
- When a query arrives, the algorithm identifies its query cluster and selects from the representative ads using an upper confidence index based on empirical payoff plus an uncertainty radius [src-020].
- The intuition is practical: if nearby queries have similar click-through behaviour, one can pool evidence inside a query cluster instead of learning a separate ad model for every possible query [src-020].
- The algorithm becomes more refined over phases as cluster radii shrink, trading coarse early learning for finer context/action resolution later [src-020].
- The paper proves an upper regret bound for query-ad-clustering and a matching lower-bound style result showing essential optimality for finite spaces or bounded Euclidean spaces [src-020].
Related entities
_(none yet)_
Related concepts
- Lipschitz Contextual Bandits
- Contextual Bandits
- Exploration-Exploitation Trade-off
- Sponsored Search Ad Ranking
Source references
- [src-020] Tyler Lu, David Pál, Martin Pál — “Contextual Multi-Armed Bandits” (AISTATS 2010)