Marketing Bandit Optimisation
Marketing bandit optimisation is the use of multi-armed bandits to continuously allocate campaign traffic across subject lines, send times, creative, offers, channels, and other variables while balancing exploration and exploitation.
Key points
- Hightouch frames the marketer’s dilemma as “stick with winners or test everything”: a campaign with 5 subject lines, 4 send times, 3 offer types, and 2 creative templates already creates 120 possible combinations [src-025].
- Traditional A/B testing struggles with this combinatorial space because testing one or two variables at a time can take weeks or months, often after the campaign’s peak moment has passed [src-025].
- Multi-armed bandits solve the allocation problem by shifting send volume dynamically as performance data arrives, making thousands of small adjustments rather than using fixed traffic splits [src-025].
- The send-time example progresses from an equal 20 percent allocation across five send times, to early exploitation of a 2 pm winner, to a refined split across 2 pm and 5 pm that still reserves some traffic for exploration [src-025].
- Common marketing uses include email campaign components, send-time optimisation, channel orchestration, offer strategy, creative/content testing, and combinations of dimensions such as content theme, offer type, send frequency, and channel preference [src-025].
- The limitation is that standard multi-armed bandits find the best option on average. For true 1:1 personalisation, Hightouch positions Contextual Bandits as the next layer because they combine bandit allocation with individual customer data [src-025].
- Hightouch’s contextual-bandit article expands that limitation: a “winning” strategy can work for price-sensitive Marcus while alienating premium Sarah, so average optimisation must be augmented with customer context [src-026].
- Braze frames the same marketing pattern as real-time experimentation embedded in customer engagement platforms: multiple bandits can run in parallel, each tied to a campaign, channel, or optimisation goal [src-027].
- In Braze’s use cases, arms can include creative, offers, channels, timing, CTA design, subject-line themes, notification tone, onboarding forms, or retention incentives [src-027].
- Braze’s Intelligent Selection layer treats each campaign interaction as feedback that can update confidence across variations and shift exposure toward higher-performing options while the campaign is live [src-027].
- Statsig’s parallel A/B testing article provides the fixed-allocation counterpart to this combinatorial problem: multiple campaign or product variables can be tested simultaneously when interaction effects are modeled instead of avoided by default [src-029].
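The dynamic send-time allocation described in the points above can be sketched as a Thompson-sampling bandit: each send time keeps a Beta posterior over its open rate, and traffic shifts toward the arm whose posterior draws win most often. The send times, "true" open rates, and round count below are illustrative assumptions, not figures from the Hightouch article.

```python
import random

random.seed(42)

SEND_TIMES = ["10am", "12pm", "2pm", "3pm", "5pm"]
# Hypothetical true open rates, unknown to the bandit; 2 pm is the real winner.
TRUE_RATES = {"10am": 0.05, "12pm": 0.08, "2pm": 0.20, "3pm": 0.06, "5pm": 0.12}

# Beta(1, 1) priors: opens (successes) and misses (failures) per arm.
opens = {t: 1 for t in SEND_TIMES}
misses = {t: 1 for t in SEND_TIMES}
pulls = {t: 0 for t in SEND_TIMES}

ROUNDS = 2000
for _ in range(ROUNDS):  # each round = one email sent
    # Thompson sampling: draw from each arm's posterior, send at the best draw.
    sampled = {t: random.betavariate(opens[t], misses[t]) for t in SEND_TIMES}
    choice = max(sampled, key=sampled.get)
    pulls[choice] += 1
    if random.random() < TRUE_RATES[choice]:  # simulated open event
        opens[choice] += 1
    else:
        misses[choice] += 1

share = {t: pulls[t] / ROUNDS for t in SEND_TIMES}
print(share)
```

Early rounds spread sends roughly evenly (the exploration phase in the example); as evidence accumulates, the 2 pm arm absorbs most of the volume while weaker times keep a residual trickle of exploratory sends.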
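The contextual-bandit limitation noted above (a single "winning" offer can suit a price-sensitive customer while alienating a premium one) can be illustrated with a minimal tabular contextual bandit: one epsilon-greedy learner per customer segment. The segments, offers, and conversion rates are hypothetical and are not drawn from the Hightouch articles.

```python
import random

random.seed(7)

OFFERS = ["discount", "exclusive_access"]
SEGMENTS = ["price_sensitive", "premium"]
# Hypothetical conversion rates: the best offer differs by segment,
# so optimising on the average would mis-serve one group.
TRUE_RATES = {
    ("price_sensitive", "discount"): 0.25,
    ("price_sensitive", "exclusive_access"): 0.05,
    ("premium", "discount"): 0.04,
    ("premium", "exclusive_access"): 0.20,
}

# Per-(segment, offer) counts: [wins, trials], seeded at an optimistic 0.5.
stats = {(s, o): [1, 2] for s in SEGMENTS for o in OFFERS}

def choose(segment, eps=0.1):
    """Epsilon-greedy on the segment's own win-rate estimates."""
    if random.random() < eps:  # explore
        return random.choice(OFFERS)
    return max(OFFERS, key=lambda o: stats[(segment, o)][0] / stats[(segment, o)][1])

for _ in range(4000):
    seg = random.choice(SEGMENTS)          # a customer from some segment arrives
    offer = choose(seg)
    converted = random.random() < TRUE_RATES[(seg, offer)]
    stats[(seg, offer)][0] += converted
    stats[(seg, offer)][1] += 1

best = {s: max(OFFERS, key=lambda o: stats[(s, o)][0] / stats[(s, o)][1])
        for s in SEGMENTS}
print(best)
```

Keying the counts on the segment is the simplest form of "adding customer context"; production contextual bandits replace the lookup table with a model over many customer features, but the allocation logic is the same.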
Related entities
Related concepts
- Multi-Armed Bandits
- Dynamic Traffic Allocation
- Exploration-Exploitation Trade-off
- AI Decisioning
- Agentic Marketing
- Contextual Bandits
- Customer Feature Matrix
- Intelligent Selection
- Parallel A/B Testing
- Treatment Interaction Effects
Source references
- [src-025] Hightouch — “Under the hood of AI Decisioning, part three: Multi-armed bandits”
- [src-026] Hightouch — “Under the hood of AI Decisioning, part four: Contextual bandits”
- [src-027] Team Braze — “What is a multi-armed bandit? Smarter experimentation for real-time marketing”
- [src-029] Allon Korem and Oryah Lancry-Dayan — “You can have it all: Parallel testing with A/B tests”