Are Multi-Armed Bandit Tests Superior To A/B Tests?

Multi-armed bandit tests seek to aggressively optimise content

 

What are multi-armed bandit tests?

 

Multi-armed bandit tests (MABs) use an algorithm to proactively seek out the best performing experience and aggressively optimise towards it, increasing the average conversion rate during the test. This means you can earn and learn simultaneously, as traffic is automatically switched to the variant with the highest conversion rate.

What is a bandit?

 

A bandit is another name for a slot machine. Imagine you are in Vegas with a limited budget and limited time to play a selection of slot machines with different pay-outs. Multi-armed bandit (MAB) tests seek to maximise your winnings by working out which slot machine has the highest pay-out and automatically adjusting resources (i.e. traffic) to optimise revenues.

 

This is very different from A/B testing, where traffic is evenly split between variants. For example, with two variants (a control and a challenger), each receives 50% of all traffic from the beginning to the end of the test.

 

In a typical multi-armed bandit test, 10% of traffic is split equally between the variants (the exploration phase), while the remaining 90% is sent to the best performing variant (the exploitation phase). Multi-armed bandit tests also provide the option to weight traffic according to the estimated value of each variant from the beginning of the test. This is simply a best-guess approach; for example, a three-variant experiment might start with weights based on prior performance estimates rather than an even split.
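
To make the mechanics concrete, here is a minimal sketch in Python of the kind of epsilon-greedy allocation described above. It assumes two variants, a fixed 10% exploration share and illustrative conversion rates; the names and helper functions are hypothetical rather than taken from any particular testing tool:

```python
import random

# A 10% exploration rate: one visitor in ten is assigned at random,
# the rest go to the variant with the best observed conversion rate.
EPSILON = 0.10

variants = ["control", "challenger"]
impressions = {v: 0 for v in variants}
conversions = {v: 0 for v in variants}

def conversion_rate(v):
    # Guard against division by zero before a variant has been served.
    return conversions[v] / impressions[v] if impressions[v] else 0.0

def choose_variant():
    if random.random() < EPSILON:
        return random.choice(variants)          # exploration phase
    return max(variants, key=conversion_rate)   # exploitation phase

def record_result(variant, converted):
    impressions[variant] += 1
    conversions[variant] += int(converted)

# To weight variants from the start, seed impressions and conversions
# with prior (best-guess) estimates instead of zeros.

# Example: simulate 1,000 visitors with assumed true rates.
TRUE_RATES = {"control": 0.05, "challenger": 0.07}
for _ in range(1000):
    v = choose_variant()
    record_result(v, random.random() < TRUE_RATES[v])

print(impressions)  # traffic drifts towards the stronger variant
```

Commercial bandit tools typically use more sophisticated algorithms, such as Thompson sampling or upper confidence bounds, but the earn-and-learn principle is the same.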

 


 

What about statistical confidence?

 

Multi-armed bandit tests aggressively optimise for the best performing variant, sending very little traffic to the worst performing variants once the exploitation phase begins. However, this usually happens before full statistical confidence is reached, so we may not be able to tell whether a variant really is the worst performer or whether its poor showing is just down to chance. This means an MAB test requires a lot more traffic to reach full statistical confidence for poorly performing variants, and thus takes longer to produce a conclusive result.
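
A back-of-envelope calculation shows the scale of the problem. This sketch assumes a standard two-proportion power calculation, an illustrative 5% baseline against a 6% challenger, and the 10% exploration share described earlier:

```python
import math

# Rough sketch (not a rigorous power analysis): approximate per-variant
# sample size to detect a lift from 5% to 6% conversion at 95%
# confidence and 80% power, then the total visitors needed for the
# worst-served arm to reach that sample under each method.
z_alpha, z_beta = 1.96, 0.84   # two-sided 5% alpha, 80% power
p1, p2 = 0.05, 0.06
p_bar = (p1 + p2) / 2

numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
             + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
n_per_arm = numerator / (p2 - p1) ** 2

print(f"sample needed per variant: ~{n_per_arm:,.0f}")
# A/B test: a 50/50 split fills each arm at half the traffic rate.
print(f"A/B total visitors:        ~{2 * n_per_arm:,.0f}")
# Bandit: the losing arm sees only ~5% of traffic (half of the 10%
# exploration share), so it fills roughly ten times more slowly.
print(f"bandit total visitors:     ~{n_per_arm / 0.05:,.0f}")
```

Because the losing arm receives only around 5% of visitors, the bandit needs roughly ten times the total traffic of an A/B test to accumulate the same evidence about it.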

 

What Assumptions Do Multi-Armed Bandit Tests Make?

 

Most multi-armed bandit tests use algorithms which make a number of assumptions about conversion rates.

      • Conversions are observed immediately after a variant is served. This means that multi-armed bandit tests are not suitable for email marketing or any channel where there is a significant time-lag between a customer seeing a variant and the conversion occurring.
      • Conversion rates are fairly constant and don't change significantly over time. If your conversion rate fluctuates substantially due to factors such as the weather or other seasonal effects, then MABs may not be appropriate (the toy simulation after this list illustrates why).
      • Samples in MABs are independent of each other, so one visitor's behaviour does not influence another visitor's likelihood of converting.
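
To see why the second assumption matters, here is a toy simulation with made-up conversion rates in which the better variant flips between weekdays and weekends. A greedy bandit judging variants on their lifetime averages keeps backing the weekday winner even on days when the other variant converts better:

```python
import random

# Toy illustration of non-stationary conversion rates (made-up numbers):
# variant A converts better on weekdays, variant B on weekends. A greedy
# bandit using lifetime averages locks onto A and keeps ~95% of traffic
# there even when B is the better experience.
EPSILON = 0.10
rates = {
    "weekday": {"A": 0.06, "B": 0.04},
    "weekend": {"A": 0.04, "B": 0.08},
}
impressions = {"A": 0, "B": 0}
conversions = {"A": 0, "B": 0}

def observed(v):
    return conversions[v] / impressions[v] if impressions[v] else 0.0

for day in range(28):                        # four weeks of traffic
    day_type = "weekend" if day % 7 >= 5 else "weekday"
    for _ in range(1000):                    # visitors per day
        if random.random() < EPSILON:
            v = random.choice("AB")          # exploration
        else:
            v = max("AB", key=observed)      # exploitation
        impressions[v] += 1
        conversions[v] += random.random() < rates[day_type][v]

print(impressions)  # heavily skewed towards A despite B's weekend edge
```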

What Are The Benefits of Multi-Armed Bandit Tests?

 

      • Exploit winning variants: MABs generally achieve a higher average conversion rate during the test period. They reduce the opportunity cost of testing by allowing a smooth transition from exploration to exploitation to increase revenues (a simulation after this list illustrates the effect).
Image: an A/B test compared with a multi-armed bandit test

 

      • Automate optimisation: MABs allow you to automate the optimisation process with machine learning so that low performing variants can be dropped and traffic can be channelled towards the best revenue generating variant.
      • Continuous optimisation: Where you are frequently adding or removing variants, MABs provide a flexibility that A/B testing is not designed for. If you want to add new variants to replace low performing experiences during the testing process, MABs facilitate this. They also work well for targeting specific ads or content to customer segments.
      • Innovation tests: MABs perform best when there is a very large difference in the conversion rates of different variants. MABs are therefore best suited for optimisation when you have radically different experiences, such as in an innovation test, where you might expect to see big differences in the conversion rates of each variant.
      • Persuasive profiling: MABs are suitable for persuasive profiling, so that you can identify what content works best for a particular personality trait.
      • Time is not a priority: When you are not in a rush to identify the best performing variant and want to optimise the average conversion rate, MABs can be a suitable tool.
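
The first benefit can be demonstrated with a quick simulation comparing the average conversion rate achieved during the test itself. This is a hypothetical sketch assuming true rates of 5% and 10%, the kind of large gap where bandits shine; real traffic is noisier:

```python
import random

# Toy comparison of average conversion DURING a test: a 50/50 A/B split
# versus the epsilon-greedy policy sketched earlier, run on the same
# assumed true conversion rates.
TRUE_RATES = {"control": 0.05, "challenger": 0.10}
VISITORS, EPSILON = 20_000, 0.10

def run(policy):
    imp = {v: 0 for v in TRUE_RATES}
    conv = {v: 0 for v in TRUE_RATES}
    total = 0
    for _ in range(VISITORS):
        if policy == "ab" or random.random() < EPSILON:
            v = random.choice(list(TRUE_RATES))   # even split / explore
        else:                                     # exploit best so far
            v = max(TRUE_RATES,
                    key=lambda x: conv[x] / imp[x] if imp[x] else 0.0)
        imp[v] += 1
        converted = random.random() < TRUE_RATES[v]
        conv[v] += converted
        total += converted
    return total / VISITORS

print(f"A/B average conversion:    {run('ab'):.2%}")      # ~7.5%
print(f"bandit average conversion: {run('bandit'):.2%}")  # nearer 10%
```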

Disadvantages:

 

      • Traffic greedy: MABs require more traffic and more time to reach full statistical confidence. If you are not concerned about the average conversion rate during the test and need a speedy but conclusive result, then A/B testing is probably the right methodology for you.
      • Needs large differences: When there is little difference between the conversion rates of the variants, the benefit of multi-armed bandits disappears. This is a concern because, as we know from experience, it is almost impossible to predict how much of a difference a new design or heading will make to the conversion rate. The danger is that our own subjective opinions and biases come into play here, which is precisely what experimentation is designed to avoid.
      • More room for error: Because bandits begin switching traffic before full statistical confidence is reached, there is a greater danger that a variant performing better purely by chance will be selected as the winning experience. Conversely, a variant that is initially performing poorly due to chance is more likely to be starved of traffic by the algorithm, and revenue lost.
      • Implementation is not easy: Setting up MABs is technically challenging; you may need a data scientist to advise on how to integrate and scale the code, and a developer to program the test.

When Should You Use Multi-Armed Bandit Tests?

 

Multi-armed bandit tests are best suited to the following campaigns:

      • When you want to simultaneously explore and exploit an optimisation opportunity.
      • Optimising radically different variants where there is a need to begin exploiting the best performing experience without delay.
      • Headlines and short-term campaigns, particularly if the content has a limited time span.
      • Automation for scale.
      • Targeting to understand how different customer segments respond to content.
      • Combining optimisation with attribution. By including a bandit algorithm on your website and in your automated call-centre software, you can seek to optimise across multiple touch points.

Conclusion:

 

Multi-armed bandit tests are not an alternative to A/B testing as they are designed for different roles in the optimisation toolkit. A/B testing is excellent for conducting online experiments to identify the best performing variant with a high degree of statistical confidence. MABs are more suited to continuous optimisation and short-term campaigns where the objective is to achieve a high average conversion rate. Ideally you would want to use both A/B and MAB testing as part of a comprehensive optimisation program.

 

Thank you for reading my post. Please leave feedback below because it helps us improve the quality of our content.

  • About the author: Neal (@northresearch) provides web analytics and CRO consultancy services and has worked in many sectors, including financial services, online gaming and retail. He has helped brands such as Hastings Direct, Manchester Airport Group Online and Assurant Solutions Ltd to improve their digital marketing measurement and performance.

 

By Neal Cole

Founder of Conversion Uplift and an expert in digital marketing and conversion rate optimisation.

