Simpson’s Paradox

Definition:

Simpson’s Paradox (also called the Yule-Simpson effect) refers to a situation where a trend appears in several separate sub-groups of data but disappears or reverses when the sub-groups are combined. It often occurs when members of the population leave or join and when sub-groups have small or very different sample sizes.

Example of Simpson's Paradox

We can fall foul of Simpson’s Paradox if we average percentages rather than recalculating the percentage from the raw data, because averaging ignores the different sample sizes behind each percentage. Sometimes the problem is instead due to comparing apples with oranges by not breaking the data down into its component parts. Here the conversion rates for iOS and Android suggest lower conversion for iOS.

Image of table showing conversion rate by operating system
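As a minimal sketch of the first point, the snippet below contrasts averaging two segment percentages with recalculating the rate from raw totals. The segment names and visitor counts are hypothetical and chosen only to illustrate the effect; they are not the figures from the table above.

```python
# Hypothetical (visitors, conversions) per segment; illustrative only.
segments = {
    "mobile": (9000, 180),  # 180 / 9000 = 2.0% conversion
    "tablet": (1000, 60),   # 60 / 1000 = 6.0% conversion
}


def rate(visitors, conversions):
    """Conversion rate as a fraction between 0 and 1."""
    return conversions / visitors


# Misleading: average the two percentages, ignoring how big each segment is.
naive_average = sum(rate(v, c) for v, c in segments.values()) / len(segments)

# Safer: add up the raw counts first, then recalculate the percentage.
total_visitors = sum(v for v, _ in segments.values())
total_conversions = sum(c for _, c in segments.values())
pooled_rate = rate(total_visitors, total_conversions)

print(f"average of percentages:     {naive_average:.2%}")
print(f"recalculated from raw data: {pooled_rate:.2%}")
```

With these made-up numbers the average of the percentages is 4.0%, while the rate recalculated from the raw data is 2.4%, because the much larger mobile segment dominates the total.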

 

However, if we analyse the data by operating system and device we see a very different picture, one that could influence investment decisions. The conversion rates by device are in fact pretty similar, so we shouldn’t be concerned about iOS overall: it appears to be the type of device, not the operating system, that is influencing the conversion rate.

Image of table showing conversion rate by Operating System and type of device
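The breakdown can be reproduced in a few lines. This is a sketch under stated assumptions: the visitor and conversion counts are hypothetical (the original table figures are not reproduced here) and were picked so that the two operating systems convert at similar rates on each device but have very different device mixes.

```python
# Hypothetical (visitors, conversions) by operating system and device.
data = {
    ("iOS", "mobile"): (8000, 160),      # 2.0%
    ("iOS", "tablet"): (2000, 120),      # 6.0%
    ("Android", "mobile"): (2000, 38),   # 1.9%
    ("Android", "tablet"): (8000, 472),  # 5.9%
}


def conversion_rate(rows):
    """Pooled conversion rate for a list of (visitors, conversions) pairs."""
    visitors = sum(v for v, _ in rows)
    conversions = sum(c for _, c in rows)
    return conversions / visitors


# Overall rate per operating system, recalculated from raw counts.
overall = {
    os_name: conversion_rate([vc for (o, _), vc in data.items() if o == os_name])
    for os_name in ("iOS", "Android")
}

# Rate for each operating system and device combination.
by_device = {key: conversion_rate([vc]) for key, vc in data.items()}

for os_name, r in overall.items():
    print(f"{os_name} overall: {r:.2%}")
for (os_name, device), r in by_device.items():
    print(f"{os_name} {device}: {r:.2%}")
```

In this illustration iOS converts at 2.8% overall against 5.1% for Android, yet iOS is marginally ahead on both mobile and tablet. The overall gap comes entirely from iOS traffic being mostly mobile, which converts at a lower rate on either operating system.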

 

Commercial organisations can easily fall into the trap of Simpson’s Paradox because they rely too heavily on small samples in order to keep research costs to a minimum. Confirmation bias often compounds the problem, as people are too quick to accept data that aligns with their existing beliefs even when the sample size is small.

Consistency of results is often used to justify accepting the data, but consistent results across sub-groups are themselves one of the characteristics of Simpson’s Paradox. With A/B and multivariate tests you could be subject to Simpson’s Paradox if you run a series of small tests that don’t reach full statistical confidence, or if you don’t break the results down sufficiently to understand the underlying drivers of conversion.

The lesson here is twofold. First, don’t stop an experiment until you have a high degree of statistical confidence and a low margin of error. Second, be careful not to generalise from the test result without first exploring the dynamics affecting behaviour. Use web analytics to dig into the data before drawing final conclusions.
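To make "a low margin of error" concrete, here is a rough sketch that estimates the margin of error for a conversion rate using the normal approximation at a 95% confidence level (z = 1.96). The function name and the sample sizes are hypothetical, and the approximation is only one of several ways to calculate this; it becomes unreliable for very small samples or very low rates.

```python
import math


def margin_of_error(visitors, conversions, z=1.96):
    """Approximate margin of error for a conversion rate.

    Uses the normal approximation to the binomial; z=1.96 corresponds
    to a 95% confidence level. Illustrative helper, not a library API.
    """
    p = conversions / visitors
    return z * math.sqrt(p * (1 - p) / visitors)


# Two hypothetical tests with the same 5% observed conversion rate.
small_test = margin_of_error(visitors=200, conversions=10)
large_test = margin_of_error(visitors=20000, conversions=1000)

print(f"small test: 5.0% +/- {small_test:.1%}")
print(f"large test: 5.0% +/- {large_test:.1%}")
```

Both tests show the same 5% conversion rate, but the small test carries a margin of error of roughly three percentage points against roughly 0.3 for the large one. A hundred times more visitors narrows the margin by a factor of ten.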

With multivariate tests, ensure you validate the winning recipe with a robust A/B test. Interactions between the independent variables can create misleading results in multivariate tests, so you should consider running a follow-up A/B test to confirm that the result is sustainable.
