As a marketer and optimizer, it’s only natural to want to speed up your testing efforts. So the question is: can you run more than one A/B test at the same time on your site?
Let’s look into why you shouldn’t, and why you should, run multiple tests at once.
What you should consider when running multiple simultaneous tests
Let’s first set the stage.
A user comes to your home page, which makes them part of test A. They then move on to the category page and become part of test B. Next, they go to the product page, which shows them test C. After that, they add a product to the cart and are entered into test D. Finally, they complete checkout, where test E is in full effect.
The user ends up buying something, and a conversion is registered. This raises two questions:
- Did any of the variations in those tests influence each other, and thus skew the data? (Interactions)
- Which variation of which test gets the credit? Which of the tests really nudged the user into buying something? (Attribution)
Andrew explains why running multiple separate tests at the same time might be a bad idea:
A testing tool vendor, Maxymiser, argues that running multiple tests at the same time results in low accuracy:
It’s possible that the “interactions” between variants in the two tests are not equal to each other and uniformly spread out.
The argument is that interaction effects between tests often go unnoticed, yet they can have a major impact on your test conclusions. According to Maxymiser, instead of running simultaneous tests, it’d be better to combine them and run them as a multivariate test (MVT).
As with most things in marketing and A/B testing, not everyone fully agrees:
Matt Gershoff recommends you figure two things out before determining whether to run multiple separate tests at once:
If the split between variations is always equal, doesn’t it balance itself out?
According to Optimizely:
Even if one test’s variation is having an impact on another test’s variations, the effect is proportional on all the variations in the latter test and therefore, the results should not be materially affected.
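The proportionality argument can be checked with a little arithmetic. The sketch below is only an illustration: the conversion rates are invented, the 50/50 split for test A is an assumption, and the function names are hypothetical. It computes the conversion rate you would observe for each variation of test B while test A runs at the same time, once when A's effect is proportional across B's variations and once when the two tests interact.

```python
def observed_rates_for_test_b(cell_rates, share_a1=0.5):
    """Average each B variation's conversion rate over the visitors of test A.

    cell_rates[(a, b)] is the (made-up) true conversion rate of visitors who
    saw variation `a` of test A and variation `b` of test B.
    share_a1 is the share of traffic that test A sends to A1 (assumed 50%).
    """
    weights = {"A0": 1.0 - share_a1, "A1": share_a1}
    return {
        b: sum(weights[a] * cell_rates[(a, b)] for a in weights)
        for b in ("B0", "B1")
    }


def relative_lift(rates):
    """Relative lift of B1 over B0, as a testing tool would report it."""
    return rates["B1"] / rates["B0"] - 1.0


# Case 1: A1 multiplies conversion by 1.2 no matter which B variation is shown.
proportional = {
    ("A0", "B0"): 0.100, ("A0", "B1"): 0.110,
    ("A1", "B0"): 0.120, ("A1", "B1"): 0.132,
}

# Case 2: B1 helps next to A0 but hurts next to A1 (an interaction).
interacting = {
    ("A0", "B0"): 0.100, ("A0", "B1"): 0.110,
    ("A1", "B0"): 0.120, ("A1", "B1"): 0.100,
}

lift_proportional = relative_lift(observed_rates_for_test_b(proportional))
lift_interacting = relative_lift(observed_rates_for_test_b(interacting))

print(f"Proportional case: B1 lift = {lift_proportional:+.1%}")
print(f"Interaction case:  B1 lift = {lift_interacting:+.1%}")
```

In the proportional case, test B reports roughly the same +10% lift it would show on its own. In the interaction case, B1 looks like a loser overall even though it beats B0 by 10% for visitors who saw A0, which is the kind of effect the critics are worried about.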
Some think this model is oversimplified, and the argument also implies that attribution is not important.
The question then is, do you really care about attribution?
You may or may not. If you want to know what really impacted user behavior, and which of the tests (or which combination of tests, something you can explore with a testing tool like Conductrics) was responsible, then attribution does matter.
Are you in the business of science, or in the business of making money?
The success of your testing program depends on three things: the number of tests run (e.g. per year), the percentage of winning tests, and the average impact per successful experiment.
Now if you limit the number of tests you run for the sake of avoiding data pollution, you are also significantly reducing the velocity of your testing.
If your primary goal is to figure out the validity of a single test, and to be confident in its attribution and impact, then you might want to avoid tests with overlapping traffic. But while you do that, you are not running all those tests that might give you a lift, and as a result you are potentially losing money.
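To make the velocity trade-off concrete, here is a rough back-of-the-envelope sketch. The formula is an assumption for illustration, not something from the sources quoted in this article: it treats every winning test as compounding on the previous ones and assumes the win rate stays the same when tests overlap. The function name and all numbers are hypothetical.

```python
def expected_annual_lift(tests_per_year, win_rate, avg_lift_per_win):
    """Rough yearly lift if each winning test compounds on the previous ones.

    Assumes wins are independent and that running tests at the same time
    does not change the win rate (a simplification).
    """
    expected_wins = tests_per_year * win_rate
    return (1.0 + avg_lift_per_win) ** expected_wins - 1.0


# Made-up numbers: one test at a time versus three tests at a time.
one_at_a_time = expected_annual_lift(tests_per_year=12, win_rate=0.25, avg_lift_per_win=0.05)
three_at_a_time = expected_annual_lift(tests_per_year=36, win_rate=0.25, avg_lift_per_win=0.05)

print(f"One test at a time:    {one_at_a_time:.1%}")
print(f"Three tests at a time: {three_at_a_time:.1%}")
```

With these inputs the slower program ends the year with about a 16% lift and the faster one with about 55%. Noise from overlapping tests would have to wipe out a large share of the wins before the slower program comes out ahead.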
Do you care more about the precision of the outcome, or making money?
Here’s what Lukas Vermeer thinks about running multiple tests at once on the same site:
Lukas also confirmed that he is running simultaneous tests himself.
Choose the right strategy
Ultimately, we want to run more tests, but we also want the results to be accurate. So what options are available to us? Matt Gershoff has done a great job explaining the options, and there is a related article on the Maxymiser blog. Here are the three main strategies to choose from:
1. Run multiple separate tests
Unless you suspect extreme interactions and a huge overlap between tests, this is probably fine, especially if what you’re testing is not paradigm-changing and there’s little overlap.
2. Mutually exclusive tests
Most testing tools give you the option to run mutually exclusive tests, so that no visitor is part of more than one test. The reason you’d want to do this is to eliminate noise or bias from your results. The possible downside is that these tests can be more complex to set up, and they will slow down your testing: each test only gets a share of your traffic, yet each one still needs an adequate sample size.
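If you are curious what mutual exclusion looks like under the hood, here is a minimal sketch. It assumes a hash-based bucketing scheme, which is a common approach but not necessarily how your testing tool does it; the function names, test names, and group label are all hypothetical.

```python
import hashlib
from collections import Counter


def _bucket(key, buckets):
    """Map a string key to a stable bucket index in [0, buckets)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def assign_exclusive_test(user_id, tests, group="exclusion-group-1"):
    """Place a user in exactly one test out of a mutually exclusive group."""
    return tests[_bucket(f"{group}:{user_id}", len(tests))]


def assign_variation(user_id, test_name, variations=("control", "variant")):
    """Pick the user's variation inside the one test they belong to."""
    return variations[_bucket(f"{test_name}:{user_id}", len(variations))]


# Hypothetical group of two tests that should never share a visitor.
tests = ["test_D_cart", "test_E_checkout"]
traffic = Counter(
    assign_exclusive_test(f"user-{i}", tests) for i in range(10_000)
)
print(traffic)  # each test receives roughly half of the visitors
```

The same user always lands in the same test and the same variation, because the assignment depends only on the hashed ID. The printed counts also show the cost: with two exclusive tests, each one sees only about half of your traffic.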
3. Combine multiple tests into one, run as MVT
If you suspect strong interaction between tests, it might be better to combine those tests and run them as an MVT. This option makes sense if the tests you were going to run measure the same goal (e.g. purchase), they’re in the same flow (e.g. running tests on each of the multi-step checkout steps), and you planned to run them for the same duration.
An MVT doesn’t make sense if test A is about an offer and test B experiments with the main navigation, since the interaction between them is likely to be low.
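Combining tests into one MVT means every visitor is assigned to one combination of variations, so interactions show up directly in the results. The sketch below simply enumerates those combinations for a full-factorial design; the test and variation names are made up for illustration, and real tools may offer partial designs as well.

```python
from itertools import product


def mvt_cells(tests):
    """List every combination of variations across the combined tests.

    `tests` maps a test name to its list of variations.
    """
    names = list(tests)
    return [
        dict(zip(names, combo))
        for combo in product(*(tests[name] for name in names))
    ]


# Hypothetical tests on three steps of the same checkout flow.
checkout_tests = {
    "shipping_step": ["control", "short_form"],
    "payment_step": ["control", "trust_badges"],
    "review_step": ["control", "sticky_summary"],
}

cells = mvt_cells(checkout_tests)
print(len(cells))  # 2 * 2 * 2 combinations
for cell in cells:
    print(cell)
```

Three tests with two variations each already produce eight combinations, and each combination needs enough traffic on its own. That is why this option is best reserved for tests where you genuinely expect interaction.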
How to balance testing speed and accuracy of results?
There is a trade-off between testing speed and the accuracy of test results, and there is no single right answer here, although these three experts recommend similar approaches:
As with most things in business and marketing, there’s no easy answer.
In many cases, running multiple simultaneous tests makes sense. Unless you’re testing really important stuff (e.g. something that impacts your business model or the future of the company), the benefits of testing volume will most likely outweigh the noise in your data and occasional false positives.
If, based on your assessment, there’s a high risk of interaction between multiple tests, reduce the number of simultaneous tests and/or let the tests run longer for improved accuracy.