This may be the most pervasive myth in conversion optimization. It’s too easy (and effective) for a blogger to write a post of “101 Conversion Optimization Tips” or “150 A/B Test Ideas that Always Work.”
These articles make conversion optimization seem like a checklist: run down the list, try everything, and get massive uplifts. Totally wrong.
Let’s say you have a list of 100 “proven” tactics (by “science” or “psychology,” even). Where should you start? Would you implement them all at once? Your site would look like a Christmas tree. (Even if you tested the 100 tactics one by one, the average A/B test takes about four weeks to run, so it’d take you 400 weeks, or roughly 7.7 years, to run them all.)
Some changes will work. Some will cancel each other out. Some will make your site worse. Without a rigorous, repeatable process, you’ll have no idea what had an impact, so you’ll miss out on a critical part of optimization: learning.
2. “Only three in 10 tests is a winner, and that’s okay.”
A good process is key to optimization. If you have a good process, you can prioritize work and know which tactics to try (or ignore) and the order in which to try them.
You would also have a good indication of where the problems are because you did the research—the heavy lifting.
“If you aren’t planning tests to understand where the wins are—playing your CRO version of battleships or minesweeper to narrow the odds on the next one (and improve your odds of not hitting a mine, which is always undervalued when considering ‘wins’)—then WTF are you doing?
Coming up with 10 ideas with no reason for them, firing them up, and hoping one sticks? Yeah, no wonder you accept a 3 in 10 ‘hit’ rate as good. If I have 3 ‘wins,’ I want 6 ‘I learned where the win is next time/what to avoid next time’ and 1 ‘fuck knows, should have worked, no idea why it didn’t.’
The ‘come up with ideas, fire them out, and see what sticks’ was what we used to call the Shotgun of Shit at Maxymiser. Yeah, you’d get a win. Statistically, you would be unlikely to not have a true win, a false win, and a false loss in a general scattergun approach.
But you don’t know which is which. And the team and the growth is paid for by repeating that success. If you load up another 10 and get a completely different set of win rates, you can’t plan or grow the team, so you can’t improve your ability. And it all ends up spiraling down.”
“Yes, you can get value from the things that didn’t work, and in fact you often get far more value from the patterns of things that don’t work than those that do. That statement, however, is about individual options, not the larger tests themselves.
You should never accept a non-winning test as a good thing, and you should be doing everything to make sure you are shooting for 100% of your tests to produce a clear, actionable, valid winner.
Every test that you run that does not have a clear and meaningful winner toward your organization’s bottom line screams that you have allowed biases to filter what you test and how you test. It is a sign that you are just spinning your wheels, and that you are making no effort to tackle the real problems that plague a program.
There is never a time that a failed test is acceptable, and there is never a time when you should be okay with a 25%, 50%, or 12.5% (the industry average) success rate on your tests.
Every test, positive or negative, is a chance for optimization. Not just of the things on your site, but for your own practices and for your organization.”
3. “Split testing is conversion rate optimization.”
“Testing is a common component of CRO. And I am glad that some online marketers have belatedly gotten the ‘testing religion,’ as I call it. However, it is only a tiny part of optimization.
Basically, testing is there to validate the decisions that you have made, and to make sure that the business has not suffered as a result. But even if you are doing testing well—with proper processes, traffic sources, statistics, and methods for coming up with ideas—the focus should still not be exclusively on testing activities.
If you simply worry about testing velocity or other tactical outcomes, you will miss the larger opportunity. Unless you see CRO as a strategic activity that has the potential to transform your whole business, you run the risk of it becoming simply a tactical activity that is a part of your online marketing mix.”
4. “We tried CRO for a few weeks. It doesn’t work.”
Often, companies will throw in the towel if results don’t appear immediately.
Tony Grant offered a hypothetical client quote that all too many optimizers are likely familiar with: “I tested my CTA colour once and saw absolutely no increase in sales. A/B testing isn’t working for us.”
“Most nascent website optimization projects die a quick death in the first months when tests fail to deliver results. This is unfortunate for the companies that end up walking away, and very good for their competition.
The conclusion of nervous executives is, ‘Optimization doesn’t work for us,’ or ‘Our site is already awesome.’ My answers are, ‘It will and it’s not,’ in that order.
The problem is not that they’re doing testing wrong, or that their audience is somehow immune to testing. The problem is more likely that they chose the wrong things to test.
We offer an aggressive 180-day program of testing. Even with our big brains and lab coats, it takes a lot of work to figure out what to test in those early months. We expect to have inconclusive tests, even with our deep evaluation.
I say to the optimization champions in companies all over: Be patient, diligent, determined. Push through those early inconclusive tests before giving up the fight.”
“[Conversion optimization’s] importance and influence on business growth should mean that business leaders, decision makers, and stakeholders all truly understand and value conversion optimization.
Unfortunately, we are still nowhere near this being the case. Why is this poisonous? As the industry slowly matures, there will continue to be a huge amount of poorly planned and executed testing delivered either in-house or through agencies.
All this is going to do is drive more misconceptions about the real art of conversion optimization, such as ‘it’s just tweaking things’ or ‘it’s just a tactic we need to use as part of our overall strategy.’
Someday, we will get to the point where businesses truly understand the importance of optimization to help evolve and grow their business. This day is still a long way away, unfortunately. Education and enlightenment, therefore, still have a huge role to play in helping reach this day.”
Yet, too often, optimization is (mis)used to validate those biased opinions.
“The point of optimization is to maximize returns and to increase efficiency, and fundamentally focusing on validating an opinion of anyone—be it the optimizer or anyone else—is going against that purpose. Even worse, it stops people from really exploring or understanding optimization beyond that scope.
The only thing that is possible from overly focusing on opinion is to limit the scope of what you test and to allow the limitations of your imagination to control the maximum outcome of your testing program.
It is scary and great how often things that people think should never win do and how rarely things that everyone loves are the best option—yet so many experts tell you to focus only on what they want you to test or what their proven program for creating hypotheses tells you to test.”
Similarly, we’re tempted to tell post-test stories. When we find a winner, we use storytelling to validate our opinions.
“People will turn to user feedback, surveys, user observations, user labs, or just their own intuition, to divine the great reason why something happened. This constant need to figure out why makes people go on great snipe hunts, which result in wasted resources and almost always false conclusions.
They do all of this despite the fact that there is overwhelming evidence that people have no clue why they do the actions they do. In fact, people do an action and then rationalize it afterwards.
Even worse, people pretend that this augury provides additional value to the future. In fact, all it is doing is creating false constraints going forward.
The issue is not that you are right or wrong—you may very well be right. The issue is that the evidence used to arrive at that conclusion is faulty, and there is no evidence to support any conjecture you make.
One of the most important rules, and definitely the one that takes the most enforcing, is ‘no storytelling.’ It is not for me to say that you are right or wrong, but the very act of storytelling allows the perception of understanding when there is no evidential reason for it.”
6. “Just do what your competitors are doing.”
The internet is brimming with conversion optimization case studies, so it’s tempting to fall into the trap of stealing others’ test ideas and creative efforts.
“I hear this all the time. ‘Competitor X is doing Y. We should do that, too,’ or ‘X is the market leader, and they have Y, so we need Y.’
There are two things wrong with this reasoning: First, the reason they set up Y (menu, navigation, checkout, homepage layout, etc.) is probably random. Often, the layout is something their web designer came up with without doing a thorough analysis or testing. (In fact, they probably copied another competitor.)
Second, what works for them won’t necessarily work for you. You’d be surprised by the number of people who actually know their shit. Plenty of decisions are (maybe) 5% knowledge and 95% opinion.”
Most case studies don’t supply full numbers, so there’s no way of analyzing the statistical rigor of the test. Plenty are littered with low sample sizes and false positives—one reason why most CRO case studies are BS.
Even if the test were based on good statistics, you’re ignoring context. Your competition has different traffic sources, branding, customers, etc.
“Even if the case studies are based on sound statistical analysis (unlikely), and even if your competitors have a well-optimized design (also unlikely), you’d still be copying the end result and ignoring the process that created it.
We once created a new landing page for a SaaS company, which got a huge amount of press. It doubled their conversion rate from visit to paying customer. But then the copycats came—they copied the page structure and style, and sometimes even parts of the copy and HTML.
Unsurprisingly, this didn’t work out for them. In fact, one even complained to me that the original result must have been flawed.
But this is the same behavior as the cargo cults of Melanesia. After WWII, some South Pacific islanders copied the uniform and actions of the military personnel who had been based there, expecting it to result in planes appearing full of cargo.
At its core, conversion optimization is a simple process: Find out why people aren’t converting, then fix it. By copying others, you ignore the first part and are less likely to succeed at the second.”
Also, if you’re spending time copying competitors or reviewing shady case studies, you’re not spending time on validated learning, exploration, or customer understanding.
7. “Your tool will tell you when to stop a test.”
Tools are getting more robust and offering a more intuitive understanding of how to run a test. But you still have to learn basic statistics.
Statistical knowledge allows you to avoid Type I and Type II errors (false positives and false negatives) and to reduce imaginary lifts. There are some heuristics to follow when running tests:
Test for full weeks.
Test for two business cycles.
Set a fixed horizon and sample size for your test before you run it. (Use a calculator before you start the test.)
Keep in mind confounding variables and external factors (e.g., holidays).
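As a rough illustration of the "use a calculator" step, the sketch below estimates a per-variation sample size with the common normal-approximation formula for comparing two conversion rates, then rounds the run time up to full weeks. The function names, the 5% baseline, the 10% relative lift, the 20,000 weekly visitors, and the two-week minimum are assumptions chosen for illustration, not figures from this article.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation for a two-sided test.

    Normal-approximation formula for two proportions; real calculators
    may use slightly different variants and give slightly different numbers.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # rate we hope the variation reaches
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def weeks_to_run(n_per_variation, weekly_visitors, variations=2, min_weeks=2):
    """Round the duration up to full weeks (assumed minimum: two weeks)."""
    weeks = math.ceil(n_per_variation * variations / weekly_visitors)
    return max(weeks, min_weeks)


# Hypothetical inputs: 5% baseline, hoping to detect a 10% relative lift.
n = sample_size_per_variation(baseline_rate=0.05, relative_lift=0.10)
weeks = weeks_to_run(n, weekly_visitors=20_000)
print(f"{n} visitors per variation, about {weeks} full weeks")
```

The point of fixing these numbers before launch is that the stopping rule no longer depends on what the results look like midway through the test.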
“The ‘run experiments for two weeks’ idea started with large tech companies like Microsoft, Amazon, Google, and Facebook. Sample sizes even for a week of data are often in the hundreds of millions and, hence, sizing based on power is largely irrelevant.
Two weeks was the standard amount of time a single customer’s behavior fluctuated across a short period. For example, a Google user can have drastically different behavior from Monday to Friday and Friday to Saturday, but very rarely from Monday 1 to Monday 3.
A big part of this is because of the frequency that users returned to these sites, which allowed behavior to be easily modeled and, hence, predictable/cyclical.
The other issue is that most (if not all) of these big tech companies have built cohort analysis into their experimentation platforms. For example, it is not recommended you look at a test directly after it completes, especially when behavior differs in a single sample across time.
These platforms continue to collect data from customers that entered the experiment late (Day 14) for an additional 14 days so that all customers are in the test for the same period of time.
When smaller companies attempt to replicate this, they get all sorts of problems: under-powered experiments, totally unreliable data because of the cohort issue, a tendency for experiments to ‘sign swap,’ etc.
Optimally, organizations should first understand the cyclical nature of their customers’ behavior and design their experiment time based on that—could be 7 days, 2 weeks, 1 month, 2 months.
Obviously, a longer run for an experiment raises other issues, like the impact of cookie expiry. There are risks that have to be balanced.”
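The cohort idea described above, where late entrants are followed for the same number of days as early ones, can be sketched roughly as follows. This is a minimal illustration, not how any particular platform implements it; the `Participant` record, the zero-based day numbering, and the 14-day window are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    variant: str
    enrolled_day: int                    # day the visitor entered the test (0-based)
    converted_day: Optional[int] = None  # None if they never converted


def converted_in_window(p, window_days=14):
    """Count a conversion only if it falls inside the visitor's own window."""
    if p.converted_day is None:
        return False
    return 0 <= p.converted_day - p.enrolled_day < window_days


def analysis_ready_day(enrollment_days=14, window_days=14):
    """First day on which the last enrollee's observation window is complete."""
    return (enrollment_days - 1) + window_days


def conversion_rates(participants, window_days=14):
    totals, wins = {}, {}
    for p in participants:
        totals[p.variant] = totals.get(p.variant, 0) + 1
        wins[p.variant] = wins.get(p.variant, 0) + int(converted_in_window(p, window_days))
    return {variant: wins[variant] / totals[variant] for variant in totals}


participants = [
    Participant("A", enrolled_day=0, converted_day=3),
    Participant("A", enrolled_day=0, converted_day=20),   # outside this visitor's window
    Participant("B", enrolled_day=13, converted_day=20),  # late entrant, inside window
    Participant("B", enrolled_day=13),
]
print(conversion_rates(participants))
print(analysis_ready_day())
```

Without the per-visitor window, early entrants would get up to four weeks to convert and late entrants only a few days, which is one way the cohort issue can distort results.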
8. “You can run experiments without a developer.”
Your testing tool’s visual editor can do it all, right? Nope.
“It’s very easy to underestimate the complexity of the front end given the rise of JS frameworks and recent browser changes.”
“Saying that non-devs can set up A/B tests now is misleading and wrong. Yes, you can change button color and make copy changes easily, but that’s not what one should do. You should test what actually matters and what actually needs to be tested (based on your conversion research).
There might be cases where you need to test something quickly and easily. If that’s really what you think would be best, visual editors will help you do it.
But you are limited by what the visual editor allows you to test, so you will start testing stuff that makes no difference. Instead of testing what matters, you test what your lack of skills enables you to test, which wastes everyone’s time and money.
Your tests should tackle real issues. Your test variations should cover a wide range of ideas—and no idea for a treatment should be limited by the lack of coding skills.
The moment you try to set up a more complicated test via visual editors, there’s close to a 90% chance that it messes up the code. Variations will look wonky and won’t work properly on half the browsers and devices. QA is hard as it is, and now you let people with no coding skills—and most likely clueless about QA testing—set up tests?”
9. “Test results = long-term sales.”
Not every winning test will prove, in the long run, a winning implementation. Too often, as Fiona De Brabanter lamented, tests return a “ridiculously high increase in conversion rates but not the actual sales to show for it.”
So why the imaginary lifts? Often, it results from stopping tests too soon.
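The effect of stopping at the first significant reading can be checked with a small simulation. The sketch below runs A/A tests (both arms share the same true conversion rate), applies a two-proportion z-test after every batch of visitors, and compares how often a test looks significant at any check with how often it is significant at the planned end. The batch size, number of checks, 10% conversion rate, and seed are illustrative assumptions, so the exact percentages will differ from any published figure.

```python
import math
import random


def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic using the pooled conversion rate."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_b / n_b - conv_a / n_a) / se


def simulate_aa(n_tests=300, looks=20, visitors_per_look=200,
                rate=0.10, z_crit=1.96, seed=7):
    """Return (share significant at ANY look, share significant at the end)."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _test in range(n_tests):
        conv_a = conv_b = n = 0
        ever_significant = False
        for _look in range(looks):
            # Both arms draw from the same true rate: any "winner" is noise.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_look))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_look))
            n += visitors_per_look
            if abs(z_score(conv_a, n, conv_b, n)) > z_crit:
                ever_significant = True
        peeking_hits += ever_significant
        fixed_hits += abs(z_score(conv_a, n, conv_b, n)) > z_crit
    return peeking_hits / n_tests, fixed_hits / n_tests


peeking_rate, fixed_rate = simulate_aa()
print(f"significant at some check: {peeking_rate:.0%}")
print(f"significant at planned end: {fixed_rate:.0%}")
```

Checking only at the planned end keeps false positives near the nominal 5%, while stopping at the first significant check pushes the rate well above it, and more checks push it higher still.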
“You should know that stopping a test once it’s significant is deadly sin #1 in A/B testing. Some 77% of A/A tests (same page against same page) will reach significance at a certain point.
You want to test as long as possible—at least one purchase cycle. The more data, the higher the statistical power of your test! More traffic means you have a higher chance of recognizing your winner on the significance level you’re testing on!
Small changes can make a big impact, but big impacts don’t happen too often. Most of the time, your variation is slightly better, so you need a lot of data to be able to notice a significant winner.
But, if your test lasts and lasts, people tend to delete their cookies (10% in two weeks). When they return to your test, they can end up in the wrong variation.
So, as the weeks pass, your samples pollute more and more—and will end up with the same conversion rates. Test for a maximum of four weeks.”
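A toy model can make the pollution argument concrete. It assumes that 10% of visitors clear cookies every two weeks (the figure quoted above), that each of them is re-randomized 50/50 on return, and that a visitor in the "wrong" arm converts at the other arm's rate. These are simplifying assumptions for illustration only, and the function names and the 5% and 6% conversion rates are hypothetical.

```python
def crossover_fraction(weeks, deletion_rate_per_two_weeks=0.10):
    """Share of visitors assumed to have landed in the other arm by `weeks`."""
    cleared = 1 - (1 - deletion_rate_per_two_weeks) ** (weeks / 2)
    return cleared / 2  # half of re-randomized visitors switch arms


def observed_rates(rate_a, rate_b, crossover):
    """Blend each arm's true rate with the other arm's rate."""
    seen_a = (1 - crossover) * rate_a + crossover * rate_b
    seen_b = (1 - crossover) * rate_b + crossover * rate_a
    return seen_a, seen_b


# Hypothetical true rates: control 5.0%, variation 6.0% (a 20% relative lift).
for weeks in (2, 4, 8, 12):
    c = crossover_fraction(weeks)
    seen_a, seen_b = observed_rates(0.050, 0.060, c)
    print(f"week {weeks:>2}: observed lift {(seen_b - seen_a) / seen_a:.1%}")
```

Under these assumptions the measured gap shrinks by a factor of (1 - 2 × crossover), so the longer the test runs, the closer the two arms appear to be.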
Other factors can create a gap between test results and post-implementation performance.
“You could argue that changing a headline is making multiple changes, since you are changing more than one word at a time. So there, let’s say the headline is the smallest meaningful unit.
But maybe for another situation, the smallest meaningful unit is a page—and there are a few radically different page designs to test. It all depends on your situation.”
At its heart, optimization is about balance.
“It’s ‘test one concept/lever at a time’ (or collection of related concepts) but not test only one change at a time. It’s a balance—a pragmatic choice between the clarity of the answer and the speed/cost of testing.
If it’s, ‘We’re going to test free shipping and adding videos and a new banner headline and measure it on revenue,’ you’re going to struggle.
Test the concept: ‘The product detail page needs more clarity in the action area—remove/move social shares, manufacturer-provided fluff text, and twelve different ways to check out.’ Yeah, lots of changes but one concept—clarity.
It’s pragmatic optimization. We can pretend it’s scientific, but there are too many moving parts for it to be isolated in a truly controlled ‘lab.’
But, we can use the scientific process to narrow the odds, play fast and loose when we need to move fast, be super careful and test repeatedly where we need to edge toward the balance, and establish where ‘too far’ is and where ‘just far enough’ is.”
For the integrity of an industry, it’s important to know the destructive myths and mistruths. That way, those beginning to learn will have a clearer path, and businesses new to optimization won’t get discouraged by disappointing (i.e., nonexistent) results.
There are many more myths that we probably missed. What are some destructive ones that you’ve come across?
An earlier version of this article, by Alex Birkett, appeared in 2015.