In many ways conversion rate optimization is really decision optimization. If you are doing it right, you will constantly discover that what you thought mattered actually doesn’t.
Optimization work and testing challenge everything you think you know about marketing and your users. This can lead to some introspection about why you think you know things and what led you to hold those erroneous beliefs in the first place.
Enter the world of cognitive psychology and cognitive biases
The number of ways that people can and do misuse information or confirm their already held beliefs is simply mind boggling. A quick journey down the hundreds of known biases and fallacies can lead one to question how anyone ever gets anything right.
While these biases do impact just about everything we do, there are ways to help mitigate their impact and to leverage their existence to improve your program. The first key, however, is to identify which ones you will be dealing with the most and to put systems in place to exploit their existence. With that in mind, here are the top 5 biases that you will face in your time running an optimization program: what they are, and what to do to help mitigate their impact.
Keep in mind that these biases impact every part of every day in your life, work and otherwise, and you should keep an eye out for this behavior in all cases, not just optimization.
Bias #1: Congruence Bias
We love to create false comparisons and then pretend they answer our primary question.
What it is: Congruence bias is the name that is given when we create false dichotomies of choices and then think we have answered a question when we have really just chosen between a limited set of options.
I know that sounds wordy but what this really means is that you create a false choice such as one banner versus another or one headline versus another, and then test to see which is better. In reality there are hundreds of different ways that you can go but we become blinded by our myopic view of an answer. You create a false choice of what is there versus what you want to see happen.
Even worse with this bias is that you will get an answer; it just won’t mean much. Just because the new headline you wanted to try is better, it doesn’t mean that it was a good choice. It also means that you can’t read too much meaning into limited comparisons, because you are in reality just validating a flawed hypothesis. It is the job of the researcher or tester to compare all valid alternative hypotheses, yet we get stuck on the things we want to see win and as such we get trapped in this bias on an almost hourly basis.
Example: You feel you need to improve your landing page so you choose to test out a new call to action. In reality you need to figure out what matters most and try out many different options.
What you should do: One of the key disciplines of a successful program is to always be looking to maximize resources and to challenge what people think is right. This means avoiding congruence bias by making sure you are comparing a large range of possible answers and by challenging the current mindset. Always make sure that you avoid only testing what you think will win and also ensure that you are designing tests around what is feasible, not just the most popular opinion about how to solve the current problem.
Bias #2: Dunning-Kruger Effect
The more you don’t know, the more sure you are that you know everything.
What it is: We all know that person we work with who always thinks they know all the answers and is always sure they have it all figured out, yet we all know they have no clue what they are talking about. That is the Dunning-Kruger effect in action, where people who are incompetent don’t know that they are incompetent, and more importantly they are more likely to be super confident in their actions because they don’t know what they don’t know. This is why you can have people talk about how amazing their experience is and how great their results are when in fact they are a negative influence on results for everything they touch.
All jokes about industry “experts” aside, this is a big deal because there is also a direct correlation between sociopathic tendencies and positions of power in most businesses, and this is one of if not the primary cause of this. The corollary effect of people who know a lot about a subject being less confident leads to cases of persuasion winning over functional knowledge. If you ever want to know why people still have “I think/I believe/I feel” conversations this is why.
Example: Obviously the call to action is the most important, why would you ever test the layout of the page? The more sure you are of something, the more important it is that you challenge that assumption.
What you should do: This is where having a rational decisioning system in place prior to starting a test is so important. You have to be able to put every idea through a system that maximizes impact to the bottom line of the business and not just their egos. It doesn’t matter what someone thinks will happen or what they were trying to accomplish, it only matters if the change impacted the bottom line of the business.
By enforcing such a stringent discipline for measurement, and by making sure that you avoid only validating ideas, you can ensure that the limits of someone’s own understanding do not also limit the possible outcome of your optimization efforts.
Bias #3: Narrative Fallacy
We love to answer why even though it is impossible to know why something happened from any form of available information.
What it is: People love a good story, and there is no more requested story than why. In all honesty, 90% of the time wasted in our industry is people professing some deep understanding of why something happened, or creating elaborate stories or in-depth presentations persuading you that they have a deep insight into the persona of your users.
Why did a certain headline win? Obviously because it was persuasive. Why didn’t people go to your site? Because it wasn’t relevant…
Anytime you hear someone explain why something happened, or anytime you try to take too much meaning out of a test result, you are staring the Narrative Fallacy in the face. People feel empty when they get a result that goes against their beliefs or against conventional wisdom, so they inherently start searching for why. In some organizations executives don’t care about the results, only the why. Beware anytime you or anyone else starts explaining why something happened.
The problem is that it is impossible to tell why. Even if you ask someone to their face you are only going to get their rationalized version of what happened, not the actual series of things that led to that outcome. You cannot say that one event is related to another, or is the cause of another, when you get only 1 data point from a test, and that is the maximum you can get from a single result. Whether both metrics went up, both went down, or they moved inversely to each other, the maximum amount of data is a single data point.
Example: You know that the CTA that you tested won because it obviously was more relevant to the context of what your users were looking for. In reality you only know that it won and anything else you say past that can only lower the value of future actions.
What you should do: Never, and I cannot stress this enough, never explain why something happened. You may feel 100% certain (see Dunning-Kruger above) that you know why, but avoid going down that road at all times. Not only are you guaranteed to be basing your conclusion on no real information, but by opening this door you are allowing others to also start getting into storytelling. Stop these conversations in their tracks, as they only distract from making the correct decisions on how to act on the data and what to do next.
In many cases this is the act that will cause the most cognitive dissonance with your group, as it goes against just about all human nature. This is why it is also important that you establish rules up front and spend your time educating people on how to act on data before you ever get to the point of actual action. Success and failure are determined before you launch a test, not by the results after.
Bias #4: Graveyard of Knowledge
Winners tell us nothing yet they are the ones we always turn to for information.
What it is: Go to any bookstore and look at the business section and all you will see is books written by people from majorly successful companies. We love to hear from entrepreneurs and leaders who have gone the extra distance and have really achieved real results.
What you won’t see is books by people who didn’t succeed, who didn’t get to be a rockstar. The funny part is that those people we don’t talk to, the ones who didn’t succeed, offer so much more information than those who did. Not only are they a much larger source of information, but those who do succeed often downplay luck and random success, and also overstate the importance of small things that most likely did not play a part in their success.
Also known as Survivorship Bias, the graveyard of knowledge is a bias that makes us only look at those that are there at the end and try to apply all knowledge from that group. We look at those that did purchase for common traits, not those that did not. We create personas of the people who buy our products, but don’t really give much consideration for the people who don’t buy, despite the fact that they represent just about every person on this planet. We get so caught up in looking at what was there at the end that we lose massive amounts of information and lose all the value from that information.
Example: You really want to target your CTA based on whether the user is coming from Google or not. In reality the message could work for anyone and you might need to change the experience based on a completely different factor, like browser or time of day.
What you should do: Never presume that you have a read on your users at the exclusion of other possibilities. Be it in the content you make or the way you think you are creating a page, always look at the various possibilities and allow the data to dictate the direction. This is especially true for personalization and for segmentation. You have to always serve all versions to every user and make sure that you look at all feasible segments, even if you really want to target a message based on a behavior or source. You have to make sure that you are maximizing outcomes, not maximizing your own opinions.
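To make "look at all feasible segments" a little more concrete, here is a minimal sketch of one way to do it. It assumes every visitor could have been served any variant, that you logged a few segment fields per visit, and that revenue per visitor is the number you compare. The field names, the helper `revenue_per_visitor_by_segment`, and the data are all made up for illustration.

```python
from collections import defaultdict

def revenue_per_visitor_by_segment(visits, dimension):
    """Return {segment_value: {variant: revenue per visitor}} for one dimension.

    visits: list of dicts with 'variant', 'revenue', and segment fields.
    Visitors who bought nothing are included with revenue 0.0.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for visit in visits:
        segment = visit[dimension]
        totals[segment][visit["variant"]] += visit["revenue"]
        counts[segment][visit["variant"]] += 1
    return {
        segment: {
            variant: totals[segment][variant] / counts[segment][variant]
            for variant in counts[segment]
        }
        for segment in counts
    }

# Hypothetical log: both variants were served regardless of source or browser.
visits = [
    {"variant": "A", "revenue": 10.0, "source": "google", "browser": "chrome"},
    {"variant": "B", "revenue": 10.0, "source": "google", "browser": "safari"},
    {"variant": "A", "revenue": 0.0, "source": "direct", "browser": "safari"},
    {"variant": "B", "revenue": 20.0, "source": "direct", "browser": "safari"},
    {"variant": "A", "revenue": 20.0, "source": "direct", "browser": "chrome"},
    {"variant": "B", "revenue": 0.0, "source": "google", "browser": "chrome"},
]

# Check every dimension you logged, not only the one you wanted to target on.
by_source = revenue_per_visitor_by_segment(visits, "source")
by_browser = revenue_per_visitor_by_segment(visits, "browser")
```

A handful of rows like this proves nothing on its own. The point of the design is only that the same breakdown runs over every logged dimension, so the segment you were attached to gets no special treatment.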
Bias #5: The Halo Effect
The more we like one trait the more we like all traits of that person or object.
What it is: We all love the good looking pages that really resonate with us. We just know that the better looking page is the best performer, despite the fact that the most profitable page on the internet is a big white page with a search box in the middle. We also trust the experts that talk the best or who connect with us in a way that makes the information really stick. All of these are examples of the Halo Effect, where you apply positive or negative feelings to other traits based on a single trait of the object or person.
We really do trust good looking people and good looking pages more, despite the fact there is no correlation to actual outcomes. We assume that because some expert can speak well that this must mean that their information is better than those that aren’t as eloquent. We really do choose the tallest presidential candidate in almost all elections even though that should have no correlation with their ability to lead.
Example: Every page analysis that has ever taken place. You don’t like the look of the page, so everything, especially that CTA, needs to go. In reality your impression of a page, or how much you like any part of the experience, has no bearing on the value of that item to performance.
What you should do: Let people vote beforehand for what they think will do best and second best out of all the options you test. Do this even a couple of times and it will become apparent that no one, be it the most seasoned marketer or the new intern, can tell the performance of anything by just looking at it.
While some people are slightly better than others in choosing an option that is slightly better, very few if any will even come close to a 10% success rate in choosing the best option. Confronting this bias head on will also force you to test out things that people consider “ugly” or that go against their vision of the site, and when they win it allows you to really open up what the real best user experience should be.
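The voting exercise is easy to keep score on. The sketch below is one possible way to tally it, not a prescribed method: the voter names, variant labels, and outcomes are invented, and `prediction_hit_rate` is a hypothetical helper.

```python
def prediction_hit_rate(tests):
    """Return {voter: fraction of tests where their pick was the actual winner}.

    tests: list of dicts with 'winner' (best-performing variant) and
    'votes' (voter name -> variant they predicted would win).
    """
    hits = {}
    seen = {}
    for test in tests:
        for voter, pick in test["votes"].items():
            seen[voter] = seen.get(voter, 0) + 1
            hits[voter] = hits.get(voter, 0) + (1 if pick == test["winner"] else 0)
    return {voter: hits[voter] / seen[voter] for voter in seen}

# Hypothetical record of four tests, with votes collected before launch.
tests = [
    {"winner": "C", "votes": {"senior_marketer": "A", "intern": "C"}},
    {"winner": "B", "votes": {"senior_marketer": "A", "intern": "D"}},
    {"winner": "E", "votes": {"senior_marketer": "E", "intern": "A"}},
    {"winner": "D", "votes": {"senior_marketer": "B", "intern": "A"}},
]

rates = prediction_hit_rate(tests)
```

Collect the votes before the test launches and share the tally afterwards. Once people see their own hit rate next to everyone else's, the conversation tends to shift from who is right to what should be tested.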
It is easy to list all of these biases that shape our view of the world and the impact of the work we do. This look didn’t even get into actual experiment design biases like observer-expectancy or selection bias, because at the end of the day those don’t play as big a role as the ones that we experience every moment of every day. People as a whole are actually a very passionate and capable group, but they limit themselves so much by what they shut out from their consciousness and how they choose to see the world. By creating discipline and not letting people get trapped by these biases, you allow them to achieve the results their other natural talents should make possible.
Most people don’t think that their decisions are flawed. They think they are being rational in the moment, though they will observe irrational behavior in others or when reflecting on past decisions. In reality nothing we think about or do fails to be influenced by a multitude of shortcuts that our brain makes to allow us to get through the day. We have to be aware of these and create systems that mitigate the damage, or else we will never come close to achieving what we can and should.
One last thing. Think that this is all about other people and that you are better than these silly biases? Keep one simple fact in mind. The smarter you are, the more likely you are to fall victim to biases. So which is it: are you incompetent, or should you start really putting effort into stopping biases from damaging your organization?
Join the conversation
I’m an analytics addict. To the extent sometimes of overanalysing rather than doing. Which doesn’t help. With that preamble aside, my concern, even though I agree with pretty much all you say here, is that the conclusion is test everything everywhere on everyone. Removing your own assumption on ‘why’, if you change a parameter somewhere, retest everything everywhere on everyone. The parameters are infinite. Where do you start? And how do you track the infinite variables in each of a multi-step process where each step also impacts the others?
I think the first thing to understand is that optimization and analytics are very very different disciplines, and one of the first ways that most organizations fail is that they don’t understand that and tackle testing from an analytics perspective. One of the key differences is that you must have a single success metric, and you must only look at that metric. The metric is not what you think will happen or what you are trying to drive, it is the bottom line impact to revenue for your site. Trying to get more people to use internal search? Use RPV only. Trying to get more newsletter signups, use RPV only. You are trying to drive other actions because the belief is that those actions drive additional revenue, so see if that is true. In the case it is, then RPV gives you the same answer and a cleaner look at impact, in the case that it is not true the only way you know is RPV.
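As a rough illustration of judging a test on RPV alone, here is a minimal sketch. It assumes RPV means total revenue divided by all visitors who saw a variant, including those who spent nothing. The variant names and numbers are invented, and `revenue_per_visitor` is a hypothetical helper.

```python
from collections import defaultdict

def revenue_per_visitor(visits):
    """Return {variant: total revenue / number of visitors}.

    visits: iterable of (variant, revenue) pairs, one pair per visitor.
    Visitors who bought nothing must still be listed, with revenue 0.0.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for variant, revenue in visits:
        totals[variant] += revenue
        counts[variant] += 1
    return {variant: totals[variant] / counts[variant] for variant in counts}

# Hypothetical test: the second variant was built to drive newsletter signups,
# but it is scored on revenue only, exactly like the control.
visits = [
    ("control", 0.0), ("control", 40.0), ("control", 0.0), ("control", 0.0),
    ("more_signups", 0.0), ("more_signups", 0.0),
    ("more_signups", 30.0), ("more_signups", 0.0),
]

rpv = revenue_per_visitor(visits)
```

Signups never enter the calculation. If more signups really do drive revenue, that shows up in RPV anyway, and if they don't, RPV is the only number that tells you so. A real test would also need enough traffic to separate the variants from noise, which this sketch does not address.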
As for where to start, the key is discovery. You can easily design tests to break down feasible options and figure out where to apply resources. Whether it is an inclusion/exclusion test, a partial factorial MVT, or any number of other tests that help tell you where to apply resources and what to do, you need to think in those terms instead of jumping towards one thing that sticks out to you and/or just shotgunning everything. It’s a discipline to go where the data tells you and not just where you want to go.
Let’s add one more: Confirmation Bias
Most of us will agree that we only recall the times when some luck saved us in the nick of time in a troublesome situation. Over time, we tend to believe it will happen every time, but that’s just our bias: we are so overjoyed by the successes that we hardly remember the many times we did not succeed.
Confirmation bias is really a subgroup of biases, though there is also the one with that specific name. While there are definitely some selective memory problems, I think congruence bias, which falls under that umbrella, is more prevalent, as is observer-expectancy bias.
The biggest problems are not forgetting when things failed; they are when we think we have succeeded but have really found a very small increase and wasted resources. While I definitely think people should never be happy with a failed test, the reality is that most of what people consider a success is more of a failure than a test that did not produce a winner. Any form of validation is especially problematic in that regard, which is why you should be very wary of how most groups use hypotheses. That is where congruence, observer-expectancy, and many other confirmation biases really do the most damage.
Although I am a fan of most articles by you guys, this one had a negative connotation that I didn’t enjoy reading. The information is still good though. I just thought it could be framed better.
I agree with Randy. I enjoy most of the articles here but this one… didn’t seem to make sense as a whole. The response Andrew gave about RPV makes perfect sense to me though. Maybe I’m an idiot but I’d recommend taking another look at the article and seeing if there was a way to make it have more of a point with something actionable to walk away with.
The Dunning-Kruger effect, whilst appealing to the intuition, is not backed up by evidence.
Interesting article, one I had not come across before. That being said, they only offer some possible alternative hypotheses, which I would love to see tested out. I do think that, especially in the world of marketing, there is a major lack of accountability, so that people have no frame of reference for the real impact of their work. That leaves people to judge their impact and skill based on the false belief of others in their impact, which can lead to a large group of unaware people.
It is also important to note that even that article still agrees with the concept:
This doesn’t mean that the meta-cognitive explanation preferred by Dunning, Kruger and colleagues can’t hold in some situations; it very well may be that in some cases, and to some extent, people’s lack of skill is really what prevents them from accurately determining their standing in relation to others.
I think marketing is ripe for its exploitation, but even if you think it doesn’t hold true, the reality is that there is not a normal distribution of outcomes (I definitely suggest Nassim Taleb, especially his book The Black Swan), and even if there were, the outcomes are not reflective of efficiency, only grouped incompetence.
I also think it is important to note that it is not that people are impacted by all of these biases at the same time, only that at any given time they are impacted by a number of biases. I tried to highlight the most damaging and common, but in reality the list is around a thousand or so biases and everyone is impacted by multiples of them at any given time. The real key is the discipline needed to fight their impact and a degree of accountability for all actions in terms of efficiency and business impact.
By far the most common and most damaging thing I have seen from working with so many different organizations (well over 300) is groups that think they are succeeding at testing because they run a number of tests, but who in fact are just spinning their wheels and causing damage to their organizations. No matter what name we give that extremely common phenomenon, the thing to take away is that the people it impacts are not aware, and are not in any way trying to change. At the end of the day, it is that disconnect, which they are not aware of and which is nearly impossible to fight head on, that causes so much of the empty cotton candy type advice, case studies, and feedback that our industry is full of. It takes real understanding and a desire to fight the problem head on. Whether it is one specific bias or another is not as relevant as the system you put in place to assist rational decision making and accountability for all efforts, especially optimization.
Interesting as always, and your (Andrew) stuff to me is particularly so because it seems to run counter to many things you hear about CRO, even on CXL.
Here’s what I’m getting at: you seem to be quite opposed to certain forms of “hypothesis testing” based on narratives or interpretations of qualitative data. I understand how hypothesis testing proper can quickly turn into “opinion verification” and produce suboptimal results, but I’d be interested in hearing how you fit research about the customer in your overall framework.
Is it entirely useless? Is it a tool you use only informally to come up with testing variations, or sets of variations for tests, after which you don’t care about the interpretation ? If so, how do you “drill down” on further tests ? Do you simply create variations at random based on factor sensitivity ?
The best way to think about this is in terms of cost, function, and mental models. Every person in an org and in CRO has their own mental model about how things work. It doesn’t matter if that model is based on heuristics, math, customer research or astrology, the point is that each of those models help people understand the world around them and to come up with ideas about what the best experience is.
The key is to manage costs to build those models while increasing the function of each test. In other words, the key is to have as many different models in play as possible so you can add accountability to each idea and to weigh the best option, not just a better one. If it adds a quantifiable increase in cost to build those models (research, time, entire data systems) then there is a major reason to avoid that research. If there is minor additional cost, then it is not majorly negative, as long as you avoid congruence bias and story telling (the #2 thing, right behind multiple success metrics, that I am against).
As a tester I think it is vital to never be tied to one way of thinking about the experience. Allow others to have their mental models, and if one constantly beats all others, then learn from that pattern, but I honestly have never seen any model live up to an even 10% success rate (which the 14% test success rate helps confirm). By allowing as many different inputs as possible and staying unbiased, you allow for patterns to emerge, especially when you manage the beta of ideas and executions to achieve exactly that. What led others to reach a conclusion doesn’t matter as long as you are tied to the functional outcome, not the input.
Doing this also allows you to ignore or move past all the problems with qualitative research, be it the fact that people have no clue why they act, to selection and question bias. Everyone has their opinions, the key is to make sure that testing is never hampered by them or resources are not wasted in pursuit of them.
Great answer, thanks for taking the time to write it.
That seems to dovetail fairly well with Nate Silver’s stuff about thinking like a fox vs. like a hedgehog (and Peep’s point about CROs being polymaths): better to be able to look at many small models that give you a lot of things to test, however “contradictory” they may seem, than to apply the One Big Model (especially unknowingly) and get blindsided.
Would love to read more about the way you “manage the beta of ideas and executions” to allow patterns to emerge: are you talking about developing a risk/cost-adjusted view of the returns of different “mental models” over time, and observing correlations between them? Or did you mean that in a stricter sense, where “functional outcome” of an idea (e.g. change such and such element of a page) is then completely detached from any mental model?
The burden of proof has to be on the mental model and not the actual change. Objectively, how you arrived at a variant is irrelevant; only the influence of the change matters. You may be right about why it had its influence, but there is no data that will back that up and you are far more likely to be wrong. Conceptually it is possible for a mental model to be proven out, but I can say in my 13+ years and with 300+ sites that I have yet to see it, though you should always be looking for black swans.
As to your comment about polymaths, one of the things that Peep and I fully agree on is that you need so many different disciplines and interests to be even average in CRO. I think the biggest problem with most programs is that they are run as an extension of analytics or an extension of marketing. Honestly, in my experience the people who make the worst optimizers come from those two areas, because they have the hardest time letting go of what they thought they knew. Optimization is so much more than that and requires such a deep understanding of so many different topics, especially statistics and cognitive psychology, that any confusion with those disciplines is a disservice to both sides of the comparison. Even worse, people put resources towards where they are familiar despite limited to zero actual revenue impact (I am looking at you, social media). CRO requires such discipline and skill to not be a waste of time, but when done right it blows the revenue impact of all other efforts for the entire organization out of the water. I wish more people understood it at least well enough to put the resources where they should go and not where they are used to.
Clearly there’s a dearth of educational material at that strategic level vs. the plethora of “tactical” articles that pop up every day on the web. Luckily for us, there’s you and CXL! :)
tnx a lot. your site is very Excellent and informative