I have had this intuition for some time that there is a trade-off between strict design methodologies like A/B testing and creativity. I couldn’t quite put my finger on it until a couple of days ago, when I realized that A/B testing is an evolutionary algorithm. Maybe things became clearer then because evolutionary algorithms are familiar territory to me.
Evolutionary computation consists of a family of algorithms inspired by biological (Darwinian) evolution. You are looking for a solution in a problem domain. You randomly generate an initial population of candidate solutions, measure the quality of each individual (we call that fitness) and then generate the next population by creating variations of the individuals in the current one. The higher the fitness of an individual, the more likely it is to be selected to generate offspring. Hopefully, the best fitness found so far will tend to increase as you generate more populations. There is a very large body of research on this topic and I’m hand-waving many important details. But this is the gist of it.
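To make the loop above concrete, here is a minimal sketch in Python. It is only one of many possible variants: the candidate solutions are single numbers, selection is fitness-proportional, mutation is Gaussian noise, and the names `evolve` and `toy_fitness`, along with all parameter values, are my own assumptions for illustration.

```python
import random


def evolve(fitness, population_size=20, generations=50, mutation_scale=0.5, seed=0):
    """Return the best fitness found so far after each generation."""
    rng = random.Random(seed)
    # Random initial population: each individual is a single float here.
    population = [rng.uniform(-10.0, 10.0) for _ in range(population_size)]
    best_so_far = max(fitness(x) for x in population)
    history = [best_so_far]
    for _ in range(generations):
        scores = [fitness(x) for x in population]
        # Shift the scores so that every selection weight is positive.
        lowest = min(scores)
        weights = [s - lowest + 1e-9 for s in scores]
        # Fitter individuals are more likely to be picked as parents.
        parents = rng.choices(population, weights=weights, k=population_size)
        # Offspring are random variations (mutations) of their parents.
        population = [p + rng.gauss(0.0, mutation_scale) for p in parents]
        best_so_far = max(best_so_far, max(fitness(x) for x in population))
        history.append(best_so_far)
    return history


def toy_fitness(x):
    # A single smooth peak at x = 3, chosen only for illustration.
    return -(x - 3.0) ** 2


history = evolve(toy_fitness)
```

Because `history` tracks the best fitness seen so far, it can never decrease from one generation to the next. How quickly it rises depends on the population size and on how bold the mutations are.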
A/B testing can be seen as one of these algorithms. It sits at the extreme where each population has only two individuals (A and B). You measure their performance somehow, and then choose the best one. Then you try to improve the winner by introducing some tweak: rinse and repeat. Another difference from typical evolutionary algorithms is that in A/B testing, mutations are not random. They tend to be intentional, driven by the mental model of a human designer.
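Seen this way, the A/B testing process could be sketched roughly as follows. This is a toy model under loud assumptions: a "design" is reduced to a single number, `measure` stands in for a real experiment and is assumed to be noise-free, and `cautious_tweak` is a hypothetical stand-in for a designer who only makes small, deliberate changes.

```python
import random


def ab_test_loop(measure, propose_tweak, initial_design, rounds=30, seed=0):
    """Repeatedly pit the current design (A) against a tweaked variant (B)."""
    rng = random.Random(seed)
    champion = initial_design
    for _ in range(rounds):
        # Variant B is a deliberate tweak of the current winner.
        challenger = propose_tweak(champion, rng)
        # Population of two: keep whichever design measures better.
        if measure(challenger) > measure(champion):
            champion = challenger
    return champion


def toy_conversion(design):
    # Hypothetical metric that peaks when the design value equals 3.
    return -(design - 3.0) ** 2


def cautious_tweak(design, rng):
    # The designer never strays far from the current design.
    return design + rng.uniform(-0.2, 0.2)


final_design = ab_test_loop(toy_conversion, cautious_tweak, initial_design=0.0)
```

The winner can only be replaced by something that measures better, so the result is never worse than the starting design. It is also never far from it after a handful of rounds.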
So why do evolutionary algorithms typically use populations much larger than 2? They need to maintain diversity to avoid local maxima. There is a fitness landscape that you are searching, and its shape is unknown. You’re looking to climb to the highest possible point. The problem is that, many times, the landscape is full of small, sub-optimal hills: local maxima. If your search method is too eager to go up as soon as possible, disregarding other options, it will get stuck on one of those hills. A/B testing is extremely eager to climb.
This eagerness arises from two factors: the very small population, which offers no protection against a lack of diversity, and the mutation process. The mutations performed in A/B testing are far from random: they are designed. It is unlikely that they will stray far from current wisdom. After all, they are just small theories created by some human designer with some model of reality. The beauty of evolution is how the crazy mutations sometimes pay off big time.
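A tiny, fully deterministic example can illustrate both halves of this argument. The landscape below is made up for the purpose: a small hill of height 1 at x = 1 and a taller hill of height 3 at x = 6. The greedy `climb` function is my own simplification of an eager search, not a model of any real A/B testing tool.

```python
def landscape(x):
    # Two hypothetical hills: height 1 at x = 1 and height 3 at x = 6.
    small_hill = 1.0 - (x - 1.0) ** 2
    tall_hill = 3.0 - (x - 6.0) ** 2
    return max(small_hill, tall_hill)


def climb(fitness, start, step, max_iters=1000):
    """Greedy search: move to the better neighbour until neither improves."""
    x = start
    for _ in range(max_iters):
        best_neighbour = max((x - step, x + step), key=fitness)
        if fitness(best_neighbour) <= fitness(x):
            break  # No neighbour is better: we are on top of some hill.
        x = best_neighbour
    return x


# Small, cautious steps from x = 0 stop on the small hill near x = 1.
stuck = climb(landscape, start=0.0, step=0.1)

# One large jump lands on the slope of the tall hill instead.
coarse = climb(landscape, start=0.0, step=5.0)

# Careful small steps from there reach the tall hill's peak near x = 6.
refined = climb(landscape, start=coarse, step=0.1)
```

With small steps alone, the search never crosses the valley between the two hills. The large step is crude, but it is the only thing that gets the search onto the taller hill, where careful small steps become worthwhile again.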
So my point is this: A/B testing should probably be reserved for optimizing a design when you already know you’re on the right hill. If you want to create something truly new, you need to go crazy with your mutations. You need to work on solutions that move away from the hills you can see around you. Only when you find a new and interesting hill should you apply strict methodologies like A/B testing to climb it.
