Continuous Focusing Instead of Significance

February 15th, 2023

Other testing systems may wait for results to reach statistical significance.

We know you need to act quickly and the cost of changing an online ad is pretty small. So, we use a different type of statistics (Bayesian) so we can give you the probability each option is the best, even during your tests.

Statistical significance is a helpful concept for when change is hard. It assumes there is a present "standard" version, and then tries to be confident that something else is better before switching to it. It puts the burden of proof on showing that something is worth changing. This is essential for pharmaceuticals, switching food crops, or manufacturing processes.

Happily for us, changing online ads is easy. So we don't need to worry too much about proving that something is worth switching to. Instead, Deciding Data focuses on estimating the probability that each option is the best.

For this type of testing, we've had good results with a two-stage approach that gradually focuses.

Stage 1: Qualification round

These initial tests help us get a general sense of what is resonating. We generally start with small spending per variant (even below the CPA). This is often more informative about clicks because they are more frequent. Clicks aren't purchases or ROAS, so not what we care most about. But, it can give us a sense of what is in the running to be a higher-performing ad. Some ads will have such low CTR, that high performance for purchases is pretty unlikely. So, we can stop testing those ads early.

Stage 2: The contenders

In the second round, we want to spend at a level where we would expect several purchases. It is here that we get a better sense of which ads have the best ROAS or CPA.

Because we ruled out some ads before this point, we can also focus the budget down to fewer options. As a result, we don't always have to spend too much more overall to spend more per variant.

Running stages together

We've also been able to design the system so it can keep track of the different levels of evidence that we have for each ad. This means we can run ads that are in different stages, at the same time. Deciding Data will distribute the spending unevenly to account for this. More budget will go toward the likely contenders and less toward the ones still in the qualification round.