When you release something new—a feature, a web page, an email or anything else your users see—you should celebrate: by making a decision, you've made progress.
But as soon as you release, you need to answer another question: "How much progress have we made?" Every release, whether you set up a formal A/B test or not, is an experiment in which you're hoping to see that your idea, your hypothesis, really does add value to your business. By answering this question, you gauge whether the effort you put into the release was worth it.
In other words, every product and marketing release is an A/B test:
- A (the state before you made a change)
- B (the state after you made the change)
This kind of sequential A/B test is not statistically rigorous, but if you pick a metric—a goal against which you'll measure the effect of your change—it still gives you a signal, and you will learn from it. When you run an "official" A/B test, you simply run the A and B experiences side-by-side rather than in sequence. Running the versions in parallel is valuable because it holds all other factors (like changes in your product, marketing or customer mix) constant, so you can focus on whether your change produced a substantially different outcome for a specific metric.
So instead of "Should we run an A/B test?" the question to ask is "How much should we invest in this A/B test to learn what we need to learn?"
When Should You Run an "Official" A/B Test?
A/B tests sit on a spectrum: you trade less work and faster results against more rigor and more certainty. As you move across the spectrum from speed toward rigor, you'll start to address questions like:
- Is this population the right mix to answer the question I want to answer? (does it reflect your audience?)
- How are participants being allocated to different test buckets/variants? (is it random so results aren't skewed?)
- Does each of the test buckets (A, B, C, etc.) isolate only the variables you care about? (or are you introducing other changes that will complicate the results?)
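A common way to satisfy the random-allocation requirement is to hash each user ID into a bucket, so assignment is effectively random across users but stable for any individual user. This is a minimal sketch, not from the article; the function and experiment names are hypothetical:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to a test bucket.

    Hashing the user ID together with the experiment name keeps the split
    effectively random across users, stable for any single user, and
    independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# A returning user always lands in the same bucket for a given experiment:
assert assign_variant("user-42", "upgrade-email") == assign_variant("user-42", "upgrade-email")
```

Because assignment is a pure function of the inputs, you don't need to store who saw which variant to keep the experience consistent across sessions.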
Once you've determined what you want to learn and how confident you need to be in the results, make sure you have enough traffic to generate statistically significant results. While you do need enough traffic to show that your changes, rather than random chance, caused the difference in user engagement, you may not need as much traffic as you think!
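To make "statistically significant" concrete, here is a rough sketch of the standard two-proportion z-test in plain Python (the function name is mine, not from the article; it uses the normal approximation, which assumes reasonably large samples):

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates,
    using the pooled two-proportion z-test (normal approximation)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF via the error function; two-tailed area beyond |z|.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 50/1000 vs 100/1000 conversions: a large, clearly significant difference.
print(two_proportion_p_value(50, 1000, 100, 1000))
# 50/1000 vs 52/1000: far too small a difference to call at this traffic level.
print(two_proportion_p_value(50, 1000, 52, 1000))
```

A p-value below your chosen threshold (commonly 0.05) suggests the difference is unlikely to be a random occurrence.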
How Much Traffic Do You Need For an A/B Test?
The volume of traffic you need depends on how large the differences are between your variants. Say you want to measure the impact of font choice in an email urging customers to upgrade their account; you're probably going to need to send a lot of emails to determine whether it makes a difference because the conversion rates for each variant will be very similar with such a small change.
Now imagine a much more drastic (and useful) A/B test: you could probably get results quickly by comparing a control group (which doesn't get a message at all) versus an email group (which gets a message). If the call-to-action is compelling, your conversion rates will be very different and you won't need much traffic to show a clear winner.
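To put numbers on this intuition, here's a back-of-the-envelope sample-size calculation (the standard normal-approximation formula for comparing two proportions at 95% confidence and 80% power; the helper name and example rates are mine, not from the article):

```python
import math

def sample_size_per_variant(p_a: float, p_b: float,
                            z_alpha: float = 1.96,   # 95% confidence, two-sided
                            z_beta: float = 0.84) -> int:  # 80% power
    """Approximate visitors needed in EACH variant to detect the
    difference between baseline rate p_a and expected rate p_b."""
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    n = (z_alpha + z_beta) ** 2 * variance / (p_a - p_b) ** 2
    return math.ceil(n)

# A subtle change (say, a font tweak moving conversion from 5.0% to 5.5%)
# needs tens of thousands of visitors per variant:
print(sample_size_per_variant(0.05, 0.055))
# A drastic change (message vs. no message, 5% to 10%) needs only a few hundred:
print(sample_size_per_variant(0.05, 0.10))
```

The denominator is the squared difference between the rates, which is why halving the effect size roughly quadruples the traffic you need.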
When you're small and you don't have many people in your product yet, focus on testing major changes, like whether or not to send a message at all in a certain situation or whether it should be an email or push notification. Big changes can reveal results quickly even with few participants. As you grow and gain more traffic to work with, you can fine-tune by testing smaller changes.
What Goal Should You Use for Your A/B Test?
Keep in mind: when you make a change in your product or marketing, many metrics may be affected. Here are two rules for picking what to track:
1) Pick a primary goal that matches the level of effort you're putting into the test. (There's a lot of A/B testing of subject lines in the email space, but if the goal you picked to determine a winner is a higher open rate, even a clear winner may have no effect on the action you actually need users to take deeper in your funnel to justify sending the email.) Remember, too, that once you set a goal, everything is on the table to achieve it. So make sure it's worth achieving!
2) Track secondary metrics that may enhance (or offset) your primary goal. Here's a classic landing page example: you set up a new page that converts very well because you promise the moon to those visitors... but your conversion rate to paid users is super low because they're disappointed: the gap between the promise and the reality put them off. By tracking these secondary metrics, you'll get visibility into how your changes affect the broader product experience.
Three Ways To Get More Out of Your A/B Tests
Optimizely has almost single-handedly driven the rise in A/B testing landing pages and, more recently, product flows. MailChimp and other tools have made it much easier to A/B test email subject lines. These tests can have an impact, but keep in mind that they optimize a relatively small piece of the customer experience (e.g. do they click the "Signup" button; do they open the email).
The real gains from A/B testing come when you tie changes you make in your product experience or messages to actions that are deep in your funnel: the purchases, upgrades and engagement metrics that drive your business.
Whether you have access to a crack data science team or not, A/B testing is a powerful tool for your business at every stage. Three things to remember:
- Broaden your definition of an A/B test: you're going to make a decision based on the information you have, so as you plan your release, decide what you need to learn.
- Learn to adapt your tests to the traffic volume you have by testing dramatic differences (like whether to send a message at all) first.
- Pick goals that really move the needle; that's the only way you'll know if your efforts were worth it.