Guidelines

A/B Testing

Collect reliable information on real users’ behaviour with relatively few resources.

Image by Storyset from Freepik

Test Phases

1. Study Design

2. Conduct Study

3. Study Evaluation

4. Implementation

Study Design
Planning & Recruiting

Time: 3-5 days

Defining the scope and purpose
Deciding on location, equipment, and testing approach
Creating a testing guide with a task list and relevant questions
Reach out with an incentive plan
Scheduling & Screening

Conduct Study

Time: 1-3 days if moderated, several weeks if unmoderated

Prepare materials and set up test environment
Motivating users to share their thoughts & take notes (if moderated)

Study Evaluation

Time: 1-2 days

Clean up the data & enrich with notes
Identify patterns of processes or problems across the testers
Prioritize problems discovered and share report

Implementation

Time: ongoing

Create a repository to track any changes you make to your prototype and the rationale for each change
Check against your defined criteria whether the changes you made had the desired impact

Study Design

Read here about the different options for your study. Your research question should always guide you in choosing the right format, approach, and setting!

Start with your research question

1. In your research question, you define what you are looking for in the A/B test, including what data you need to answer your question:

People’s thoughts and associations
Findability on your website
Usability for doing a specific task

2. Depending on your research question, define exploratory (open-ended) or directed (answer-oriented, often with measurable success) tasks or metrics to measure during the A/B test

Examples

Exploratory task: Use the App to find advice that you could apply on your field.
Directed task: Change the language in which the information is shown to you in the App.
Metric: % of users that click on the redesigned button (when comparing 2 designs)
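
As a rough illustration of the metric example above, the following sketch computes the share of users who clicked the button for each design. The event log, its field order, and the function name `click_rate_by_variant` are hypothetical and only serve the example; a real study would read this data from its analytics tool.

```python
from collections import Counter

# Hypothetical event log: (user_id, variant shown, whether the user clicked the button)
events = [
    ("u1", "A", True),
    ("u2", "A", False),
    ("u3", "B", True),
    ("u4", "B", True),
    ("u5", "A", False),
    ("u6", "B", False),
]


def click_rate_by_variant(events):
    """Return the percentage of users who clicked, keyed by variant."""
    shown = Counter()
    clicked = Counter()
    for _user_id, variant, did_click in events:
        shown[variant] += 1
        if did_click:
            clicked[variant] += 1
    return {variant: 100.0 * clicked[variant] / shown[variant] for variant in shown}


rates = click_rate_by_variant(events)
for variant, rate in sorted(rates.items()):
    print(f"Variant {variant}: {rate:.1f}% of users clicked")
```

With only six users the difference between the two designs says very little; the point is only to show how the metric is derived from raw observations.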

How to implement your study

Ask yourself the following three questions to define the elements of your test:

Why Are You Testing?

Single vs. Multi Variant

What kind of insights are you seeking? Do you want to understand whether variant A or B performs better for the users? Or do you want to understand different elements of your product and their relationship to different headlines, button colors etc.?

Single Variant Testing

Test A against B (both new):
e.g. two text variants or design variants
Test new version against old:
e.g. one new feature variant of your existing product or service

Why it’s useful

Clear feedback on user behavior

Potential challenges

Data easily impacted by outside factors (season etc.)
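
One common way to reduce the influence of outside factors such as season is to show both variants during the same period and split users between them at random. The sketch below is one possible approach, not a prescribed one: it assigns each user to a variant by hashing their ID, so the same user always sees the same variant. The function name `assign_variant` and the experiment label are assumptions made for this example.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "button-redesign") -> str:
    """Assign a user to variant A or B, stable across visits.

    Hashing the experiment label together with the user ID gives a
    roughly even split that does not depend on when the user arrives.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


# Hypothetical user IDs, for illustration only
for user_id in ["user-1", "user-2", "user-3"]:
    print(user_id, "sees variant", assign_variant(user_id))
```

Because both groups are exposed to the same season, campaigns, and news, differences between them are more likely to come from the variants themselves.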

Where Are You Testing?

In Person vs. Remote

Do you need to sit with the users and observe them? Or do you prefer to get behavioral insights from a lot of different, remote testing users?

In Person Testing

You meet your users physically for the testing – e.g. in your office/meeting room or where your users are
Is always moderated testing

Why it’s useful

A moderator can observe and record the user’s body language, gestures, and non-verbal cues

Good for testing with people who have limited internet access or digital skills

Potential challenges

Requires more time, logistics, and budget (including compensation payments)

Timelines need to fit users' availability

How Are You Testing?

Moderated vs. Unmoderated

Do you need a facilitator who guides users through the test and asks follow-up questions? Or should users complete the test on their own, without a moderator?

Moderated Testing

A real person facilitates (moderates) the testing, either physically or virtually
Can be done remotely or in person

Why it’s useful

The moderator can ask individual follow-up questions

The moderator can guide and support users during the testing (e.g. with a complex feature)

The moderator can observe body language and non-verbal cues

Potential challenges

Moderator might introduce bias

More time (logistics) and budget needed

Best Practices

Below are best practices with helpful tips and tricks.

Do’s and Don’ts

Do’s

Create a strong hypothesis to test and a goal to achieve with the test: how will you know that one variant worked better than the other?
Define the threshold of statistical significance at which the results are meaningful for you (a significance calculator can help you calculate it)
Control other variables that might affect the validity of your results (e.g. seasonal variability)
Be ready to accept that your test failed and that your new idea does not solve your problem. If the result is "no difference", you can use the version you prefer.
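
To make the idea of a significance threshold concrete, here is a minimal sketch of a two-proportion z-test, one common way to check whether the difference in click rates between two variants is larger than chance would explain. The function name, the threshold `ALPHA = 0.05`, and the click counts are assumptions chosen for illustration; your own threshold and test may differ.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # assumed significance threshold, defined before the test starts


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click rates.

    Assumes at least one click and one non-click overall, so the
    standard error is not zero.
    """
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: 100 of 1000 users clicked in A, 150 of 1000 in B
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
if p_value < ALPHA:
    print(f"Difference is significant at {ALPHA} (z = {z:.2f})")
else:
    print(f"No significant difference at {ALPHA} (z = {z:.2f})")
```

If the p-value stays above the threshold, the result is "no difference", and the choice between the variants can be made on other grounds.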

Don’ts

Don't overuse this method. You can rely on expertise to identify which ideas are worth testing. Before A/B testing, do some observation and user interviews to identify crucial bottlenecks in your product and the best entry points for improvement. Then design alternatives and test them.
Don't run A/B/C tests. Test different parts or hypotheses one after another (run multi-variant tests as separate A/B and A/C tests).
Don't abort the test before the necessary data is in because you think you see a tendency. Also, don't leave the test running forever to force a positive result.
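
One way to know when "the necessary data is in" is to estimate the required number of users per variant before the test starts and to stop only once that number is reached. The sketch below uses the standard normal-approximation formula for comparing two proportions. The function name and the default values for `alpha` (significance threshold) and `power` (chance of detecting a real effect) are assumptions for this example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, expected, alpha=0.05, power=0.8):
    """Estimate users needed per variant to detect baseline vs. expected rate.

    baseline: current click rate (e.g. 0.10 for 10 %)
    expected: click rate you hope the new variant reaches
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    return ceil((z_alpha + z_power) ** 2 * variance / (baseline - expected) ** 2)


# Hypothetical rates: 10 % today, hoping for 15 % with the new design
needed = sample_size_per_variant(0.10, 0.15)
print(f"Collect data from about {needed} users per variant before evaluating")
```

Smaller expected differences need many more users, which is one reason to test only the ideas that are likely to matter.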

Potential Bias To Be Aware Of

A detailed overview of potential biases with countermeasures is available in the bias overview. Below is a list of potential biases to be aware of when conducting A/B tests.

The Recency Effect

People tend to give more weight to their most recent experiences. They form new opinions biased towards the latest news, e.g. by focusing only on the problems found in the latest usability session

Anchoring Bias

When people make decisions, they tend to rely too heavily on one piece of information or a trait that already exists. A famous example is attributed to Henry Ford: “If I had asked people what they wanted, they would have said faster horses.”

Social Desirability / Friendliness Bias

People tend to make more “socially acceptable” decisions when they are around other people. The same holds true for interviews: people want to make you feel good and will answer what they think you find pleasant and acceptable.

The Hawthorne Effect

The very act of being observed can cause participants to change their behavior. The quality of observational data is heavily impacted by this.

Reading Recommendation

6 Essential Tips for Effective A/B Testing by Adobe XD

Define Stronger A/B Test Variations Through UX Research by NN Group

A/B Testing in UX Design: When and Why It’s Worth It by Adobe XD

Overview and ideas on what to test in A/B test by UX Design Institute

Overview and ideas on what to test in A/B test by Userzoom

Moderated vs. Unmoderated Tests by User Testing University
