Guidelines

Usability Testing

The gold standard of user research: identify problems, compare interfaces, and learn about the target users’ behavior and preferences.

Image by Storyset from Freepik

Test Phases

Planning

Time: 3-5 days

  • Defining the scope and purpose
  • Deciding on location, equipment, and testing approach
  • Creating a testing guide with a task list and relevant questions
  • Reaching out with an incentive plan
  • Scheduling and screening

Testing

Time: 1-3 days if moderated, several weeks if unmoderated

  • Preparing materials and setting up the test environment
  • Motivating users to share their thoughts and taking notes (if moderated)

Analyzing

Time: 1-2 days

  • Cleaning up the data and enriching it with notes
  • Identifying patterns of processes or problems across the testers
  • Prioritizing the problems discovered and sharing a report

Implementation

Time: ...

  • Creating a repository to track any changes you make to your prototype and the rationale for each change


Image by Aline Weinsheimer: Task Analysis
Image by Charlotte Schumann: Task Analysis

Usability Testing Types

Usability Testing is an umbrella term for a range of different approaches. ​

Start with your research question

1. In your research question, you define what you are looking for in the usability test, including what data you need to answer your question:

  • Peoples’ thoughts and associations
  • Findability on your website​
  • Usability for completing a specific task

2. Depending on your research question, define exploratory (open-ended) or directed (answer-oriented, often with measurable success) tasks or metrics to measure during the test.

Examples

  • Exploratory task: Use the App to find advice that you could apply on your field.​
  • Directed task: Change the language in which the information is shown to you in the App.​
  • Metric: % of users that click on the redesigned button (when comparing 2 designs)
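A metric like the button-click percentage above can be computed directly from test logs. A minimal sketch in Python; the element IDs and session data are made up for illustration:

```python
# Minimal sketch: compare the share of users who clicked a button
# across two design variants. Element IDs and data are hypothetical.

def click_rate(sessions, button_id):
    """Share of sessions in which the given button was clicked."""
    clicks = sum(1 for events in sessions if button_id in events)
    return clicks / len(sessions)

# Each session is the set of element IDs the user clicked.
variant_a = [{"nav", "signup_old"}, {"nav"}, {"signup_old"}, {"nav"}]
variant_b = [{"nav", "signup_new"}, {"signup_new"}, {"nav", "signup_new"}, {"nav"}]

rate_a = click_rate(variant_a, "signup_old")  # 2 of 4 sessions
rate_b = click_rate(variant_b, "signup_new")  # 3 of 4 sessions
print(f"Variant A: {rate_a:.0%}, Variant B: {rate_b:.0%}")
```

With only a handful of testers per variant such percentages are indicative, not statistically conclusive.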

Go To Research Planning Module

Research questions and example tasks

  • Research question: What are usability issues with our IVR registration process?
    Example task (directed): You want to register with the new farmer hotline. Try to set up your profile using your phone.

  • Research question: Do users find the diagnosis decision tree on our site? Is the information engaging?
    Example task (exploratory): See if you can find actionable information on symptoms you recently found on your crop on this website.

  • Research question: Do people realize that we have a language switch button? If not, do they find the settings?
    Example task (directed): Try to change the language of the App.

Interesting follow-up questions during the test (in addition to asking participants to think out loud):

  • “Do you have any comments about this activity?”​
  • “Is there anything that you found especially difficult or very easy to do?​”

How to implement your study

Ask yourself the following three questions to define the elements of your test:

Why Are You Testing?

Qualitative vs. Quantitative Usability Testing

What kind of insights are you seeking? Do you want to understand usability issues by discussing with and observing a small sample of users (qualitative measures)? Or do you want to know how many people manage to navigate through your service or complete a predefined task (quantitative measures)? Usually both approaches are combined for a thorough understanding.

Qualitative Testing

  • Recording user narratives of using the product with a non-representative sample
  • Usually moderated, in-person or remote testing
  • Analysis of the data aims at minimizing bias (see below)
  • Results are narrative or descriptive and inform quantitative results with the "why"


Why it’s useful

Understand motivations for usage patterns​

 

Potential challenges

Danger of sample bias: insights cannot be generalized​

Preparation takes quite some effort​

Within-subject

  • The same participants test all existing variants of your product or service​

    e.g. for website evaluations: users test all the different website versions in randomized order
    e.g. for assessing the learning curve of users

Why it’s useful

Individual opinions, moods, etc. will not affect your results

More data points with fewer participants

 

Potential challenges

Longer sessions

Where Are You Testing?

In Person vs. Remote

Do you need to sit with the users and observe them? Or do you prefer to get behavioral insights from many different users testing remotely?

In Person Testing

  • You meet your users physically for the testing – e.g. in your office/meeting room or where your users are​
  • In-person testing is always moderated

Why it’s useful

A moderator can observe and record the user’s body language, gestures, and non-verbal cues

Good for testing with people who have limited internet access or digital skills

Potential challenges

Requires more time, logistics, and budget (including compensation payments)

Timelines need to match users' availability

How Are You Testing?

Moderated vs. Unmoderated

In remote testing, do you want to moderate the testing – asking follow up questions, explaining the tasks? Or do you provide an easy testing setting that users everywhere can test whenever convenient?​

Moderated Testing

  • A real person facilitates (moderates) the testing, either physically or virtually​
  • Can be done remotely or in person​

Why it’s useful

The moderator can ask individual follow-up questions

The moderator can guide and support users during the testing (e.g. with a complex feature)

The moderator can observe body language and non-verbal cues

Potential challenges

Moderator might introduce bias

More time (logistics) and budget needed

Best Practices

Below you find best-practice examples with helpful tips and tricks.

Examples for common study types

You can of course always apply a mix, e.g. starting with qualitative, moderated tests early in your prototype development and then testing the navigation and design usability, of which you are by then more certain, with a larger number of people.

A combination could be either a remote, unmoderated follow-up to your moderated in-person test to quantify a specific usability issue you found, or, the other way round, a moderated deep dive into repeated errors people made in unmoderated remote tests to understand why they happen.

Moderated In-person testing

To choose when:

You need to learn about what problems people experience and why

You are looking for qualitative data

Use case examples

You want to test a physical product or a design that needs some explanation. You want to understand what associations users might have.

You want to see and capture people's facial expressions and gestures and get their direct feedback on two versions of a high-stakes feature (critical for your innovation).

Testing methods
Card Sorting
Focus Groups
Field Observation

Unmoderated remote testing

To choose when:

You are looking for the answers to "how many?" or "how long?"

You are looking for quantitative data

Use case examples

You want to test your prototype in the users' natural use context and want to measure indicators like: time on task, completion rate for tasks etc.
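Indicators such as time on task and completion rate can be derived from session logs with a few lines of code. A minimal sketch in Python; the field names and numbers are hypothetical:

```python
# Minimal sketch: compute completion rate and time on task from
# unmoderated remote test logs. Field names and data are hypothetical.

from statistics import mean

sessions = [
    {"task": "register", "seconds": 95, "completed": True},
    {"task": "register", "seconds": 210, "completed": False},
    {"task": "register", "seconds": 130, "completed": True},
    {"task": "register", "seconds": 88, "completed": True},
]

# Booleans count as 0/1, so the mean is the completion rate.
completion_rate = mean(s["completed"] for s in sessions)

# Average time on task, counting only sessions that completed the task.
avg_time_completed = mean(s["seconds"] for s in sessions if s["completed"])

print(f"Completion rate: {completion_rate:.0%}")
print(f"Avg. time on task (completed only): {avg_time_completed:.0f} s")
```

Whether to include failed sessions in time-on-task averages is a design choice; reporting both figures separately avoids hiding abandonment behind a fast average.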

You have a larger userbase already and want to test your website with a wide range of users, who might live far away.

You want to run several tests simultaneously.

Testing methods
Heuristic Evaluation
Tree Testing

Moderated remote testing

To choose when:

You (or your observers) cannot afford to travel or make your users come to you

You are looking for mostly qualitative data (or have budget to do a lot of moderated tests)

Use case examples

Your newly developed innovation works with data that might be sensitive or confidential. Your user base has good access to the internet.

You want to collect qualitative data or follow up on exploratory tasks with some interview questions. Your ideal test participants are geographically very diverse.

Testing methods
Task Analysis

Testing methods applied

Image by Aline Weinsheimer: Card Sorting in Tanzania
Image by Irma Ayes: Card Sorting Exercise
Image by Charlotte Schumann: User Testing in Rwanda
Image by Charlotte Schumann: Stats Unmoderated Test

Do’s and Don’ts

Do’s

  • Test your test with a team member prior to asking potential users to spend their time giving you feedback.
  • Recruiting: For a qualitative usability test, 5-10 testers will probably generate enough data. Invite a few more since usually some will not show up.
  • If your userbase is diverse, you might create sub-tests, e.g. for different age categories, to get specific data and build inclusive products.
  • Compensation for your testers is good practice and will get you better show rates as well as better results.
  • Consider a non-disclosure agreement and consent form to get approval for recording and processing the data from the test.
  • During the test: Try to motivate your users to think out loud during the testing, e.g. with the methods described here.
  • For prioritization of the issues you discover, you can use severity scales or prioritization matrixes.
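The severity-scale idea in the last bullet can be turned into a simple scoring script that ranks issues by severity times frequency. A minimal sketch in Python; the issues and scores are invented for illustration:

```python
# Minimal sketch: prioritize usability issues with a simple
# severity x frequency matrix. Issues and scores are hypothetical.

issues = [
    # (issue, severity on a 1-4 scale, affected testers out of 5)
    ("Language switch not found", 3, 4),
    ("Typo on welcome screen", 1, 5),
    ("Registration form crashes", 4, 2),
]

# Rank by severity times frequency: higher score = fix first.
ranked = sorted(issues, key=lambda i: i[1] * i[2], reverse=True)
for name, severity, freq in ranked:
    print(f"score {severity * freq:>2}: {name}")
```

Multiplying severity by frequency is only one possible weighting; a team may instead rank blockers above any number of cosmetic issues regardless of score.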

Don’ts

  • Don't test your innovation just before publication to validate it. Even digital innovations can already be tested with paper prototypes before they are programmed.
  • Don't try to test all of your product's functions at once. Organize each usability test around a specific question.
  • Don't try to cover all roles by yourself or with one person. A moderator can't take notes and moderate at the same time. Some services (e.g. MS Teams) offer auto-transcription services.
  • Timing: Don’t overuse your testers' patience. 60 minutes is a standard duration for testing sessions. Plan for breaks and refreshments and consider rather several shorter sessions than one long test.

Potential Bias To Be Aware Of

Find a detailed overview of potential biases with counter-actions here. Below is a list of potential biases to be aware of when running and analyzing usability tests.


Confirmation Bias

People tend to give more weight to evidence that confirms their assumptions and to discount data and opinions that don’t support those assumptions. ​


The Recency Effect

People tend to give more weight to their most recent experiences. They form new opinions biased towards the latest news, e.g. by focusing only on the problems found in the latest usability session.


Anchoring Bias​

When people make decisions, they tend to rely too heavily on one piece of information, such as a trait that already exists. A famous example is from Henry Ford: “If I had asked people what they wanted, they would have said faster horses.”


Social Desirability / Friendliness Bias

People tend to make more “socially acceptable” decisions when they are around other people. The same holds true for interviews: people want to make you feel good and will answer what they think you find pleasant and acceptable.


The Hawthorne Effect​

The very act of being observed can cause participants to change their behavior, which heavily impacts the quality of observational data.


Attribution Error​

The tendency of people to overemphasize personal characteristics and ignore situational factors when judging others’ (or their own) behavior. For example, a user thinks they made a mistake, when in fact good user experience shouldn’t “make you think” but should help you get things done!

Reading Recommendation

Checklist for Moderating a Usability Test​ by NNGroup

Checklist for Planning a Usability Test​ by NNGroup

Write better qualitative usability tasks: Top 10 mistakes to avoid​ by NNGroup

Quantitative vs. Qualitative Usability Testing​ by NNGroup

Comparing Between and Within Subjects Studies​ by MeasuringU

How to analyze and report usability test results​ by Maze

Remote Testing Providers