Guidelines

Usability Testing

The gold standard of user research: identify problems, compare interfaces, and learn about the target users’ behavior and preferences.

Image by Storyset from Freepik

Test Phases

Planning

Time: 3-5 days

  • Defining the scope and purpose
  • Deciding on location, equipment, and testing approach
  • Creating a testing guide with a task list and relevant questions
  • Reaching out with an incentive plan
  • Scheduling and screening

Testing

Time: 1-3 days if moderated, several weeks if unmoderated

  • Preparing materials and setting up the test environment
  • Motivating users to share their thoughts and taking notes (if moderated)

Analyzing

Time: 1-2 days

  • Cleaning up the data and enriching it with notes
  • Identifying patterns of processes or problems across the testers
  • Prioritizing the problems discovered and sharing a report

Implementation

Time: ...

  • Creating a repository to track any changes you make to your prototype and the rationale for each change


Image by Aline Weinsheimer: Task Analysis
Image by Charlotte Schumann: Task Analysis

Usability Testing Types

Usability Testing is an umbrella term for a range of different approaches. ​

Start with your research question

1. In your research question, you define what you are looking for in the usability test, including what data you need to answer your question:

  • Peoples’ thoughts and associations
  • Findability on your website​
  • Usability for completing a specific task

2. Depending on your research question, define exploratory (open-ended) or directed (answer-oriented, often with measurable success) tasks or metrics to measure during the test.

Examples

  • Exploratory task: Use the App to find advice that you could apply on your field.​
  • Directed task: Change the language in which the information is shown to you in the App.​
  • Metric: % of users that click on the redesigned button (when comparing 2 designs)
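A metric like the button-click percentage above can be computed directly from test logs. A minimal sketch in Python; the element IDs and session data are made up for illustration:

```python
# Minimal sketch: compare the share of users who clicked a button
# across two design variants. Element IDs and data are hypothetical.

def click_rate(sessions, button_id):
    """Share of sessions in which the given button was clicked."""
    clicks = sum(1 for events in sessions if button_id in events)
    return clicks / len(sessions)

# Each session is the set of element IDs the user clicked.
variant_a = [{"nav", "signup_old"}, {"nav"}, {"signup_old"}, {"nav"}]
variant_b = [{"nav", "signup_new"}, {"signup_new"}, {"nav", "signup_new"}, {"nav"}]

rate_a = click_rate(variant_a, "signup_old")  # 2 of 4 sessions
rate_b = click_rate(variant_b, "signup_new")  # 3 of 4 sessions
print(f"Variant A: {rate_a:.0%}, Variant B: {rate_b:.0%}")
```

With only a handful of testers per variant such percentages are indicative, not statistically conclusive.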

Go To Research Planning Module

Research questions and example tasks

  • Research question: What are usability issues with our IVR registration process?
    Example task (directed): You want to register with the new farmer hotline. Try to set up your profile using your phone.

  • Research question: Do users find the diagnosis decision tree on our site? Is the information engaging?
    Example task (exploratory): See if you can find actionable information on symptoms you recently found on your crop on this website.

  • Research question: Do people realize that we have a language switch button? If not, do they find the settings?
    Example task (directed): Try to change the language of the App.

Interesting follow-up questions during the test (in addition to asking participants to think out loud):

  • “Do you have any comments about this activity?”​
  • “Is there anything that you found especially difficult or very easy to do?​”

How to implement your study

Ask yourself the following three questions to define the elements of your test:

Why Are You Testing?

Qualitative vs. Quantitative Usability Testing

What kind of insights are you seeking? Do you want to understand usability issues by discussing with and observing a small sample of users (qualitative measures)? Or do you want to know how many people manage to navigate through your service or complete a predefined task (quantitative measures)? Usually both approaches are combined for a thorough understanding.

Qualitative Testing

  • Recording user narratives of using the product with a non-representative sample
  • Usually moderated, in-person or remote testing
  • Analysis of the data aims at minimizing bias (see below)
  • Results are narrative or descriptive and inform quantitative results with the "why"


Why it’s useful

Understand motivations for usage patterns​

 

Potential challenges

Danger of sample bias: insights cannot be generalized​

Preparation takes quite some effort​

Within-subject

  • The same participants test all existing variants of your product or service​

    e.g. for website evaluations: users test all the different website versions in randomized order
    e.g. for assessing the learning curve of users

Why it’s useful

Individual opinions, moods, etc. will not affect your results

More data points with fewer participants

 

Potential challenges

Longer sessions

Where Are You Testing?

In Person vs. Remote

Do you need to sit with the users and observe them? Or do you prefer to get behavioral insights from many different users testing remotely?

In Person Testing

  • You meet your users physically for the testing – e.g. in your office/meeting room or where your users are​
  • In-person testing is always moderated

Why it’s useful

A moderator can observe and record the user’s body language, gestures, and non-verbal cues

Good for testing with people who have limited internet access or digital skills

Potential challenges

Requires more time, logistics, and budget (including compensation payments)

Timelines need to match users' availability

How Are You Testing?

Moderated vs. Unmoderated

In remote testing, do you want to moderate the testing – asking follow up questions, explaining the tasks? Or do you provide an easy testing setting that users everywhere can test whenever convenient?​

Moderated Testing

  • A real person facilitates (moderates) the testing, either physically or virtually​
  • Can be done remotely or in person​

Why it’s useful

The moderator can ask individual follow-up questions

The moderator can guide and support users during the testing (e.g. with a complex feature)

The moderator can observe body language and non-verbal cues

Potential challenges

Moderator might introduce bias

More time (logistics) and budget needed

Best Practices

Below you find best-practice examples with helpful tips and tricks.

Examples for common study types

You can of course always apply a mix, e.g. starting with qualitative, moderated tests early in your prototype development and then testing the navigation and design usability, of which you are by then more certain, with a larger number of people.

A combination could be either a remote, unmoderated follow-up to your moderated in-person test to quantify a specific usability issue you found, or, the other way round, a moderated deep dive into repeated errors people made in unmoderated remote tests to understand why they happen.

Moderated In-person testing

To choose when:

You need to learn about what problems people experience and why

You are looking for qualitative data

Use case examples

You want to test a physical product or a design that needs some explanation. You want to understand what associations users might have.

You want to see and capture people's facial expressions and gestures and get their direct feedback on two versions of a high-stakes feature (critical for your innovation).

Testing methods
Card Sorting
Focus Groups
Field Observation

Unmoderated remote testing

To choose when:

You are looking for the answers to "how many?" or "how long?"

You are looking for quantitative data

Use case examples

You want to test your prototype in the users' natural use context and want to measure indicators like: time on task, completion rate for tasks etc.
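Indicators such as time on task and completion rate can be derived from session logs with a few lines of code. A minimal sketch in Python; the field names and numbers are hypothetical:

```python
# Minimal sketch: compute completion rate and time on task from
# unmoderated remote test logs. Field names and data are hypothetical.

from statistics import mean

sessions = [
    {"task": "register", "seconds": 95, "completed": True},
    {"task": "register", "seconds": 210, "completed": False},
    {"task": "register", "seconds": 130, "completed": True},
    {"task": "register", "seconds": 88, "completed": True},
]

# Booleans count as 0/1, so the mean is the completion rate.
completion_rate = mean(s["completed"] for s in sessions)

# Average time on task, counting only sessions that completed the task.
avg_time_completed = mean(s["seconds"] for s in sessions if s["completed"])

print(f"Completion rate: {completion_rate:.0%}")
print(f"Avg. time on task (completed only): {avg_time_completed:.0f} s")
```

Whether to include failed sessions in time-on-task averages is a design choice; reporting both figures separately avoids hiding abandonment behind a fast average.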

You have a larger userbase already and want to test your website with a wide range of users, who might live far away.

You want to run several tests simultaneously.

Testing methods
Heuristic Evaluation
Tree Testing

Moderated remote testing

To choose when:

You (or your observers) cannot afford to travel or make your users come to you

You are looking for mostly qualitative data (or have budget to do a lot of moderated tests)

Use case examples

Your newly developed innovation works with data that might be sensitive or confidential. Your user base has good access to the internet.

You want to collect qualitative data or follow up on exploratory tasks with some interview questions. Your ideal test participants are geographically very diverse.

Testing methods
Task Analysis

Testing methods applied

Image by Aline Weinsheimer: Card Sorting in Tanzania
Image by Irma Ayes: Card Sorting Exercise
Image by Charlotte Schumann: User Testing in Rwanda
Image by Charlotte Schumann: Stats Unmoderated Test

Do’s and Don’ts

Do’s

  • Test your test with a team member prior to asking potential users to spend their time giving you feedback.
  • Recruiting: For a qualitative usability test, 5-10 testers will probably generate enough data. Invite a few more since usually some will not show up.
  • If your userbase is diverse, you might create sub-tests, e.g. for different age categories, to get specific data and build inclusive products.
  • Compensation for your testers is good practice and will get you better show rates as well as better results.
  • Consider a non-disclosure agreement and consent form to get approval for recording and processing the data from the test.
  • During the test: Try to motivate your users to think out loud during the testing, e.g. with the methods described here.
  • For prioritization of the issues you discover, you can use severity scales or prioritization matrixes.
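The severity-scale idea in the last bullet can be turned into a simple scoring script that ranks issues by severity times frequency. A minimal sketch in Python; the issues and scores are invented for illustration:

```python
# Minimal sketch: prioritize usability issues with a simple
# severity x frequency matrix. Issues and scores are hypothetical.

issues = [
    # (issue, severity on a 1-4 scale, affected testers out of 5)
    ("Language switch not found", 3, 4),
    ("Typo on welcome screen", 1, 5),
    ("Registration form crashes", 4, 2),
]

# Rank by severity times frequency: higher score = fix first.
ranked = sorted(issues, key=lambda i: i[1] * i[2], reverse=True)
for name, severity, freq in ranked:
    print(f"score {severity * freq:>2}: {name}")
```

Multiplying severity by frequency is only one possible weighting; a team may instead rank blockers above any number of cosmetic issues regardless of score.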

Don’ts

  • Don't test your innovation just before publication to validate it. Even digital innovations can already be tested with paper prototypes before they are programmed.
  • Don't try to test all of your product's functions at once. Organize each usability test around a specific question.
  • Don't try to cover all roles by yourself or with one person. A moderator can't take notes and moderate at the same time. Some services (e.g. MS Teams) offer auto-transcription services.
  • Timing: Don’t overuse your testers' patience. 60 minutes is a standard duration for testing sessions. Plan for breaks and refreshments and consider rather several shorter sessions than one long test.

Potential Bias To Be Aware Of

Find a detailed overview of potential biases with counter-actions here. Below is a list of potential biases to be aware of when running and analyzing usability tests.


Confirmation Bias

People tend to give more weight to evidence that confirms their assumptions and to discount data and opinions that don’t support those assumptions. ​


The Recency Effect

People tend to give more weight to their most recent experiences. They form new opinions biased towards the latest news, e.g. by focusing only on the problems found in the latest usability session.


Anchoring Bias​

When people make decisions, they tend to rely too heavily on one piece of information, such as a trait that already exists. A famous example is from Henry Ford: “If I had asked people what they wanted, they would have said faster horses.”


Social Desirability / Friendliness Bias

People tend to make more “socially acceptable” decisions when they are around other people. The same holds true for interviews: people want to make you feel good and will answer what they think you find pleasant and acceptable.


The Hawthorne Effect​

The very act of being observed can cause participants to change their behavior, which heavily impacts the quality of observational data.


Attribution Error​

The tendency of people to overemphasize personal characteristics and ignore situational factors when judging others’ (or their own) behavior. For example, a user thinks they made a mistake, when in fact good user experience shouldn’t “make you think” but should help you get things done!

Reading Recommendation

Checklist for Moderating a Usability Test​ by NNGroup

Checklist for Planning a Usability Test​ by NNGroup

Write better qualitative usability tasks: Top 10 mistakes to avoid​ by NNGroup

Quantitative vs. Qualitative Usability Testing​ by NNGroup

Comparing Between and Within Subjects Studies​ by MeasuringU

How to analyze and report usability test results​ by Maze

Remote Testing Providers