Guidelines
Usability Testing
Image by Storyset from Freepik
Test Phases
Planning
Time: 3-5 days
- Defining the scope and purpose
- Deciding on location, equipment, and testing approach
- Creating a testing guide with a task list and relevant questions
- Reaching out to participants with an incentive plan
- Scheduling & Screening
Testing
Time: 1-3 days if moderated, several weeks if unmoderated
- Prepare materials and set up the test environment
- Motivate users to share their thoughts and take notes (if moderated)
Analyzing
Time: 1-2 days
- Clean up the data and enrich it with notes
- Identify patterns of processes or problems across the testers
- Prioritize the problems discovered and share the report
Implementation
Time: ...
- Create a repository to track the changes you make to your prototype and the rationale for each change
Usability Testing Types
Start with your research question
1. In your research question, you define what you are looking for in the usability test, including what data you need to answer your question:
- People’s thoughts and associations
- Findability on your website
- Usability for doing a specific task
2. Depending on your research question, define exploratory (open-ended) or directed (answer-oriented, often with measurable success) tasks or metrics to measure during the usability test
Examples
- Exploratory task: Use the App to find advice that you could apply on your field.
- Directed task: Change the language in which the information is shown to you in the App.
- Metric: % of users that click on the redesigned button (when comparing 2 designs)
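The metric example above can be made concrete with a short calculation. The following is a minimal sketch, assuming you have counted how many users saw each design and how many clicked the button; the function name and all numbers are hypothetical and only for illustration.

```python
def click_rate(clicked: int, shown: int) -> float:
    """Percentage of users who clicked, out of all users shown the design."""
    if shown <= 0:
        raise ValueError("shown must be positive")
    return 100.0 * clicked / shown


# Hypothetical counts for two designs (not real study data)
original = click_rate(clicked=12, shown=80)
redesign = click_rate(clicked=20, shown=80)

print(f"Original button: {original:.1f}% of users clicked")
print(f"Redesigned button: {redesign:.1f}% of users clicked")
```

With small samples, a difference between two percentages can easily be due to chance, so treat such numbers as an indication rather than proof.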
Research question: What are the usability issues with our IVR registration process?
Example task: (Directed) You want to register with the new farmer hotline. Try to set up your profile using your phone.
Research question: Do users find the diagnosis decision tree on our site? Is the information engaging?
Example task: (Exploratory) See if you can find actionable information on symptoms you recently found on your crop on this website.
Research question: Do people realize that we have a language switch button? If not, do they find the settings?
Example task: (Directed) Try to change the language of the App.
Useful follow-up questions during the test (in addition to asking participants to think out loud):
- “Do you have any comments about this activity?”
- “Is there anything that you found especially difficult or very easy to do?”
How to implement your study
Ask yourself the following three questions to define the elements of your test:
Why Are You Testing?
Qualitative vs. Quantitative Usability Testing
What kind of insights are you seeking? Do you want to understand usability issues by talking with and observing a small sample of users (qualitative measures)? Or do you want to know how many people manage to navigate through your service or complete a predefined task (quantitative measures)? Usually both approaches are combined for a thorough understanding.
Qualitative Testing
- Recording user narratives of using the product with a non-representative sample
- Usually moderated, in-person or remote testing
- Analysis of data aims at minimizing bias (see below)
- Results are narrative or descriptive and inform quantitative results with the "why"
Why it’s useful
Understand motivations for usage patterns
Potential challenges
Danger of sample bias: insights cannot be generalized
Preparation takes quite some effort
Within-subject
- The same participants test all existing variants of your product or service
e.g. for website evaluations: users test all features of the different websites in randomized order
e.g. for assessing the learning curve of users
Why it’s useful
individual opinions/moods etc. will not affect your results
more data points with fewer participants
Potential challenges
longer sessions
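The randomized order mentioned for within-subject tests can be prepared before the sessions. Below is a minimal sketch, assuming each participant should see every variant once in an independently shuffled order; the function name, participant IDs, and variant names are hypothetical.

```python
import random


def assign_orders(participants, variants, seed=None):
    """Give every participant all variants, each in an independently shuffled order."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    orders = {}
    for participant in participants:
        order = list(variants)  # copy so the original list stays untouched
        rng.shuffle(order)
        orders[participant] = order
    return orders


# Hypothetical participants and variants, for illustration only
plan = assign_orders(["P1", "P2", "P3"], ["Design A", "Design B", "Design C"], seed=42)
for participant, order in plan.items():
    print(participant, "->", ", ".join(order))
```

Shuffling per participant reduces the chance that one variant always benefits from being seen first or last. Simple shuffling does not guarantee that every order is used equally often; that would need a balanced scheme instead.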
Where Are You Testing?
Do you need to sit with the users and observe them? Or do you prefer to get behavioral insights from many different remote testers?
In Person Testing
- You meet your users physically for the testing – e.g. in your office/meeting room or where your users are
- Is always moderated testing
Why it’s useful
A moderator can observe and record the user’s body language, gestures, and non-verbal cues
good for testing with people who have limited internet access or digital skills
Potential challenges
requires more time, logistics and budget (also for compensation payments)
timelines need to meet users' availability
How Are You Testing?
In remote testing, do you want to moderate the test by explaining the tasks and asking follow-up questions? Or do you want to provide a simple test setup that users anywhere can complete whenever it is convenient for them?
Moderated Testing
- A real person facilitates (moderates) the testing, either physically or virtually
- Can be done remotely or in person
Why it’s useful
The moderator can ask individual follow-up questions
The moderator can guide and support users during the testing (e.g. with a complex feature)
The moderator can observe body language and non-verbal cues
Potential challenges
Moderator might introduce bias
More time (logistics) and budget needed
Best Practices
Here you will find best-practice examples with helpful tips and tricks.
Examples for common study types
You can of course always apply a mix, e.g. starting with qualitative, moderated tests early in your prototype development and then testing the navigation and design elements you are more certain about with a larger number of people.
A combination could be either a remote, unmoderated follow-up to your moderated in-person test to quantify a specific usability issue you found or, the other way round, a moderated deep-dive into repeated errors people made in unmoderated remote tests to understand why they happen.
Moderated In-person testing
When to choose:
You need to learn about what problems people experience and why
You are looking for qualitative data
Use case examples
You want to test a physical product or a design that needs some explanation. You want to understand what associations users might have.
You want to see and capture people's facial expressions and gestures and get their direct feedback on two versions of a high-stakes feature (critical for your innovation).
Testing methods
Card Sorting
Focus Groups
Field Observation
Unmoderated remote testing
When to choose:
You are looking for the answers to "how many?" or "how long?"
You are looking for quantitative data
Use case examples
You want to test your prototype in the users' natural use context and want to measure indicators like: time on task, completion rate for tasks etc.
You have a larger userbase already and want to test your website with a wide range of users, who might live far away.
You want to run several tests simultaneously
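Indicators such as completion rate and time on task, as mentioned in the use cases above, can be computed from simple session records. The following is a minimal sketch, assuming each session records whether the task was completed and how long it took; the field names and values are hypothetical, and remote testing tools typically export this data in their own formats.

```python
from statistics import median

# Hypothetical session records for one task (not real study data)
sessions = [
    {"tester": "T1", "completed": True, "seconds": 48.0},
    {"tester": "T2", "completed": False, "seconds": 120.0},
    {"tester": "T3", "completed": True, "seconds": 62.0},
    {"tester": "T4", "completed": True, "seconds": 55.0},
]


def completion_rate(records):
    """Percentage of sessions in which the task was completed."""
    return 100.0 * sum(r["completed"] for r in records) / len(records)


def median_time_on_task(records):
    """Median duration in seconds, counting only completed sessions."""
    times = [r["seconds"] for r in records if r["completed"]]
    return median(times) if times else None


print(f"Completion rate: {completion_rate(sessions):.0f}%")
print(f"Median time on task: {median_time_on_task(sessions)} s")
```

The median is used here instead of the mean because a single very slow session would otherwise distort the result. Whether to include failed attempts in the timing is a choice you should make and report explicitly.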
Testing methods
Heuristic Evaluation
Tree Testing
Moderated remote testing
When to choose:
You (or your observers) cannot afford to travel or make your users come to you
You are looking for mostly qualitative data (or have budget to do a lot of moderated tests)
Use case examples
Your newly developed innovation works with data that might be sensitive or confidential. Your user base has good access to the internet.
You want to collect qualitative data or follow up on exploratory tasks with some interview questions. Your ideal test participants are geographically very diverse.
Testing methods
Task Analysis
Testing methods applied
Do’s and Don’ts
Do’s
- Test your test with a team member prior to asking potential users to spend their time giving you feedback.
- Recruiting: For a qualitative usability test, 5-10 testers will probably generate enough data. Invite a few more since usually some will not show up.
- If your user base is diverse, you might create sub-tests, e.g. for different age categories, to get specific data and build inclusive products.
- Compensation for your testers is good practice and will get you better show rates as well as better results.
- Consider a non-disclosure agreement and consent form to get approval for recording and processing the data from the test.
- During the test: Try to motivate your users to think out loud during the testing, e.g. with the methods described here.
- For prioritization of the issues you discover, you can use severity scales or prioritization matrixes.
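To illustrate the prioritization step, the sketch below ranks issues by combining a severity rating with how many testers were affected. The 1-4 severity scale, the scoring formula, and the example issues are all assumptions for illustration; use whichever severity scale or matrix your team has agreed on.

```python
# Hypothetical issues from a test with 5 testers.
# Severity uses an assumed scale: 1 = cosmetic ... 4 = blocks the task.
TESTERS = 5
issues = [
    {"issue": "Language switch not found", "severity": 3, "affected": 4},
    {"issue": "Registration step unclear", "severity": 4, "affected": 2},
    {"issue": "Typo on start screen", "severity": 1, "affected": 5},
]


def priority(issue, testers):
    """Severity weighted by the share of testers who hit the issue."""
    return issue["severity"] * issue["affected"] / testers


# Highest score first
ranked = sorted(issues, key=lambda i: priority(i, TESTERS), reverse=True)
for item in ranked:
    print(f"{priority(item, TESTERS):.1f}  {item['issue']}")
```

A single score is only a starting point for discussion; a blocking issue seen by few testers may still deserve attention before a cosmetic issue seen by everyone.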
Don’ts
- Don't wait until just before publication to test your innovation merely to validate it. Even digital innovations can be tested with paper prototypes before they are programmed.
- Don't try to test all of your product's functions at once. Organize each usability test around a specific question.
- Don't try to cover all roles by yourself or with one person. A moderator can't take notes and moderate at the same time. Some services (e.g. MS Teams) offer auto-transcription services.
- Timing: Don’t overtax your testers' patience. 60 minutes is a standard duration for testing sessions. Plan for breaks and refreshments, and consider several shorter sessions rather than one long test.
Potential Bias To Be Aware Of
Find a detailed overview of potential biases with countermeasures here. Below is a list of potential biases to be aware of when running and analyzing usability tests.
Confirmation Bias
People tend to give more weight to evidence that confirms their assumptions and to discount data and opinions that don’t support those assumptions.
The Recency Effect
People tend to give more weight to their most recent experiences. They form new opinions biased towards the latest news, e.g. by focusing only on the problems found in the latest usability session.
Anchoring Bias
When people make decisions, they tend to rely too heavily on one piece of information or a trait that already exists. A famous example is the quote often attributed to Henry Ford: “If I had asked people what they wanted, they would have said faster horses.”
Social Desirability / Friendliness Bias
People tend to make more “socially acceptable” decisions when they are around other people. The same holds true for interviews: people want to make you feel good and will give the answers they think you find pleasant and acceptable.
The Hawthorne Effect
The very act of being observed can cause participants to change their behavior. The quality of observational data is heavily impacted by this.
Fundamental Attribution Error
People tend to overemphasize personal characteristics and ignore situational factors when judging others’ (or their own) behavior. For example, a user may think they made a mistake when the design is actually at fault. A good user experience doesn’t “make you think” but helps you get things done!
Reading Recommendation
Checklist for Moderating a Usability Test by NNGroup
Checklist for Planning a Usability Test by NNGroup
Write better qualitative usability tasks: Top 10 mistakes to avoid by NNGroup
Quantitative vs. Qualitative Usability Testing by NNGroup
Comparing Between and Within Subjects Studies by MeasuringU
How to analyze and report usability test results by Maze
Remote Testing Providers
Remote testing for prototypes or single messages by Maze
Website data and A/B testing by Google
Observing the same user over time. Provides the largest tester panel by Usertesting
Screensharing, good for observing behavior by dscout
Card sorting and tree testing functionalities by Userlytics
Card sorting, tree testing and click testing by Userzoom
Remote testing for websites, free trial includes 5 user sessions by Userbrain