Just wondering if anyone out there knows of a standard survey (preferably based on Jakob Nielsen's work on usability) that web admins can administer to test groups to assess usability?

I could just make up my own, but I feel there has got to be some solid research out there on the sorts of judgments about tasks I should be asking users to make.

For example:

Q: Ask the user to find the profile page. Do I...

A.) Present them with a standard Likert scale after each question
B.) Present them with the Likert scale after all the questions

Then, what should that Likert scale be? I know Nielsen's usability judgment scale is based on Learnability, Efficiency of Use, Memorability, Error Rate, and Satisfaction, but of those, the only one I can imagine designing a Likert item to effectively measure is satisfaction. How am I supposed to ask a user to rank the Memorability of a site on a 1-5 scale after one use? Surely someone has devised a good way to pose the question?

+2  A: 

A few recommendations:

  1. Don't determine your standard exclusively by listening to the users and waiting for their feedback. Nielsen says that rule #1 in usability is "Don't listen to users"; it's more important to watch them work.

  2. Here is an FAQ regarding the development of Likert questionnaires. I would err on the side of simplicity and brevity if you are going to ask users a list of questions after every task. There are advantages and disadvantages to both of the options you are considering. If you make users wait until they have finished all of their tasks before they fill out a survey, they may not remember their initial difficulties with the interface as they adjust to its learning curve. On the other hand, if you ask them questions after each task, they may start rushing through the questionnaire as they get toward the end of the list of tasks. A third option, depending on how many tasks you have, is to have the user fill out a survey after every few tasks.

  3. The University of Maryland HCI Laboratory maintains a Questionnaire for User Interaction Satisfaction, which is available for download and now on version 7.0. You may be able to use their survey, or at least tailor it for your use.

David
A: 

The short and easy System Usability Scale (SUS) was found by Tullis and Stetson (2004) to psychometrically outperform other subjective scales, including the renowned QUIS. Most SUS items seem related to learnability or memorability, along with a couple for efficiency. However, I wouldn't try to break it into subscales; all items are highly intercorrelated, suggesting the scale measures a single underlying construct.
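For concreteness, SUS scoring is simple enough to automate. Below is a minimal Python sketch of the standard scoring rule (ten items rated 1-5; odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the total is multiplied by 2.5 to yield a 0-100 score). The function name and the sample responses are illustrative, not part of any library.

```python
def sus_score(responses):
    """Compute a System Usability Scale score on a 0-100 scale.

    `responses` is a list of ten integers, each 1-5, in item order.
    Odd-numbered items are positively worded; even-numbered items
    are negatively worded and therefore reverse-scored.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Example: a fairly positive set of responses
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```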

I would doubt you can get a scale to measure each of Nielsen’s dimensions separately. A user can tell you if a product is “hard” to use, but it’s much more difficult for them to break it down further. They know it took a lot of work to do something, but was it because they couldn’t figure out an easier way (learnability)? Or maybe they had learned a better way on a previous task, but forgot it (memorability)? Or is that just the way it has to be (efficiency)? Users are not going to have sufficient information to make the distinction.

If you are specifically interested in each of Nielsen's dimensions separately, then assess them separately and directly. You can measure learnability crudely by recording the number of errors or the time between clicks, and precisely by counting how many trials it takes users to learn the normative interaction sequence. For efficiency, after you train users to perform the normative interaction sequence, record how long it takes them to do it. You can also get a pretty good answer analytically using something like GOMS-KLM. For memorability, bring the same users back a week or so later and compare their performance to that of the efficiency-measuring trial.
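As a sketch of the analytic route: the Keystroke-Level Model assigns a fixed time to each elementary operator (the classic Card, Moran & Newell values are roughly 0.28 s per keystroke for an average typist, 1.1 s to point with a mouse, 0.1 s per button press, 0.4 s to home hands between devices, and 1.35 s for mental preparation) and sums them over the task's normative interaction sequence. Here is a minimal Python sketch; the task encoding in the example is made up for illustration.

```python
# Classic KLM operator times in seconds (Card, Moran & Newell).
KLM_OPERATORS = {
    "K": 0.28,  # keystroke (average skilled typist)
    "P": 1.10,  # point with mouse to a target
    "B": 0.10,  # press or release mouse button
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(sequence):
    """Estimate expert task time from a string of KLM operator codes."""
    return sum(KLM_OPERATORS[op] for op in sequence)

# Example: think, point to a menu, click, think, home to keyboard,
# then type a four-letter query.
print(round(klm_estimate("MPBMHKKKK"), 2))  # -> 5.42
```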

Like nearly all subjective scales, the SUS is primarily useful for comparing the overall subjective experience of different products. It's hard to know what to make of a single score without something to compare it to. These scales also won't tell you what specific problems a product has or why it has them (e.g., to help you determine improvements). For that, qualitative observation and debriefing of your test participants work best.

Michael Zuschlag