A Taxonomy of Cognitive Stress
I have been thinking about UI design lately. With some help from my
friend Rob Landley, I’ve come up with a classification schema for the
levels at which users are willing to invest effort to build
competence.
The base assumption is that for any given user there is a maximum
cognitive load that user is willing to accept in order to use an
interface. I think that there are levels, analogous to Piagetian
developmental thresholds and possibly related to them, in the
trajectory of learning to use software interfaces.
Level 0: I’ll only push one button.
Level 1: I’ll push a sequence of buttons, as long as they’re all visible
and I don’t have to remember anything between presses. These people
can do checklists.
Level 2: I’m willing to push a sequence of buttons in which later ones may
not be visible until earlier ones have been pressed. These people
will follow pull-down menus; it’s OK for the display to change as long
as they can memorize the steps.
Level 3: I’m willing to use folders if they never change while I’m not looking.
There can be hidden unchanging state, but nothing may ever
happen out of sight. These people can handle an incremental replace
with confirmation. They can use macros, but have no capability to
cope with surprises other than by yelling for help.
Level 4: I’m willing to use metaphors to describe magic actions. A folder
can be described by “These are all my local machines” or “these
are all my print jobs” and is allowed to change out of sight in an
unsurprising way. These people can handle global replace, but must
examine the result to maintain confidence. These people will begin
customizing their environment.
Level 5: I’m willing to use categories (generalize about nouns). I’m
willing to recognize that all .doc files are alike, or all .jpg files are
alike, and I have confidence there are sets of actions I can apply
to a file I have never seen that will work because I know its type.
(Late in this level knowledge begins to become articulate; these
people are willing to give simple instructions over the phone or
by email.)
Level 6: I’m willing to unpack metaphors into procedural steps. People at
this level begin to be able to cope with surprises when the
metaphor breaks, because they have a representation of process.
People at this level are ready to cope with the fact that HTML
documents are made up of tags, and more generally with
simple document markup.
Level 7: I’m willing to move between different representations of
a document or piece of data. People at this level know that
any one view of the data is not the same as the data, and lossless
transformations no longer scare them. Multiple representations
become more useful than confusing. At this level the idea of
structural rather than presentation markup begins to make sense.
Level 8: I’m willing to package simple procedures I already understand.
These people are willing to record a sequence of actions which
they understand into a macro, as long as no decisions (conditionals)
are involved. They begin to get comfortable with report generators.
At advanced level 8 they may start to be willing to deal with
simple SQL.
Level 9: I am willing to package procedures that make decisions, as long
as I already understand them. At this level, people begin to cope
with conditionals and loops, and also to deal with the idea of
programming languages.
Level 10: I am willing to problem-solve at the procedural level, writing
programs for tasks I don’t completely understand before
developing them.
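To make the step from level 8 to level 9 concrete, here is a rough sketch. The function names and the cleanup task are invented for illustration and are not part of the taxonomy. The first function is the kind of straight-line recorded macro a level 8 user will package; the second wraps it in a loop and a conditional, which is where level 9 begins.

```python
# Hypothetical illustration: the names and the cleanup task are made up.

def level8_macro(text):
    """A recorded sequence of steps: the same actions every time, no decisions."""
    text = text.strip()                  # step 1: trim surrounding whitespace
    text = text.replace("teh", "the")    # step 2: fix one known typo
    return text.upper()                  # step 3: convert to upper case


def level9_procedure(lines):
    """The same cleanup, now with a loop and a conditional."""
    cleaned = []
    for line in lines:                   # loop: visit every line
        if not line.strip():             # decision: skip blank lines
            continue
        cleaned.append(level8_macro(line))
    return cleaned
```

The level 8 user can replay the first function by rote. The second asks the user to predict what happens on input they have not seen, which is a different kind of load.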
I’m thinking this scale might be useful in classifying interfaces and
developing guidelines for not exceeding the pain threshold of an
audience if we have some model of what their notion of acceptable
cognitive load is.
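As a minimal sketch of how such a classification might be applied, assume each interface feature is tagged with the lowest level it demands and the audience is summarized by a single maximum level. The enum names, the sample feature table, and the function name below are my own shorthand, not settled terminology.

```python
from enum import IntEnum


class Level(IntEnum):
    """Shorthand names (assumed) for the eleven levels described above."""
    ONE_BUTTON = 0
    VISIBLE_SEQUENCE = 1
    HIDDEN_SEQUENCE = 2
    STATIC_FOLDERS = 3
    METAPHORS = 4
    CATEGORIES = 5
    UNPACKED_METAPHORS = 6
    MULTIPLE_REPRESENTATIONS = 7
    SIMPLE_MACROS = 8
    DECISION_PROCEDURES = 9
    PROBLEM_SOLVING = 10


def features_over_threshold(features, audience_max):
    """Return the names of features that demand more than the audience accepts."""
    return sorted(name for name, level in features.items()
                  if level > audience_max)


# Hypothetical feature table for an imaginary text editor.
editor = {
    "pull-down menus": Level.HIDDEN_SEQUENCE,
    "global replace": Level.METAPHORS,
    "macro recorder": Level.SIMPLE_MACROS,
}

print(features_over_threshold(editor, Level.METAPHORS))
```

For an audience that tops out at level 4, only the macro recorder is flagged. A single number per user is a simplification; the point is only that the scale gives a guideline something to compare against.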
(This is a spinoff from my book-in-progress, “The Art of Unix
Programming”, but I don’t plan to put it in the book.)
Comments, reactions, and refinements welcome.